65 Cards in this Set

  • Front
  • Back
4 main assumptions of the linear model and their consequences
1. Fixed X
2. Errors have mean zero
leads to unbiasedness
3. Homoscedasticity
4. No autocorrelation
makes estimators efficient
5th assumption of linear model
errors are normally distributed
Solution for OLS in Linear Algebra
B=(X'X)^-1(X'Y)
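The formula above can be sketched in plain Python by solving the normal equations (X'X)B = X'Y, which gives the same B as forming the inverse explicitly. The helper names (`transpose`, `matmul`, `solve`, `ols`) and the small data set are illustrative assumptions, not part of the card set.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a z = b (a is n x n, b is n x 1) by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(a[i]) + [b[i][0]] for i in range(n)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular, so B is not uniquely defined")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def ols(X, y):
    """Return B solving (X'X) B = X'y."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = matmul(Xt, [[v] for v in y])
    return solve(XtX, Xty)

# Made-up data lying exactly on y = 1 + 2x; the first column of X is the constant.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
y = [3.0, 5.0, 7.0, 9.0]
B = ols(X, y)
print(B)
```

Because the made-up data are exactly linear, the recovered intercept and slope should match the line used to generate them, up to floating-point rounding.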
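What does SER (standard error of the regression) tell us
Typical size of the residuals, in units of Y; often treated as the best overall indicator of regression effectiveness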
BLUE
best linear unbiased estimator
What does GOF tell us
How well the model fits the data
Variance
average squared deviation from the mean -> how spread out the distribution is
Central Tendency ->E(X)
the mean, median, and mode
Dispersion
range, variance, and SD -> spread of the distribution about the mean
Covariance
measure of how two variables vary together
correlation
standardized covariance: the covariance divided by the product of the two standard deviations
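As a minimal sketch of how covariance and correlation relate, the code below computes both for a made-up pair of variables. The function names and the use of the sample (n - 1) divisor are assumptions for illustration.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    """Sample covariance (n - 1 divisor) of two equal-length lists."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    """Covariance rescaled by both standard deviations, so it lies in [-1, 1]."""
    # covariance(v, v) is the sample variance of v.
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))

# Made-up data where y is exactly twice x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
print(covariance(x, y), correlation(x, y))
```

The covariance depends on the units of x and y, while the correlation does not, which is the point of standardizing.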
standard error
measure of accuracy -> standard deviation of the sampling distribution of a statistic
probability
measure of uncertainty
Random variable
assigns a numerical value to each outcome of a random process
mutually exclusive
when one thing happens another cannot
conditional probability
ratio of joint to marginal probability: P(A|B) = P(A and B)/P(B)
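A small worked example may help here. The sketch below uses one roll of a fair die, with A = "roll is even" and B = "roll is greater than 3"; the events and helper names are made up for illustration.

```python
from fractions import Fraction

OUTCOMES = range(1, 7)  # faces of a fair six-sided die

def prob(event):
    """Probability of an event, given as a predicate over equally likely outcomes."""
    favourable = sum(1 for o in OUTCOMES if event(o))
    return Fraction(favourable, len(OUTCOMES))

def is_even(o):
    return o % 2 == 0

def is_high(o):
    return o > 3

joint = prob(lambda o: is_even(o) and is_high(o))  # P(A and B)
marginal = prob(is_high)                           # P(B)
conditional = joint / marginal                     # P(A | B)
print(joint, marginal, conditional)
```

Knowing the roll is greater than 3 raises the chance of an even number above its unconditional value of 1/2, because two of the three remaining faces (4, 5, 6) are even.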
parameters
define what a distribution looks like
2 ways to define distributions
1. probability density function (PDF)
2. cumulative distribution function (CDF)
Central limit theorem
the larger the sample, the closer the sampling distribution of the mean is to a normal distribution
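A quick simulation can illustrate the central limit theorem: means of larger samples behave more like draws from a normal distribution, even when the underlying data are not normal. This is only a sketch; the uniform population, the sample sizes, the seed, and the helper names are all assumptions chosen for illustration.

```python
import random
from math import sqrt

def sample_means(n, reps, rng):
    """Draw `reps` samples of size n from Uniform(0, 1) and return their means."""
    return [sum(rng.random() for _ in range(n)) / n for _ in range(reps)]

def mean_and_sd(values):
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / (len(values) - 1)
    return m, sqrt(var)

def share_within(values, k):
    """Share of values lying within k standard deviations of their mean."""
    m, sd = mean_and_sd(values)
    return sum(1 for v in values if abs(v - m) <= k * sd) / len(values)

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (1, 30):
    means = sample_means(n, 2000, rng)
    m, sd = mean_and_sd(means)
    # For a normal distribution about 95% of values fall within 2 SDs.
    print(n, round(m, 3), round(sd, 3), round(share_within(means, 2), 3))
```

With n = 1 the "means" are just uniform draws, all of which lie within 2 SDs of the centre; with n = 30 the share within 2 SDs should be close to the normal curve's roughly 95%, and the spread of the means shrinks toward sigma/sqrt(n).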
3 steps of a hypothesis test
1. compare mean to some hypothetical value
2. standardize its deviation
3. use properties of normal curve
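The three steps above can be sketched for a test on a sample mean. The function name, the made-up sample, the hypothesized value of 5.0, and the rough cutoff of 2 are assumptions for illustration.

```python
from math import sqrt

def t_statistic(sample, mu0):
    """Standardized distance between the sample mean and a hypothesized mean mu0."""
    n = len(sample)
    m = sum(sample) / n
    # Step 1: compare the mean to the hypothetical value.
    deviation = m - mu0
    # Step 2: standardize the deviation by the standard error of the mean.
    s2 = sum((v - m) ** 2 for v in sample) / (n - 1)
    se = sqrt(s2 / n)
    return deviation / se

sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.5, 5.3, 5.7]  # made-up measurements
t = t_statistic(sample, 5.0)
# Step 3: under the normal curve, values beyond about +-2 are unlikely by chance.
reject_null = abs(t) > 2
print(round(t, 2), reject_null)
```

For small samples a t table gives a somewhat larger cutoff than 2, so the rule used here is only the rough rule of thumb.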
hypothesis
statement that something is true
Null
a hypothesis to be tested
Point estimate
value of a statistic used to estimate a parameter
P value requirement
0.05 (5%) or below to reject the null
T-test requirement
reject the null when the estimate lies more than about +-2 standard errors from the hypothesized value
T value
number of standard errors the estimate lies away from the hypothesized value
Identity Matrix
square matrix with 1's along the diagonal and 0's elsewhere
Regression Analysis
process of estimating parameters from sample data
Residuals
difference between regression and actual results
RSS (residual sum of squares)
Sum of squared residuals
4 measures of GOF
1. standard error of regression
2. test on each individual slope coefficient
3. R^2
4. F-test
Total variation (TSS)
variation of Y around its mean: the prediction error when we have no information other than the mean
Two parts of TSS
1. ESS
2. RSS
ESS
Error sum of Squares
What is ESS
variation in Y not accounted for by X
RSS
sum of squared differences between the regression line and the mean of Y (regression sum of squares)
What does a large RSS tell us
more of the variation in Y is explained by X
R^2
coefficient of determination
What does a large RSS with respect to TSS tell us
a large share of the variation in Y is explained by X (R^2 = RSS/TSS is close to 1)
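The decomposition of TSS and the resulting R^2 can be checked numerically. The sketch below follows this set's naming, with RSS as the regression (explained) sum of squares and ESS as the error (unexplained) sum of squares; the data and the function name are made up, and the line b0 = 2.2, b1 = 0.6 is the least-squares line for those data.

```python
def sums_of_squares(y, fitted):
    """Return (TSS, RSS, ESS) for observed y and fitted values from a regression."""
    mean_y = sum(y) / len(y)
    tss = sum((obs - mean_y) ** 2 for obs in y)            # total variation about the mean
    rss = sum((fit - mean_y) ** 2 for fit in fitted)       # regression line vs. mean (explained)
    ess = sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))  # squared residuals (unexplained)
    return tss, rss, ess

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1 = 2.2, 0.6  # least-squares line for these data: b1 = 6/10, b0 = 4 - 0.6 * 3
fitted = [b0 + b1 * xi for xi in x]

tss, rss, ess = sums_of_squares(y, fitted)
r_squared = rss / tss
print(tss, rss, ess, r_squared)
```

The identity TSS = RSS + ESS holds only when the fitted values come from a least-squares line with an intercept; for an arbitrary line the two parts need not add up.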
OLS
ordinary least squares
What is OLS
process by which we turn data into estimates of theoretical quantities, choosing the coefficients that minimize the sum of squared residuals
Diagnostics
warn us of inappropriate uses of OLS
Problems with data to focus on
1. unusual data
2. non-constant variation
3. non-normal errors
b1
b1=sum((xi-meanx)(yi-meany))/sum((xi-meanx)^2)
b0
b0=meany-b1(meanx)
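The slope and intercept formulas can be sketched directly in code, with the intercept computed from the means of Y and X. The function name and data are made up for illustration.

```python
def simple_ols(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: co-movement of x and y about their means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Denominator: variation of x about its mean.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1 = simple_ols(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(b0, b1, sum(residuals))
```

With an intercept in the model the residuals sum to zero up to rounding, which follows from the line passing through the point (meanx, meany).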
Null hypothesis
statement of no effect or no difference, kept unless the data give enough evidence to reject it
skew
asymmetry in a distribution, where one tail is longer than the other
3 units of linear algebra analysis
1. scalar
2. vector
3. matrix
singularity
when one column is a linear function of another -> when variables are perfectly correlated
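Singularity can be seen numerically: when one column is a multiple of another, the determinant of X'X is zero, so (X'X)^-1 does not exist. The sketch below handles only a two-column X, and the function name and data are assumptions for illustration.

```python
def gram_determinant(x1, x2):
    """Determinant of X'X for a two-column matrix X = [x1, x2]."""
    a = sum(v * v for v in x1)               # x1'x1
    b = sum(v * w for v, w in zip(x1, x2))   # x1'x2
    d = sum(w * w for w in x2)               # x2'x2
    return a * d - b * b

x1 = [1.0, 2.0, 3.0, 4.0]
x2_dependent = [2.0 * v for v in x1]      # exact linear function of x1
x2_independent = [1.0, 0.0, 2.0, 1.0]     # not a multiple of x1

print(gram_determinant(x1, x2_dependent))    # zero: X'X cannot be inverted
print(gram_determinant(x1, x2_independent))  # non-zero: X'X can be inverted
```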
Linear model
tells us the trend between variables
Regression analysis
process of estimating parameters from sample data
Regression coefficients
b0 and b1
Residuals
difference between regression and actual results
P(Y|X)
probability distribution of Y for specific values of x...probability of seeing value of Y conditional on X
OLS
process by which we turn data into estimates of theoretical quantities, choosing the coefficients that minimize the sum of squared residuals
Influence
Leverage * Discrepancy
Outlier
unusual Y for a given level of X -> need not be separate from the data cloud
Method of deletion
1. remove outlier
2. calculate line
3. calculate the residual for the removed case from the line fitted without it
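The deletion steps can be sketched as follows: drop one case, refit the line on the remaining cases, and measure how far the dropped case falls from that line. The function names and the data, which include one deliberately extreme point, are made up for illustration.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

def deleted_residual(x, y, i):
    """Residual of case i measured against the line fitted without case i."""
    x_rest = x[:i] + x[i + 1:]          # step 1: remove the case
    y_rest = y[:i] + y[i + 1:]
    b0, b1 = fit_line(x_rest, y_rest)   # step 2: calculate the line
    return y[i] - (b0 + b1 * x[i])      # step 3: residual for the removed case

x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [1.0, 2.0, 3.0, 4.0, 20.0]  # the last case is far from the pattern y = x

b0_all, b1_all = fit_line(x, y)
ordinary = y[4] - (b0_all + b1_all * x[4])
deleted = deleted_residual(x, y, 4)
print(ordinary, deleted)
```

The ordinary residual for the last case is small because that case pulls the fitted line toward itself; the deleted residual is much larger, which is why deletion is useful for spotting influential cases.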
How do you fix heteroscedasticity
transform the Y variable
Leverage
how much influence a point's Y value can exert on all fitted values
discrepancy
how far the outlier lies from the regression line
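Leverage can be made concrete for a one-predictor regression, where the hat value of case i is h_i = 1/n + (xi - meanx)^2 / sum((xj - meanx)^2). The sketch below computes these values for made-up data; the function name is an assumption, and the formula applies only to a simple regression with an intercept.

```python
def hat_values(x):
    """Leverage (hat value) of each case in a simple regression with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]

x = [1.0, 2.0, 3.0, 4.0, 10.0]  # the last case has an unusual X value
h = hat_values(x)
print([round(v, 2) for v in h])
```

Leverage depends only on the X values. A case becomes influential when high leverage is combined with a large discrepancy, that is, a Y value far from where the rest of the data would place it. The hat values sum to the number of estimated coefficients, which is 2 here.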
4 elements of modeling
1. question
2. DV
3. IV
4. Unit of analysis
Bias
systematic error in estimating a quantity: the difference between an estimator's expected value and the true value
Complementarity
probability of something not happening -> 1-P
heteroscedasticity
non-constant error variance