65 Cards in this Set
- Front
- Back
4 main assumptions of the linear model and their implications
|
1. Fixed X
2. Sum of errors = 0 -> leads to unbiasedness
3. Homoscedasticity
4. No autocorrelation -> (with 3) makes the estimators efficient |
|
5th assumption of linear model
|
errors are normally distributed
|
|
Solution for OLS in Linear Algebra
|
B=(X'X)^-1(X'Y)
|
|
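A minimal NumPy sketch of this matrix solution, assuming the design matrix X already includes a column of ones for the intercept; the data and true coefficients below are made up for illustration.
```python
import numpy as np

# Illustrative data: intercept 2, slope 3 (assumed values, not from the cards)
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=100)

X = np.column_stack([np.ones_like(x), x])      # design matrix with an intercept column
beta_hat = np.linalg.inv(X.T @ X) @ (X.T @ y)  # B = (X'X)^-1 (X'Y)
print(beta_hat)                                # roughly [2, 3]
```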
What does the SER (standard error of the regression) do
|
Best overall indicator of regression effectiveness
|
|
BLUE
|
best linear unbiased estimator
|
|
What does GOF (goodness of fit) tell us
|
How well the model fits the data
|
|
Variance
|
average squared deviation from the mean -> how spread out the distribution is
|
|
Central Tendency -> E(X)
|
the mean, median, and mode
|
|
Dispersion
|
range, variance, and SD -> spread of the distribution about the mean
|
|
Covariance
|
measure of how two variables vary together
|
|
correlation
|
standardized covariance
|
|
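A small sketch, on made-up data, of the idea that correlation is covariance standardized by the two standard deviations.
```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(size=200)

cov_xy = np.cov(x, y)[0, 1]                              # sample covariance of x and y
corr = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))  # standardize by the two SDs
print(corr, np.corrcoef(x, y)[0, 1])                     # the two numbers agree
```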
standard error
|
measure of a statistic's accuracy -> the standard deviation of the sampling distribution of that statistic
|
|
probability
|
measure of uncertainty
|
|
Random variable
|
assigns a numerical value to each outcome of an experiment
|
|
mutually exclusive
|
when one event happens, the other cannot
|
|
conditional probability
|
ratio of joint to marginal probability: P(Y|X) = P(X,Y)/P(X)
|
|
parameters
|
define what a distribution looks like
|
|
2 ways to define distributions
|
1. probability density function (PDF)
2. cumulative distribution function (CDF) |
|
Central limit theorem
|
the larger the sample, the closer the sampling distribution of the sample mean gets to a normal distribution
|
|
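A quick simulation sketch of the theorem on a made-up skewed population: as the sample size grows, the sample means cluster ever more tightly (and more normally) around the true mean.
```python
import numpy as np

rng = np.random.default_rng(2)
population = rng.exponential(scale=2.0, size=100_000)  # clearly non-normal population

for n in (5, 50, 500):
    # draw 10,000 samples of size n and record each sample mean
    means = rng.choice(population, size=(10_000, n)).mean(axis=1)
    print(n, means.mean().round(3), means.std().round(3))  # spread shrinks roughly like 1/sqrt(n)
```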
3 steps of a hypothesis test
|
1. compare the mean to some hypothetical value
2. standardize its deviation
3. use properties of the normal curve |
|
hypothesis
|
statement that something is true
|
|
Null
|
a hypothesis to be tested
|
|
Point estimate
|
value of a statistic used to estimate a parameter
|
|
P value requirement
|
p-value of 0.05 (5%) or below to reject the null
|
|
T-test requirement
|
outside of ±2 standard deviations -> reject the null
|
|
T value
|
number of SDs away from the mean
|
|
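A sketch tying the last three cards together with SciPy's one-sample t-test; the data and hypothesized mean are invented for illustration.
```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
sample = rng.normal(loc=5.3, scale=1.0, size=40)           # true mean 5.3

t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)   # test H0: mean = 5.0
print(t_stat, p_value)
# reject the null when p_value <= 0.05, i.e. roughly when |t_stat| > 2
```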
Identity Matrix
|
matrix with 1s along the diagonal and 0s everywhere else
|
|
Regression Analysis
|
process of estimating parameters from sample data
|
|
Residuals
|
difference between the regression's fitted values and the actual values
|
|
RSS
|
Sum of squared residuals
|
|
4 measures of GOF
|
1. standard error of the regression
2. test on each individual slope coefficient
3. R^2
4. F-test |
|
Total variance
|
variation in Y in the absence of any info other than the mean
|
|
Two parts of TSS
|
1. ESS
2. RSS |
|
ESS
|
error sum of squares
|
|
What is ESS
|
variation in Y not accounted for by X
|
|
RSS
|
sum of squared differences between the regression line and the mean of Y
|
|
What does a large RSS tell us
|
more of the variation in Y is explained by X
|
|
R^2
|
coefficient of determination
|
|
What does a large RSS with respect to TSS tell us
|
the proportion of the variation in Y that is explained by X
|
|
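A sketch of the decomposition in the cards above, keeping this deck's naming (ESS = error sum of squares, RSS = regression sum of squares); the data is made up.
```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=100)
y = 1.0 + 2.0 * x + rng.normal(size=100)

slope, intercept = np.polyfit(x, y, 1)
y_hat = intercept + slope * x

tss = np.sum((y - y.mean()) ** 2)      # total variation of Y around its mean
ess = np.sum((y - y_hat) ** 2)         # error SS: variation not accounted for by X
rss = np.sum((y_hat - y.mean()) ** 2)  # regression SS: variation explained by X
print(rss / tss, 1 - ess / tss)        # both equal R^2
```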
OLS
|
ordinary least squares
|
|
What is OLS
|
process by which we turn data into theoretical quantities
|
|
Diagnostics
|
warn us of inappropriate uses of OLS
|
|
Problems with data to focus on
|
1. unusual data
2. non-constant variation
3. non-normal errors |
|
b1
|
b1 = sum((xi - meanx)(yi - meany)) / sum((xi - meanx)^2)
|
|
b0
|
b0 = meany - b1(meanx)
|
|
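A direct translation of the b1 and b0 formulas into NumPy (illustrative data), checked against NumPy's own least-squares fit.
```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=60)
y = 4.0 - 1.5 * x + rng.normal(size=60)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

slope, intercept = np.polyfit(x, y, 1)  # same answers from the built-in fit
print(b1, slope, b0, intercept)
```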
Null hypothesis
|
the hypothesis of no difference or no effect -> the default that we either reject or fail to reject
|
|
skew
|
the direction in which the tail of a distribution is elongated
|
|
3 units of linear algebra analysis
|
1. scalar
2. vector
3. matrix |
|
singularity
|
when one column is a linear function of another -> when variables are perfectly correlated
|
|
Linear model
|
tells us the trend between variables
|
|
Regression analysis
|
process of estimating parameters from sample data
|
|
Regression coefficients
|
b0 and b1
|
|
Residuals
|
difference between the regression's fitted values and the actual values
|
|
P(Y|X)
|
probability distribution of Y for specific values of X...the probability of seeing a value of Y conditional on X
|
|
OLS
|
process by which we turn data into theoretical quantities
|
|
Influence
|
Leverage * Discrepancy
|
|
Outlier
|
unusual Y for the level of X -> does not necessarily mean separate from the data cloud
|
|
Method of deletion
|
1. remove the outlier
2. recalculate the line
3. calculate the residual as if the influential case had been present |
|
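A sketch of the deletion idea on made-up data: refit without the suspect point, then see how large that point's residual is under the new line.
```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.normal(size=30)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=30)
x[0], y[0] = 3.0, -5.0                                # plant a high-discrepancy point

mask = np.arange(len(x)) != 0                         # 1. remove the outlier
slope, intercept = np.polyfit(x[mask], y[mask], 1)    # 2. recalculate the line without it
deleted_residual = y[0] - (intercept + slope * x[0])  # 3. its residual under that line
print(deleted_residual)                               # a large value flags an influential case
```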
How do you fix heteroscedasticity
|
transform the Y variable
|
|
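One common version of this fix, sketched on made-up data where the error spread grows with X: take the log of Y and fit as usual.
```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.uniform(1, 10, size=200)
y = np.exp(0.2 * x + rng.normal(scale=0.3, size=200))  # raw-scale errors fan out as x grows

slope, intercept = np.polyfit(x, np.log(y), 1)  # transform Y, then run the usual regression
print(slope, intercept)                         # residuals of log(y) now have roughly constant spread
```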
Leverage
|
how much pull a single point exerts on all of the fitted values
|
|
discrepancy
|
how far the outlier lies from the regression line
|
|
4 elements of modeling
|
1. question
2. DV
3. IV
4. Unit of analysis |
|
Bias
|
the systematic error that arises when estimating a quantity
|
|
Complementarity
|
probability of something not happening -> 1-P
|
|
heteroscedasticity
|
non-constant error variance
|