65 Cards in this Set
 Front
 Back
4 main assumptions of the linear model and their implications

1. Fixed X
2. Sum of errors = 0 (leads to unbiasedness)
3. Homoscedasticity
4. No autocorrelation (makes estimators efficient)

5th assumption of linear model

errors are normally distributed


Solution for OLS in Linear Algebra

B = (X'X)^(-1) X'Y
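
As a rough sketch of the matrix solution, the code below builds X'X and X'Y for a single predictor plus a column of 1's, inverts the 2x2 matrix by hand, and multiplies. The helper name `ols_normal_equations` and the data are made up for illustration.

```python
def ols_normal_equations(xs, ys):
    """Return [b0, b1] from B = (X'X)^(-1) X'Y, with X = [column of 1's, x]."""
    n = len(xs)
    # Entries of X'X
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    # Entries of X'Y
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Inverse of the 2x2 matrix [[n, sx], [sx, sxx]]
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("X'X is singular, so it has no inverse")
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]
    # Multiply the inverse by X'Y
    b0 = inv[0][0] * sy + inv[0][1] * sxy
    b1 = inv[1][0] * sy + inv[1][1] * sxy
    return [b0, b1]

# Illustrative data lying exactly on the line y = 1 + 2x
coefs = ols_normal_equations([1, 2, 3, 4], [3, 5, 7, 9])
```

With these points the result should be an intercept near 1 and a slope near 2.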


What does SER do

Best overall indicator of regression effectiveness


BLUE

best linear unbiased estimator


What does GOF tell us

How well the model fits the data


Variance

average squared deviation from the mean > how spread out the distribution is


Central Tendency >E(X)

the mean, median, and mode


Dispersion

range, variance, and SD > distribution about the mean


Covariance

measure of how two variables vary together


correlation

standardized covariance
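
To make the link between the two measures concrete, here is a small sketch that divides the sample covariance by the product of the two standard deviations. The function names and data are illustrative assumptions.

```python
from math import sqrt

def covariance(xs, ys):
    """Sample covariance (n - 1 denominator)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance rescaled by both standard deviations, so it lies in [-1, 1]."""
    sd_x = sqrt(covariance(xs, xs))  # covariance of x with itself is its variance
    sd_y = sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)

# Made-up data where y is exactly 2x, so the correlation should be 1
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
cov_xy = covariance(xs, ys)
r_xy = correlation(xs, ys)
```

The covariance depends on the units of x and y, while the correlation does not.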


standard error

measure of accuracy > standard deviation of the sampling distribution of that statistic
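
One common case is the standard error of a sample mean, which is the sample SD divided by the square root of the sample size. A minimal sketch with made-up data:

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]      # made-up sample
sample_sd = statistics.stdev(data)    # sample SD (n - 1 denominator)
se_mean = sample_sd / math.sqrt(len(data))
```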


probability

measure of uncertainty


Random variable

assigns a numerical value to each outcome


mutually exclusive

when one thing happens another cannot


conditional probability

ratio of joint to marginal
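
A small sketch of this ratio using one roll of a fair die. The events and the helper `prob` are chosen only for illustration.

```python
from fractions import Fraction

outcomes = range(1, 7)  # faces of a fair die

def prob(event):
    """Probability of an event, given as a function that returns True or False."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

def is_even(o):
    return o % 2 == 0

def is_high(o):
    return o > 3

joint = prob(lambda o: is_even(o) and is_high(o))  # P(even and > 3)
marginal = prob(is_high)                           # P(> 3)
conditional = joint / marginal                     # P(even | > 3)
```

Here the joint probability is 2/6 and the marginal is 3/6, so the conditional probability is 2/3.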


parameters

defines what distributions look like


2 ways to define distributions

1. probability distribution function
2. cumulative distribution function

Central limit theorem

the larger the sample, the closer the distribution of the sample mean is to normal
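
The simulation below is a sketch of one consequence: sample means from larger samples cluster more tightly around the true mean. The Uniform(0, 1) population, the sample sizes, and the fixed seed are all illustrative choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

def sd_of_sample_means(sample_size, replications=2000):
    """Spread of the sample mean across many samples drawn from Uniform(0, 1)."""
    means = []
    for _ in range(replications):
        sample = [random.random() for _ in range(sample_size)]
        means.append(sum(sample) / sample_size)
    return statistics.stdev(means)

sd_small = sd_of_sample_means(5)    # means of samples of size 5
sd_large = sd_of_sample_means(50)   # means of samples of size 50
```

The spread for size 50 should come out smaller than for size 5, close to the population SD divided by the square root of 50.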


3 steps of a hypothesis test

1. compare mean to some hypothetical value
2. standardize its deviation
3. use properties of normal curve

hypothesis

statement that something is true


Null

a hypothesis to be tested


Point estimate

value of a statistic used to estimate a parameter


P value requirement

0.05 (5%) or below to reject the null


T-test requirement

outside of +/- 2 standard deviations, reject the null


T value

number of SD's away from the mean
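
As a sketch for a one-sample test, the t value is the gap between the sample mean and the hypothesized mean, divided by the standard error of the mean. The data, the hypothesized mean of 3, and the rule-of-thumb cutoff of 2 are illustrative assumptions.

```python
import math
import statistics

def t_value(sample, hypothesized_mean):
    """Distance of the sample mean from the hypothesized mean, in standard errors."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - hypothesized_mean) / standard_error

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data with mean 5
t = t_value(sample, 3)
reject_null = abs(t) > 2            # rule-of-thumb cutoff, not an exact critical value
```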


Identity Matrix

square matrix with 1's along the diagonal and 0's elsewhere


Regression Analysis

process of estimating parameters from sample data


Residuals

difference between regression and actual results


RSS

Sum of squared residuals


4 measures of GOF

1. standard error of regression
2. test on each individual slope coefficient
3. R^2
4. F-test

Total variance

variation in Y around its mean, in the absence of any information other than the mean


Two parts of TSS

1. ESS
2. RSS 

ESS

Error sum of Squares


What is ESS

variation in Y not accounted for by X


RSS

difference between regression line and mean


What does a large RSS tell us

means more Y variation explained by X


R^2

coefficient of determination


What does a large RSS with respect to TSS tell us

how much variation of Y is explained by X
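
The sketch below computes the three sums of squares for a fitted line and the ratio RSS/TSS. It follows the labels used on these cards, where RSS is the regression sum of squares and ESS is the error sum of squares (some textbooks swap these two abbreviations). The data and function name are made up.

```python
def sums_of_squares(xs, ys):
    """Return (TSS, RSS, ESS) for a simple least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    b0 = mean_y - b1 * mean_x
    fitted = [b0 + b1 * x for x in xs]
    tss = sum((y - mean_y) ** 2 for y in ys)             # total: actual Y vs mean
    rss = sum((f - mean_y) ** 2 for f in fitted)         # regression: line vs mean
    ess = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # error: actual Y vs line
    return tss, rss, ess

tss, rss, ess = sums_of_squares([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
r_squared = rss / tss  # share of the variation in Y explained by X
```

For this data TSS is 6, split into about 3.6 explained and 2.4 unexplained, so R^2 is about 0.6.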


OLS

ordinary least squares


What is OLS

process by which we turn data into theoretical quantities


Diagnostics

warn us of inappropriate uses of OLS


Problems with data to focus on

1. unusual data
2. nonconstant variation
3. nonnormal errors

b1

b1 = sum((xi - meanx)(yi - meany)) / sum((xi - meanx)^2)


b0

b0 = meany - b1(meanx)
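
A minimal sketch of both coefficient formulas: the slope is the sum of cross-products of deviations divided by the sum of squared x deviations, and the intercept is the mean of y minus the slope times the mean of x. The function name and data are illustrative.

```python
def simple_ols(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: cross-products of deviations over squared x deviations
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    # Intercept: forces the line through the point (mean_x, mean_y)
    b0 = mean_y - b1 * mean_x
    return b0, b1

b0, b1 = simple_ols([1, 2, 3, 4], [2, 3, 5, 6])
```

For this made-up data the slope works out to 7/5 = 1.4 and the intercept to 4 - 1.4(2.5) = 0.5.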


Null hypothesis

no difference between hypothesis and reality...where you fail to reject null


skew

where the tail in a set of data is elongated


3 units of linear algebra analysis

1. scalar
2. vector
3. matrix

singularity

when one column is a linear function of another > when variables are perfectly correlated
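
One way to see this, sketched below, is that the determinant of X'X drops to zero when one column is a multiple of another, so X'X cannot be inverted. The two-column setup, the helper name, and the data are illustrative assumptions.

```python
def xtx_determinant(col_a, col_b):
    """Determinant of X'X for a design matrix X with two columns."""
    saa = sum(a * a for a in col_a)
    sbb = sum(b * b for b in col_b)
    sab = sum(a * b for a, b in zip(col_a, col_b))
    return saa * sbb - sab * sab

x1 = [1, 2, 3]
x2 = [2 * v for v in x1]  # exactly 2 * x1, so perfectly correlated with x1
x3 = [1, 0, 2]            # not a linear function of x1

det_collinear = xtx_determinant(x1, x2)  # 0, so X'X is singular
det_ok = xtx_determinant(x1, x3)         # 21, so X'X can be inverted
```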


Linear model

tells us the trend between variables


Regression analysis

process of estimating parameters from sample data


Regression coefficients

b0 and b1


Residuals

difference between regression and actual results


P(Y|X)

probability distribution of Y for specific values of x...probability of seeing value of Y conditional on X


OLS

process by which we turn data into theoretical quantities


Influence

Leverage * Discrepancy


Outlier

unusual Y for level of X > does not mean separate from data cloud


Method of deletion

1. remove outlier
2. calculate line
3. calculate residual as if influential case had been present
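
The steps above can be sketched as follows, reading step 3 as: measure how far the removed case falls from the line that was fitted without it. That reading, the function names, and the data are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - b1 * mean_x, b1

def deleted_residual(xs, ys, i):
    """Residual of case i measured against the line fitted without case i."""
    xs_rest = xs[:i] + xs[i + 1:]             # step 1: remove the case
    ys_rest = ys[:i] + ys[i + 1:]
    b0, b1 = fit_line(xs_rest, ys_rest)       # step 2: calculate the line
    return ys[i] - (b0 + b1 * xs[i])          # step 3: residual for the removed case

xs = [1, 2, 3, 4, 5]
ys = [1, 2, 3, 4, 10]   # the last point is the suspected outlier

b0_all, b1_all = fit_line(xs, ys)
ordinary = ys[4] - (b0_all + b1_all * xs[4])  # residual with the case left in
deleted = deleted_residual(xs, ys, 4)         # residual with the case removed
```

In this example the ordinary residual is 2 but the deleted residual is 5, because the outlier pulls the full-data line toward itself and hides part of its own discrepancy.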

How do you fix heteroscedasticity

transform the Y variable


Leverage

how much influence a point exerts on all fitted values
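
For a regression with one predictor, leverage can be computed as a hat value, h_i = 1/n + (x_i - meanx)^2 / sum((x_j - meanx)^2). The sketch below uses that formula with made-up data and a hypothetical function name.

```python
def hat_values(xs):
    """Leverage of each case in a simple regression with an intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return [1 / n + (x - mean_x) ** 2 / sxx for x in xs]

xs = [1, 2, 3, 4, 10]        # the last x value sits far from the others
leverage = hat_values(xs)
highest = leverage.index(max(leverage))  # position of the highest-leverage case
```

Leverage depends only on the x values, and the hat values sum to the number of coefficients (2 here), so the case with x = 10 takes most of it.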


discrepancy

how far from regression is outlier


4 elements of modeling

1. question
2. DV
3. IV
4. Unit of analysis

Bias

the amount of error that arises when estimating a quantity


Complementarity

probability of something not happening > 1 - P


heteroscedasticity

nonconstant error variance
