63 Cards in this Set

  • Front
  • Back
Goal of multiple regression
Explain criterion variance
Applications of multiple regression
1. Assess independent effects of particular predictors controlling for others
2. Compare sets of variables (which combination leads to best model?)
3. Test causal models
4. Test nonlinear relations
Level of measurement required for regression
Criterion must be continuous
Predictors can be continuous, dichotomous, or categorical
Equations
Raw-score form: Ŷ = a + B1X1 + B2X2
Standardized form: ẑy = β1zx1 + β2zx2
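A minimal sketch of both forms with NumPy (the tiny data set is made up for illustration): fit the raw-score equation by least squares, then refit on z-scores to get the standardized (beta) weights.

    import numpy as np

    # Made-up data: two predictors and a criterion (n = 6)
    X = np.array([[1., 4.], [2., 5.], [3., 7.], [4., 6.], [5., 9.], [6., 8.]])
    y = np.array([3., 4., 6., 5., 8., 9.])

    # Raw-score form: Y-hat = a + B1*X1 + B2*X2
    design = np.column_stack([np.ones(len(y)), X])       # add an intercept column
    a, B1, B2 = np.linalg.lstsq(design, y, rcond=None)[0]

    # Standardized form: no intercept needed once everything is a z-score
    zX = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)
    beta1, beta2 = np.linalg.lstsq(zX, zy, rcond=None)[0]

    print(a, B1, B2)      # unstandardized intercept and weights
    print(beta1, beta2)   # standardized (beta) weights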
R^2
proportion of the DV's variance shared with the IVs
Sums of squares total
total variability in the system
SS total = ∑(y − ȳ)²
Sums of squares regression
variability accounted for by the model
SS regression = ∑(ŷ − ȳ)²
Sums of squares residual
variability the model could not account for
SS residual = ∑(y − ŷ)²
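A short numeric sketch of the decomposition (made-up data; predictions come from an OLS fit), confirming that SS total = SS regression + SS residual and that R² = SS regression / SS total:

    import numpy as np

    # Made-up data: one criterion, two predictors
    X = np.array([[1., 2.], [2., 1.], [3., 4.], [4., 3.], [5., 6.], [6., 5.]])
    y = np.array([2., 3., 5., 6., 7., 9.])

    design = np.column_stack([np.ones(len(y)), X])
    coefs = np.linalg.lstsq(design, y, rcond=None)[0]
    y_hat = design @ coefs

    ss_total = np.sum((y - y.mean()) ** 2)            # total variability
    ss_regression = np.sum((y_hat - y.mean()) ** 2)   # variability the model accounts for
    ss_residual = np.sum((y - y_hat) ** 2)            # variability left over

    print(ss_total, ss_regression + ss_residual)      # equal (up to rounding)
    print(ss_regression / ss_total)                   # R-squared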
Partial correlations
Correlation between y and a given predictor (x1) after all other predictors have been partialed out of both x1 and y
Zero-order correlations
simple bivariate correlations between each predictor and y (x1 with y, x2 with y)
Semi-partial/part correlations
correlation between y and a given predictor (x1) after all other predictors have been partialed out of x1 only: the relationship between all of y and the variance unique to x1
Squaring a semi-partial gives the proportion of variance in y explained uniquely by that predictor.
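A sketch of the residual-based definitions with NumPy (simulated data): x2 is partialed out of both x1 and y for the partial correlation, but only out of x1 for the semi-partial.

    import numpy as np

    def residualize(target, covariate):
        """Return the part of target not predicted by covariate (plus an intercept)."""
        design = np.column_stack([np.ones(len(covariate)), covariate])
        coefs = np.linalg.lstsq(design, target, rcond=None)[0]
        return target - design @ coefs

    rng = np.random.default_rng(0)
    x2 = rng.normal(size=100)
    x1 = 0.5 * x2 + rng.normal(size=100)
    y = 0.4 * x1 + 0.3 * x2 + rng.normal(size=100)

    zero_order = np.corrcoef(x1, y)[0, 1]                                  # plain bivariate r
    partial = np.corrcoef(residualize(x1, x2), residualize(y, x2))[0, 1]   # x2 removed from both
    semipartial = np.corrcoef(residualize(x1, x2), y)[0, 1]                # x2 removed from x1 only
    print(zero_order, partial, semipartial, semipartial ** 2)              # last value: unique variance in y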
Adjusted R2
sampling error inflates R²; adjusted R² corrects for this bias (the correction is a function of sample size and the number of predictors)
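One common version of this adjustment is the Wherry-style shrinkage formula; a minimal sketch (n = sample size, k = number of predictors):

    def adjusted_r_squared(r_squared, n, k):
        # Penalty grows with the number of predictors and shrinks as n increases
        return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

    print(adjusted_r_squared(0.30, n=50, k=5))   # ~0.22: noticeable shrinkage in a small sample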
Standard error of estimate
typical magnitude of the residuals; the degree to which our model improved predictive accuracy. Compare to the SD of y; lower = better
Beta Weights
each weight controls for the other predictors/relationships in the system.
Standard error: the SD of the weight's sampling distribution, i.e., how much the weight would vary across samples
Suppression
A variable that increases the predictive validity of another variable by its inclusion in the regression equation. X2 suppresses the variance in X1 that is irrelevant to Y, so the partial regression coefficient is larger than the zero-order correlation.
Assumptions of multiple regression
•Form of relation is correctly specified (linear relationships)
•All relevant variables have been included as predictors
•All variables are measured without error
•Homoscedasticity
•Residuals are independent
•Residuals are normally distributed
Forward Entry
finds the best single predictor (e.g., x1) and pulls out its variance, then asks: can any remaining predictor explain significantly more variance in y? The model is built by adding predictors one at a time for as long as each addition explains significant additional variance.
•Output: shows each step and which variable was entered at that step.
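A rough sketch of this procedure in Python with statsmodels (simulated data; the stopping rule used here, p < .05 for the newly added predictor, is one common choice, not the only one):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 4))                      # four candidate predictors
    y = 0.6 * X[:, 0] + 0.3 * X[:, 2] + rng.normal(size=200)

    selected, remaining = [], [0, 1, 2, 3]
    while remaining:
        # Try adding each remaining predictor; keep the one that explains the most variance
        fits = {j: sm.OLS(y, sm.add_constant(X[:, selected + [j]])).fit() for j in remaining}
        best = max(fits, key=lambda j: fits[j].rsquared)
        # Stop when the best addition is no longer significant (p-value of its own coefficient)
        if fits[best].pvalues[-1] >= 0.05:
            break
        selected.append(best)
        remaining.remove(best)

    print("entered in order:", selected)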
Backward Entry
all predictors are entered, then those that explain the least variance are removed one at a time.
Stepwise entry
starts like forward entry: the variable explaining the most variance is entered first. At each subsequent step, variables are evaluated both for entry and for removal.
Approaches to Multiple regression: logical/rational
Hierarchical regression/forced entry: focus on the change in R². You decide the order of entry based on prior literature/theory.
•Block 1: enter control variables; Block 2: enter the other predictors. Ask for the R² change.
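A sketch of the two-block idea with statsmodels (simulated data; compare_f_test provides the significance test of the R² change for nested models):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    control = rng.normal(size=(150, 1))                 # Block 1: control variable
    focal = rng.normal(size=(150, 2))                   # Block 2: predictors of interest
    y = 0.4 * control[:, 0] + 0.5 * focal[:, 0] + rng.normal(size=150)

    block1 = sm.OLS(y, sm.add_constant(control)).fit()
    block2 = sm.OLS(y, sm.add_constant(np.column_stack([control, focal]))).fit()

    r2_change = block2.rsquared - block1.rsquared
    f_stat, p_value, df_diff = block2.compare_f_test(block1)   # test of the R-squared change
    print(r2_change, f_stat, p_value)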
Mediation
x causes m causes y
Partial mediation
still a direct relationship b/w x and y
m and x have nonzero weights
Full mediation
no relationship b/w x and y except through m
m significant, x not
Steps in mediation
1. regress y onto x
2. regress m onto x
3. regress y onto x and m
Sobel test
test whether the indirect path is statistically significant using the path weights and their standard errors
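A sketch of the three steps and the Sobel test in Python (simulated data; statsmodels for the regressions, a normal approximation for the Sobel z):

    import numpy as np
    import statsmodels.api as sm
    from scipy import stats

    rng = np.random.default_rng(3)
    x = rng.normal(size=300)
    m = 0.5 * x + rng.normal(size=300)                  # x -> m
    y = 0.4 * m + 0.1 * x + rng.normal(size=300)        # m -> y (plus a small direct path)

    step1 = sm.OLS(y, sm.add_constant(x)).fit()                          # total effect c
    step2 = sm.OLS(m, sm.add_constant(x)).fit()                          # path a
    step3 = sm.OLS(y, sm.add_constant(np.column_stack([x, m]))).fit()    # c' and path b

    a, se_a = step2.params[1], step2.bse[1]
    b, se_b = step3.params[2], step3.bse[2]
    sobel_z = (a * b) / np.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    p = 2 * stats.norm.sf(abs(sobel_z))                 # two-tailed p for the indirect path
    print("c =", step1.params[1], "c' =", step3.params[1], "a*b =", a * b, "p =", p)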
Moderation
z changes relationship between x and y (interaction)
Disordinal interaction
graph looks like: X
at one level of z, x and y positively related
at other level of z, x and y negatively related
Ordinal interaction
graph looks like: >
Steps in moderation
Step 1: Form product term (XZ)
Step 2: Generate regression equation in which y is regressed on x and z
Step 3: Add the product term to the regression equation and evaluate significance
Step 4: Plot interactions
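A sketch of Steps 1-3 with statsmodels (simulated data; the predictors are mean-centered before the product term is formed, as discussed on the centering card below):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    x = rng.normal(size=250)
    z = rng.normal(size=250)
    y = 0.3 * x + 0.2 * z + 0.4 * x * z + rng.normal(size=250)

    xc, zc = x - x.mean(), z - z.mean()                 # center, then form the product term
    xz = xc * zc

    main_only = sm.OLS(y, sm.add_constant(np.column_stack([xc, zc]))).fit()
    with_product = sm.OLS(y, sm.add_constant(np.column_stack([xc, zc, xz]))).fit()

    print(with_product.params[3], with_product.pvalues[3])   # B3: the interaction weight and its p-value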
Centering
make zero a meaningful value on the variable. Most of our data do not have a meaningful zero, which makes coefficients difficult to interpret.
Ways to center:
x − x̄: maintains original units. Doesn't matter if you're only interested in significance
Convert to z-scores (mean of z = 0)
Using continuous predictors
pick a value at the low end and one at the high end of the continuous moderator and plug each into the regression equation to plot the lines; any values could be used because the variable is continuous, but ±1 SD is typical.
Simple slopes
predict Y from X at a particular level of Z. Is slope of line different from zero? Hold moderator variable constant at a particular level, is regression equation significant?

Ŷ = B0 + (B1 + B3Z)X + B2Z
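A small sketch of this rearrangement: plug a low and a high value of Z (typically ±1 SD) into B1 + B3Z to get each simple slope; the coefficient values below are hypothetical.

    # Hypothetical coefficients from a fitted moderation model
    b0, b1, b2, b3 = 1.0, 0.30, 0.20, 0.40
    sd_z = 1.5

    for z_level in (-sd_z, +sd_z):                      # low and high levels of the moderator
        simple_slope = b1 + b3 * z_level                # slope of Y on X at this level of Z
        intercept = b0 + b2 * z_level                   # intercept of that simple regression line
        print(f"Z = {z_level:+.1f}: Y-hat = {intercept:.2f} + {simple_slope:.2f} * X")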
Polytomous categorical predictors
Create more than one dummy variable to account for multiple levels (number of groups - 1)
Put in all dummy variables as a set
Significance of predictors is in comparison to referent group (is mean significantly different from referent group mean?)
What happens if we change the codes for categorical variables
CODES DON'T MATTER! R value and model summary will be the same, only the regression weights change.
Effects coding: unweighted
Use when no obvious referent group

Compare each group to overall mean
Same as dummy coding, except one group is chosen to be coded -1 on every variable; there is no test comparing that group to the mean.
Constant = unweighted grand mean
Effects coding: weighted
No reference group but want to give weight to groups based on sample size, use when sample representative of population. Compare to overall weighted mean.

Codes are the same as unweighted effects coding except the referent group (e.g., x4) is coded with a negative fraction on each variable:
Numerator: sample size of the coded group (e.g., x1)
Denominator: sample size of the referent group (x4)
Contrast coding
Use when we want to test specific hypotheses

C1: x1 = 1/3, x2 = 1/3, x3 = 0, x4 = -2/3
C2: x1 = -1/2, x2 = 1/2, x3 = 0, x4 = 0
C3: x1 = -1/4, x2 = -1/4, x3 = 3/4, x4 = -1/4

Constant = unweighted grand mean
2 rules for contrast coding
•Codes for each contrast must sum to zero
•Sum of products for each pair of contrasts must sum to zero
∑C1*C2 = 0

Ex. 1/3 + 1/3 + 0 + -2/3 = 0
(1/3)(-1/2) + (1/3)(1/2) + (0)(0) + (-2/3)(0) = 0
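A quick numeric check of both rules for the contrast codes above:

    import numpy as np

    c1 = np.array([1/3, 1/3, 0, -2/3])
    c2 = np.array([-1/2, 1/2, 0, 0])
    c3 = np.array([-1/4, -1/4, 3/4, -1/4])

    print(c1.sum(), c2.sum(), c3.sum())        # rule 1: each contrast sums to zero
    print(c1 @ c2, c1 @ c3, c2 @ c3)           # rule 2: every pair of contrasts is orthogonal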
Polynomial regression
Step 1: regress y onto x; Step 2: add x² and test whether it improves prediction.
If we include a higher-order term, we must include all lower-order terms. x³ can be added if it significantly improves prediction.
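A sketch of the two steps with statsmodels (simulated curvilinear data; the R² change from adding x² is tested against the linear model):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(5)
    x = rng.uniform(-2, 2, size=200)
    y = 1.0 + 0.5 * x + 0.8 * x**2 + rng.normal(size=200)

    linear = sm.OLS(y, sm.add_constant(x)).fit()                              # step 1: y onto x
    quadratic = sm.OLS(y, sm.add_constant(np.column_stack([x, x**2]))).fit()  # step 2: add x^2

    f_stat, p_value, _ = quadratic.compare_f_test(linear)   # does x^2 improve prediction?
    print(quadratic.rsquared - linear.rsquared, p_value)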
Partial regression coefficient
Change in Y per unit change in X1 when we partial out the effect of X2. Regression coefficient in multiple regression context.
Effects in mediation
Direct: c', x on y
Indirect: a*b, pathway through m
Total: c
Dummy codes
D1: x1 = 1, all others = 0
D2: x2 = 1, all others = 0
D3: x3 = 1, all others = 0

Significance of weights is in comparison to referent group
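A sketch of building D1-D3 with pandas (hypothetical group labels; the x4 column is simply left out so that x4 serves as the referent group):

    import pandas as pd

    group = pd.Series(["x1", "x2", "x3", "x4", "x1", "x4"])   # hypothetical group labels
    dummies = pd.get_dummies(group)[["x1", "x2", "x3"]]       # omit x4 -> x4 is the referent group
    print(dummies.astype(int))
    # Each column is 1 for its own group and 0 otherwise; its regression weight
    # compares that group's mean with the referent group's mean.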
Bulging rule
Power down X if in left quadrants, power up X if in right quadrants
Discrepancy
Degree to which a case is in line with the swarm of points

Studentized residuals
Internally studentized residuals
ratio of a case's residual to the SD of the residuals. Values close to 0 are not discrepant, but there is no clear cutoff for identifying discrepancy
Externally studentized residuals
the case is dropped, the regression equation is re-estimated and applied to the held-out case, and the residual and its SD are computed. Distributed as t, so use a t-test for significance
Distance/Leverage
Conveys how far a given case is from the swarm of other cases

Mahalanobis Distance
Mahalanobis Distance
distributed as chi-square; look up the critical value in a table (degrees of freedom = # of predictors). Go through the cases and flag those that exceed the critical value. Considers the combination of variables jointly.
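A sketch of the computation with NumPy/SciPy (simulated predictors with one planted outlier; df = number of predictors, as above):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)
    X = rng.normal(size=(100, 3))                       # three predictors
    X[0] = [4, -4, 4]                                   # plant one unusual case

    diff = X - X.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(X, rowvar=False))
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)  # squared Mahalanobis distance per case

    critical = stats.chi2.ppf(0.999, df=X.shape[1])     # df = number of predictors
    print(np.where(d2 > critical)[0])                   # cases that exceed the critical value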
Influence
Leverage X Discrepancy. Addresses the impact of removing the case on the resulting regression equation and coefficients.
Global Measures of Influence
Look at impact on overall model (R^2)

Cook's D
DFFit
Specific Measures of Influence
impact on individual regression weights

DFBeta
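statsmodels exposes these global and specific measures through its OLSInfluence diagnostics; a sketch on simulated data:

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import OLSInfluence

    rng = np.random.default_rng(7)
    X = rng.normal(size=(80, 2))
    y = X @ np.array([0.5, 0.3]) + rng.normal(size=80)

    results = sm.OLS(y, sm.add_constant(X)).fit()
    influence = OLSInfluence(results)

    cooks_d = influence.cooks_distance[0]      # global: impact on the whole solution
    dffits = influence.dffits[0]               # global: change in the case's own fitted value
    dfbetas = influence.dfbetas                # specific: change in each regression weight
    print(cooks_d.max(), np.abs(dffits).max(), np.abs(dfbetas).max(axis=0))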
Outliers in the solution
look for outliers after the analysis has been run: how well does the model fit each case? Use casewise diagnostics
Multicollinearity
Highly correlated predictors are included in the analysis
Changes weights and increases standard error of weights
Predictions are less accurate and more unstable
Need to consider all variables simultaneously
How do we evaluate multicollinearity?
Treat X1 as DV, run regression with other predictors, interested in R2. If small, X1 is unique.
VIF
Tolerance
VIF
variance inflation factor. How large is the standard error of a particular coefficient relative to what it would have been if the predictor were completely unrelated to the others? Rule of thumb is to define serious as VIF > 10
Tolerance
reciprocal of VIF, how much variance in a given predictor is independent of the others? Ranges from 0-1.0, 0 = no unique variance, 1.0 = totally independent. Rule of thumb is to define serious as tolerance < .10
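A sketch of both indices using statsmodels' variance_inflation_factor (simulated predictors, with x2 built to be collinear with x1; tolerance is just 1/VIF):

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    rng = np.random.default_rng(8)
    x1 = rng.normal(size=200)
    x2 = 0.9 * x1 + 0.2 * rng.normal(size=200)          # deliberately collinear with x1
    x3 = rng.normal(size=200)

    exog = sm.add_constant(np.column_stack([x1, x2, x3]))
    for i, name in enumerate(["x1", "x2", "x3"], start=1):   # skip the constant column
        vif = variance_inflation_factor(exog, i)
        print(name, "VIF =", round(vif, 2), "tolerance =", round(1 / vif, 3))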
Violations of relationship form
Plot residuals against each predictor and evaluate linearity visually (a loess line acts like a hand-drawn regression line): is the relationship generally linear?
To fix: incorporate nonlinear terms
Missing predictors
can't check for it empirically; know the literature. If you measured the predictor, include it.
To fix: if you didn't measure the predictor, you're screwed. If you did, include it; if R² increases, the model improves.
Violations: reliability
Use measures that have demonstrated reliability. Unreliability adds random error to the variance in the system. Usually not a big worry because unreliability makes it more difficult to find a relationship.
To fix: use an equation to correct for unreliability, ignore it, or use SEM.
Heteroscedasticity
Visual approach: plot the residuals against each predictor and against the predicted criterion. There should be no relationship; variability should be equal around the line. Concern if the spread is small at one end and fans out at the other.
Empirical approach: use the residuals as the DV and the individual predictors as IVs and conduct t-tests; or split a predictor into groups (e.g., 5 groups with 20% of the data each) and run an ANOVA. If Levene's test is significant, the variance differs across groups.
Heteroscedasticity is rare. Rule of thumb: a ratio of largest to smallest conditional variance > 10 is a problem.
Weighted Least Squares: differentially weight cases based on variance of residuals. Cannot be interpreted the same as OLS.
Nonindependence of residuals
Visual: plot case number (x) against residual (y); there should be no relationship.
Empirical: Durbin-Watson test. Centered around 2; look for extreme values toward 0 or 4. Is there a relationship between case order and the residuals? Worry about it in studies in which cases are meaningfully ordered (e.g., something at the beginning of data collection differs from the end); in most data sets, case numbers are arbitrary.
To fix:
If related to clustering, create dummy variables to represent the groups, or use multilevel modeling procedures (effects of higher-level variables on lower-level variables; appropriate when cases are nested).
If related to serial dependency, transform the data.
Checking if residuals are normally distributed
Visual: histogram, q-q plot (plot residuals), best approach
Empirical: Shapiro-Wilk test, conservative
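A sketch of both checks on a fitted model's residuals (simulated data; sm.qqplot draws the Q-Q plot and scipy's shapiro runs the test):

    import numpy as np
    import statsmodels.api as sm
    from scipy import stats

    rng = np.random.default_rng(9)
    X = rng.normal(size=(120, 2))
    y = X @ np.array([0.4, 0.3]) + rng.normal(size=120)

    results = sm.OLS(y, sm.add_constant(X)).fit()
    residuals = results.resid

    w_stat, p_value = stats.shapiro(residuals)          # empirical: Shapiro-Wilk (conservative)
    print(w_stat, p_value)
    sm.qqplot(residuals, line="s")                      # visual: Q-Q plot of the residuals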
R
Correlation between observed and predicted values of the DV