39 Cards in this Set
Threats to causal relationships
|
Reverse causation, confounding variables
|
|
Confounding variables
|
Additional factors that might lead us to give too much “credit” to an explanatory variable.
|
|
What does the coefficient of X1 mean? (multivariate regression with X1 and X2 as variables)
|
This is the relationship between X1 and Y:
- controlling for X2
- holding X2 constant
- controlling for the linear relationship between X2 and Y |
|
What does the coefficient of the constant mean? (multivariate regression with X1 and X2 as variables)
|
This is the expected value of Y when both X1 AND X2 are 0.
|
|
R-squared
|
Measure of the proportion of the variance in the dependent variable explained by the independent variables
|
|
Calculate R-squared
|
(Total variation in Y - Sum of Squared Residuals)/Total variation in Y
|
|
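The formula on this card can be sketched in code. A minimal illustration (function name and data are my own, not from the card):

```python
# Sketch: R-squared as (total variation - unexplained variation) / total variation.
def r_squared(y, y_hat):
    """R^2 = (TSS - SSR) / TSS, where TSS is the total sum of squares
    and SSR is the sum of squared residuals."""
    mean_y = sum(y) / len(y)
    tss = sum((yi - mean_y) ** 2 for yi in y)               # total variation in Y
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))   # unexplained variation
    return (tss - ssr) / tss

# A perfect fit explains all of the variation:
print(r_squared([1, 2, 3], [1, 2, 3]))  # -> 1.0
```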
When would you use an F-test?
|
- Nominal variables
- Variables likely to be highly correlated, but important predictors |
|
Unrestricted model
|
Includes the independent variables whose joint significance you want to test
|
|
Restricted model
|
The same model, excluding the independent variables being tested
|
|
Interpret the coefficient on X1 (multivariate regression).
|
The coefficient on X1 indicates that - controlling for other variables in the model - _ (e.g. for every additional unit, Y changes by _...). Because the p-value is less than .05 we can be 95% confident that this relationship is different from zero (if significant).
|
|
Dichotomous dependent variable.
|
Probability or percentage, use PERCENTAGE POINTS when interpreting the coefficient
|
|
Why would the coefficient on X1 drop when you add X2 and X3 to the model?
|
X2 and X3 are related to X1 as well as Y. Failing to account for X2 and X3 would result in overestimating the effects of X1. Without adding X2 and X3, the relationship between X1 and Y is confounded by other factors that might affect Y.
|
|
When do you log an independent variable?
|
When the data is skewed
|
|
How do you confirm that a variable contributes to the predictive power of the model when you are using that variable AND its squared term?
|
Use an F-test to check for joint significance.
|
|
F test
|
A way to test whether adding a set of variables reduces the sum of squared residuals enough to justify throwing these new variables into the model.
|
|
What does the F test depend on?
|
- How much the SSR is reduced
- How many variables you are adding
- How many cases you have to work with |
|
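The three quantities on this card map directly onto the standard F formula. A sketch (variable names and numbers are illustrative, not from the cards):

```python
# F = [(SSR_restricted - SSR_unrestricted) / q] / [SSR_unrestricted / (n - k - 1)]
def f_statistic(ssr_restricted, ssr_unrestricted, q, n, k):
    """q = number of variables added; n = number of cases;
    k = number of independent variables in the unrestricted model."""
    numerator = (ssr_restricted - ssr_unrestricted) / q   # SSR reduction per added variable
    denominator = ssr_unrestricted / (n - k - 1)          # remaining error per degree of freedom
    return numerator / denominator

# Adding 2 variables drops the SSR from 100 to 80 with n = 53 cases, k = 4 IVs:
print(f_statistic(100, 80, q=2, n=53, k=4))
```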
Why would you use nominal rather than ordered variables (e.g. party identification)?
|
The substantive change from one value to the next might not be equal across the scale, e.g. the move from -3 to -2 might not mean the same as the move from 0 to 1.
|
|
What does the coefficient on X1-squared tell us?
|
Including X1-squared and allowing for a curved relationship between X1 and Y improves the fit of the model (if significant).
|
|
Interpret the coefficient on X1*X2.
|
For every one unit increase in X1, the slope of the relationship between X2 and Y will (increase/decrease) by (coefficient).
AND for every one unit increase in X2, the slope on the relationship between X1 and Y will (increase/decrease) by (coefficient). |
|
What does it mean when the coefficient of X1*X2 is statistically significant?
|
This means that the slope of the relationship between X1 and Y is significantly different depending on the value of X2 (and vice versa).
|
|
Interpreting a graph of relationships involving an interaction term
|
- Is the slope greater for one group than the other (is the relationship stronger)
- Where is the difference between the two groups greatest? |
|
Interpret the coefficient on X1 (multivariate regression including an interaction term)
|
The slope of the relationship between X1 and Y = _ when X2 = 0 (and vice versa).
|
|
What does random assignment guarantee?
|
Treatment and control groups will be similar except for the fact that one group is “treated” - any change will be due to treatment
|
|
External validity
|
Would what you find actually happen outside of the context of an experiment?
|
|
Field experiment
|
Intervention done while people are going about their business.
PRO: external validity
CON: no experimental control, ethical issues |
|
Pros/cons of observational analysis
|
PRO
- Can “find” data – don’t have to gather it yourself
- Sometimes the only reasonable approach (e.g. what causes war?)
CON
- Difficult to definitively determine causation, BUT if our model is good we can get good estimates |
|
Pros/cons of experiments
|
PRO
- Identifying causality - eliminates confounding variables
- Random assignment, experimental control
- Detail - e.g. break down process into smaller units
- Precise measurement
CON
- External validity (can results be replicated outside of the context of an experiment?)
- Unrepresentative subject pools (college students)
- Experimenter bias (e.g. demand characteristics, expectancy effects) |
|
Experiment null and alternative hypotheses
|
H0: no difference between the treatment groups
HA: difference between the treatment groups |
|
Test for difference in proportions
|
Difference of proportions/Standard error of this difference
|
|
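The ratio on this card can be written out in code. One common (unpooled) form of the standard error is assumed here; the card does not specify pooled vs. unpooled:

```python
import math

# z = (p1 - p2) / SE, with SE = sqrt(p1(1-p1)/n1 + p2(1-p2)/n2)
def diff_proportions_z(p1, n1, p2, n2):
    """Test statistic for a difference in proportions (unpooled SE)."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

# 60% vs. 50% in two groups of 100 gives a z of about 1.43 -
# not far enough from 0 to reject the null at the .05 level.
print(diff_proportions_z(0.6, 100, 0.5, 100))
```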
CLT in relation to difference in proportions
|
In repeated sampling, the distribution of our estimates of the mean (or difference of means or slope) will be normally distributed and centered over the true population value
|
|
Why might the standard error of a regression coefficient be small?
|
Large N
|
|
Find the T-value of a regression coefficient.
|
Coefficient/Standard Error
|
|
How would adding X2 affect the coefficient of X1? Why/why not and how?
|
Depends on whether X2 is associated with X1. If it is, then adding X2 would probably reduce the coefficient on X1. Including X2 as a control is a way to avoid giving X1 explanatory "credit" for variation in Y that can also be explained by X2.
|
|
Is X1 a statistically significant predictor of Y? How do you know?
|
If the p-value is less than .05, then X1 is a statistically significant predictor of Y.
|
|
What does the coefficient on _ mean?
|
This is the estimate of the relationship between X1 and Y.
USE "predicted" "expect" "estimated" e.g. For every additional unit of X1, we expect Y to increase by _. |
|
Is the relationship between X1 and Y causal?
|
Use theory to judge. If the relationship is not causal, this is likely due to X1 being associated with other factors that cause Y.
|
|
Suppression
|
Omitting a variable from the regression model CAN suppress (shrink) the estimated relationship between an independent variable and Y
|
|
How do we know that there is actually a difference in proportions?
|
If 0 were the true difference, it would be very unlikely that we would find a difference this many (the T-value) standard errors from 0 by chance
|
|
Demand Effect
|
The act of performing an experiment itself creates the results, e.g. a researcher unconsciously changing their behavior to get the result they want; considered a confounding variable
|