39 Cards in this Set

Threats to causal relationships
Reverse causation, confounding variables
Confounding variables
Additional factors that might lead us to give too much “credit” to an explanatory variable.
What does the coefficient of X1 mean? (multivariate regression with X1 and X2 as variables)
This is the relationship between X1 and Y
- controlling for X2
- holding X2 constant
- controlling for the linear relationship between X2 and Y
What does the coefficient of the constant mean? (multivariate regression with X1 and X2 as variables)
This is the expected value of Y when both X1 AND X2 are 0.
R-squared
Measure of the proportion of the variance in the dependent variable explained by the independent variables
Calculate R-squared
(Total variation in Y - Sum of Squared Residuals) / Total variation in Y, i.e. R-squared = (TSS - SSR)/TSS = 1 - SSR/TSS, where TSS is the total sum of squares of Y.
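A minimal sketch of this calculation in NumPy, with made-up data and a one-predictor fit:

```python
import numpy as np

# Made-up data for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Fit a simple OLS line y = a + b*x
b, a = np.polyfit(x, y, 1)            # polyfit returns [slope, intercept] for degree 1
residuals = y - (a + b * x)

ssr = np.sum(residuals ** 2)          # sum of squared residuals
tss = np.sum((y - y.mean()) ** 2)     # total variation in Y
r_squared = (tss - ssr) / tss         # equivalently 1 - ssr/tss
print(round(float(r_squared), 3))
```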
When would you use an F-test?
- Nominal variables
- Variables likely to be highly correlated, but important predictors
Unrestricted model
Includes the independent variables whose joint significance you want to test
Restricted model
The same model, excluding the independent variables being tested
Interpret the coefficient on X1 (multivariate regression).
The coefficient on X1 indicates that - controlling for other variables in the model - _ (e.g. for every additional unit, Y changes by _...). Because the p-value is less than .05, we can be 95% confident that this relationship is different from zero (if significant).
Dichotomous dependent variable.
The predicted value is a probability (or percentage); use PERCENTAGE POINTS when interpreting the coefficient
Why would the coefficient on X1 drop when you add X2 and X3 to the model?
X2 and X3 are related to X1 as well as Y. Failing to account for X2 and X3 would result in overestimating the effects of X1. Without adding X2 and X3, the relationship between X1 and Y is confounded by other factors that might affect Y.
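A minimal simulated illustration (all data made up): here x2 causes both x1 and y, so omitting it inflates the coefficient on x1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)            # x1 is correlated with x2
y = 1.0 * x1 + 2.0 * x2 + rng.normal(size=n)  # true slope on x1 is 1.0

def coefs(X_cols, y):
    """OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones(len(y))] + X_cols)
    return np.linalg.lstsq(X, y, rcond=None)[0]

print(coefs([x1], y)[1])       # ~2.0: x1 gets credit for x2's effect on y
print(coefs([x1, x2], y)[1])   # ~1.0: controlling for x2 recovers the true slope
```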
When do you log an independent variable?
When the data is skewed
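A minimal sketch with a hypothetical right-skewed variable (e.g. income): before logging, the mean sits far above the median; the log transform pulls in the long tail.

```python
import numpy as np

rng = np.random.default_rng(0)
income = rng.lognormal(mean=10, sigma=1, size=1000)  # hypothetical right-skewed variable
logged = np.log(income)

# Mean far above median signals right skew; after logging they roughly coincide
print(np.mean(income) / np.median(income))   # well above 1
print(np.mean(logged) / np.median(logged))   # close to 1
```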
How do you confirm that a variable contributes to the predictive power of the model when you are using that variable AND its squared term?
Use an F-test to check for joint significance.
F test
A way to test whether adding a set of variables reduces the sum of squared residuals enough to justify throwing these new variables into the model.
What does the F test depend on?
- How much the SSR is reduced
- How many variables you are adding
- How many cases you have to work with
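A minimal sketch of the restricted-vs-unrestricted comparison, with made-up data and hypothetical variable names; it computes F = ((SSR_restricted - SSR_unrestricted) / q) / (SSR_unrestricted / (n - k - 1)), where q is the number of variables added and k the number of predictors in the unrestricted model:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 200

# Made-up data: y depends on x1 and (weakly) x2; x2 and x3 are tested jointly
x1, x2, x3 = rng.normal(size=(3, n))
y = 1 + 2 * x1 + 0.5 * x2 + rng.normal(size=n)

def ssr(X_cols, y):
    """Sum of squared residuals from an OLS fit (with intercept)."""
    X = np.column_stack([np.ones(len(y))] + X_cols)
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return np.sum(resid ** 2)

ssr_r = ssr([x1], y)             # restricted model: x1 only
ssr_u = ssr([x1, x2, x3], y)     # unrestricted model: adds x2 and x3

q, k = 2, 3                      # variables added; predictors in unrestricted model
F = ((ssr_r - ssr_u) / q) / (ssr_u / (n - k - 1))
p = 1 - stats.f.cdf(F, q, n - k - 1)
print(F, p)
```

The same comparison tests a variable jointly with its squared term.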
Why would you use nominal rather than ordered variables (e.g. party identification)?
The substantive change from one value to the next might not be equal across the scale, e.g. the move from -3 to -2 might not mean the same thing as the move from 0 to 1.
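A minimal sketch of the nominal approach, with a hypothetical party variable: each category becomes its own indicator (dummy) variable, with one category omitted as the reference group.

```python
import numpy as np

# Hypothetical 3-category party ID for six respondents
party = np.array(["dem", "ind", "rep", "dem", "rep", "ind"])

# One indicator per category; "dem" is the omitted reference group,
# absorbed by the intercept
ind = (party == "ind").astype(int)
rep = (party == "rep").astype(int)
X = np.column_stack([np.ones(len(party), dtype=int), ind, rep])
print(X)
```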
What does the coefficient on X1-squared tell us?
Including X1-squared and allowing for a curved relationship between X1 and Y improves the fit of the model (if significant).
Interpret the coefficient on X1*X2.
For every one unit increase in X1, the slope of the relationship between X2 and Y will (increase/decrease) by (coefficient).
AND for every one unit increase in X2, the slope on the relationship between X1 and Y will (increase/decrease) by (coefficient).
What does it mean when the coefficient of X1*X2 is statistically significant?
This means that the slope of the relationship between X1 and Y is significantly different depending on the value of X2 (and vice versa).
Interpreting a graph of relationships involving an interaction term
- Is the slope greater for one group than the other (is the relationship stronger)?
- Where is the difference between the two groups greatest?
Interpret the coefficient on X1 (multivariate regression including an interaction term)
The slope of the relationship between X1 and Y = _ when X2 = 0 (and vice versa).
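A minimal sketch with made-up data and a dichotomous x2: the coefficient on x1 is its slope when x2 = 0, and adding the interaction coefficient gives its slope when x2 = 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Made-up data: the slope of x1 depends on a dichotomous x2
x1 = rng.normal(size=n)
x2 = rng.binomial(1, 0.5, size=n)
y = 1 + 0.5 * x1 + 1.0 * x2 + 2.0 * x1 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]

print("slope of x1 when x2 = 0:", round(b[1], 2))          # coefficient on x1
print("slope of x1 when x2 = 1:", round(b[1] + b[3], 2))   # add interaction term
```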
What does random assignment guarantee?
Treatment and control groups will be similar except for the fact that one group is “treated” - any difference in outcomes can be attributed to the treatment
External validity
Would what you find actually happen outside of the context of an experiment?
Field experiment
Intervention done while people are going about their business. PRO: external validity
CON: no experimental control, ethical issues
Pros/cons of observational analysis
PRO
- Can “find” data – don’t have to gather it yourself
- Sometimes the only reasonable approach (e.g. what causes war?)
CON
- Difficult to definitively determine causation
BUT if our model is good we can get good estimates
Pros/cons of experiments
PRO
- Identifying causality
- Eliminates confounding variables
- Random assignment, experimental control
- Detail - e.g. break down process into smaller units
- Precise measurement

CON
- External validity (can results be replicated outside the context of an experiment?)
- Unrepresentative subject pools (college students)
- Experimenter bias (e.g. demand characteristics, expectancy effects)
Experiment null and alternative hypotheses
H0: no difference between the treatment groups
HA: difference between the treatment groups
Test for difference in proportions
(Difference of proportions) / (standard error of this difference)
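A minimal sketch with made-up counts; one common version of the test uses the unpooled standard error sqrt(p1(1-p1)/n1 + p2(1-p2)/n2).

```python
import math

# Hypothetical counts: 60/200 successes in treatment, 45/200 in control
p1, n1 = 60 / 200, 200
p2, n2 = 45 / 200, 200

diff = p1 - p2
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z = diff / se    # compare to +/-1.96 for the .05 level
print(round(diff, 3), round(se, 3), round(z, 2))
```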
CLT in relation to difference in proportions
In repeated sampling, the distribution of our estimates of the mean (or difference of means or slope) will be normally distributed and centered over the true population value
Why might the standard error of a regression coefficient be small?
Large N
Find the T-value of a regression coefficient.
Coefficient/Standard Error
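A minimal sketch with hypothetical numbers, computing the t-value and its two-sided p-value (degrees of freedom n - k - 1, where k is the number of predictors):

```python
from scipy import stats

# Hypothetical estimate: coefficient, standard error, sample size, predictors
coef, se, n, k = 0.42, 0.15, 200, 3

t = coef / se
p = 2 * (1 - stats.t.cdf(abs(t), df=n - k - 1))
print(round(t, 2), round(p, 4))
```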
How would adding X2 affect the coefficient of X1? Why/why not and how?
Depends on whether X2 is associated with X1. If it is, then adding X2 would probably reduce the coefficient on X1. Including X2 as a control is a way to avoid giving X1 explanatory "credit" for variation in Y that can also be explained by X2.
Is X1 a statistically significant predictor of Y? How do you know?
If the p-value is less than .05, then X1 is a statistically significant predictor of Y.
What does the coefficient on _ mean?
This is the estimate of the relationship between X1 and Y.
USE "predicted," "expect," or "estimated"
e.g. For every additional unit of X1, we expect Y to increase by _.
Is the relationship between X1 and Y causal?
Use theory to make the case; if the relationship is not causal, it is likely because X1 is associated with other factors that cause Y.
Suppression
Omitting a variable from the regression model CAN suppress the estimate of an independent relationship
How do we know that there is actually a difference in proportions?
If 0 were the true difference, it would be very unlikely that we would find a difference (t-value) standard errors from 0 by chance
Demand Effect
Running the experiment itself creates the results, e.g. a researcher unconsciously changing their behavior to get the result they want - considered a confounding variable