64 Cards in this Set
why do we conduct statistical analyses?
|
to assess if the IV had any effect on the DV (assess systematic variance)
|
|
even when confounds are controlled for, the means of 2 groups can be significantly different due to...
|
error variance
|
|
we can conclude that the IV has an effect when
|
the difference between the means of the experimental conditions is LARGER than what we would expect solely due to error variance
|
|
the key word with inferential statistics is...
|
probability: that the difference we observe b/w the means is due to error variance
|
|
rejecting the null hypothesis means...
|
we conclude that the IV did indeed affect the DV
|
|
failing to reject the null hypothesis means...
|
we cannot conclude that the IV affected the DV (a lack of evidence for an effect, not proof of no effect)
|
|
4 possible outcomes for deciding if IV had an effect
|
1) correct decision:
a. reject null
b. fail to reject null
2) incorrect decision:
a. Type I error
b. Type II error |
|
type I error
|
researcher erroneously concludes the null hypothesis is false and rejects it (incorrectly rejects the null)
--> you say the IV had an effect on the DV when it did not |
|
type II error
|
researcher mistakenly fails to reject the null when it is false (incorrectly accepts the null)
--> you say the IV had no effect on the DV when really it did |
|
when do you use a t-test and an f-test?
|
t-test: with 2 groups
f-test: with more than 2 groups |
|
what does a t-test do?
|
error variance in the data is calculated to determine how much the means are expected to differ solely due to random chance
|
|
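The t-test on the card above can be sketched in code. This is a minimal illustration using SciPy's `ttest_ind`; the two groups of DV scores are made up for the example, not taken from the deck.

```python
# Minimal sketch of an independent-samples t-test (scores are invented).
from scipy import stats

control = [5, 7, 6, 8, 6, 7, 5, 6]      # DV scores, control condition
treatment = [8, 9, 7, 10, 9, 8, 9, 10]  # DV scores, experimental condition

# ttest_ind pools the within-group (error) variance to estimate how much
# the two means should differ by chance alone
t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

Here the 2.5-point difference between the means is far larger than the pooled error variance predicts, so p comes out well under .05.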
if absolute value of the calculated t-value exceeds the critical value of t obtained from the table, that means...
|
statistically significant (reject null)
|
|
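Instead of a printed table, the critical value of t can be looked up with SciPy's t distribution; the alpha, degrees of freedom, and calculated t below are illustrative choices, not values from the deck.

```python
# Sketch: looking up the critical t with SciPy instead of a table.
from scipy import stats

alpha = 0.05
df = 14  # e.g., two groups of 8 participants: (8 - 1) + (8 - 1)

# two-tailed critical value: the t that cuts off alpha/2 in each tail
t_crit = stats.t.ppf(1 - alpha / 2, df)

calculated_t = 4.83  # hypothetical t-value from an analysis
significant = abs(calculated_t) > t_crit  # True -> reject the null
print(f"critical t = {t_crit:.3f}, reject null: {significant}")
```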
directional vs nondirectional hypotheses: define and what type of test is appropriate
|
-directional: states which of the two condition means is expected to be larger... uses one-tailed t-test
-nondirectional: only states that the 2 means are expected to be different... uses two-tailed t-test |
|
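The one-tailed vs two-tailed distinction maps onto SciPy's `alternative` argument; the groups below are invented illustration data.

```python
# Sketch of directional vs nondirectional tests via `alternative`.
from scipy import stats

group_a = [12, 14, 11, 15, 13, 14]
group_b = [10, 11, 9, 12, 10, 11]

# nondirectional hypothesis -> two-tailed test
_, p_two = stats.ttest_ind(group_a, group_b, alternative="two-sided")

# directional hypothesis (group_a expected larger) -> one-tailed test
_, p_one = stats.ttest_ind(group_a, group_b, alternative="greater")

# when the observed difference is in the predicted direction,
# the one-tailed p is half the two-tailed p
print(p_two, p_one)
```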
paired t-test: what does it take into account? what does it due overall?
|
-takes into account the fact that the participants in 2 conditions are similar (/identical) on an attribute related to the DV
-reduces the estimate of error variance used to calculate t, which leads to a more powerful test of the null hypothesis |
|
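The power gain from pairing can be shown by running both tests on the same made-up pre/post scores (the data are invented for illustration).

```python
# Sketch contrasting a paired t-test with an (inappropriate)
# independent-samples test on the same repeated-measures data.
from scipy import stats

before = [10, 14, 8, 12, 11, 13, 9, 15]   # same participants, condition 1
after = [12, 15, 10, 15, 12, 16, 11, 17]  # same participants, condition 2

# ttest_rel works on per-participant difference scores, so stable
# individual differences drop out of the error-variance estimate
t_paired, p_paired = stats.ttest_rel(after, before)

# treating the same data as independent groups leaves that variance in,
# giving a less powerful test of the null hypothesis
t_indep, p_indep = stats.ttest_ind(after, before)
print(p_paired, p_indep)
```

With these scores the paired test is significant while the independent test is not, because most of the spread comes from individual differences the pairing removes.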
computer analysis- 2 good and 2 bad things
|
-greatly increased speed & flexibility
-allows researchers to conduct preliminary analyses to examine data quality
-problem of data entry
-problem of improper use of statistical tests |
|
alpha level- define, what is it set at, and aka
|
the probability of making a type I error
-conventionally set at .05
-aka the significance level (the p-value from each test is compared against it) |
|
beta level
|
the probability of making a type II error
|
|
ways Beta can be increased
|
-improper measurement of DV
-unreliable measurement technique
-mistakes in collecting, coding, analyzing data
-too few participants to detect the effect of the IV
-very heterogeneous samples
-poor experimental control |
|
you reduce the likelihood of type II errors by designing experiments that have...
|
high power (power= probability study will correctly reject null when it is false)
|
|
power can be thought of as
|
the opposite of beta (1-beta)
|
|
power is related to..
|
the number of participants in a study
|
|
a power analysis can be conducted prior to research in order to determine..
|
how many participants are needed to detect the effects of the IV
|
|
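A back-of-the-envelope power analysis can be sketched with the normal approximation. This closed-form z-test formula is a simplification (dedicated tools compute exact t-test power), and the effect size used is an invented example.

```python
# Approximate sample size for a two-sided test at a given power,
# using the normal (z) approximation -- a simplification, not an
# exact t-test power analysis.
from math import ceil
from scipy import stats

def n_for_power(effect_size, alpha=0.05, power=0.80):
    z_alpha = stats.norm.ppf(1 - alpha / 2)  # critical z, two-tailed alpha
    z_beta = stats.norm.ppf(power)           # z cutting off beta in one tail
    return ceil(((z_alpha + z_beta) / effect_size) ** 2)

# a medium effect (d = .5) at power .80 needs roughly 32 participants
print(n_for_power(0.5))
```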
power of .8 means...
|
you have an 80% chance of detecting an effect that is there, and 20% chance of a type II error
|
|
effect size- definition & why done
|
-the proportion of variability in the DV that is due to the IV
-done b/c when we reject the null and conclude the IV had an effect on the DV, we'd like to know how strong it is |
|
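One common way to express "the proportion of DV variability due to the IV" is eta-squared, the ratio of between-groups variability to total variability; the three conditions below are made-up illustration data.

```python
# Sketch of eta-squared: proportion of DV variance attributable to the IV.
import numpy as np

groups = [np.array([4, 5, 6, 5]),   # condition 1 scores (invented)
          np.array([7, 8, 7, 8]),   # condition 2
          np.array([5, 6, 5, 6])]   # condition 3
all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

# systematic (between-groups) variability
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# total variability in the DV
ss_total = ((all_scores - grand_mean) ** 2).sum()

eta_squared = ss_between / ss_total
print(round(eta_squared, 3))
```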
the more t-tests you do...
|
the more chance one or more is wrong (more likely to have a type I error)
|
|
for every t-test, what is the chance of being wrong?
|
5% (when alpha = .05)
|
|
type I error increases as
|
we perform more t-tests
|
|
formula for the probability of making a type I error
|
1-(1-alpha)^c
c = number of tests
alpha = usually .05 |
|
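Plugging a few values of c into the card's formula shows how quickly the familywise error rate inflates:

```python
# Familywise type I error rate from the formula 1 - (1 - alpha)^c
alpha = 0.05
for c in (1, 3, 10):
    familywise = 1 - (1 - alpha) ** c
    print(c, round(familywise, 3))
# with 10 tests, the chance of at least one type I error is about .40
```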
you can prevent type I error by using.. (2 options)
|
-Bonferroni adjustment (not used often)
-analysis of variance (preferred) |
|
Bonferroni adjustment: definition and problem w/ it
|
-divides desired alpha level by # of tests (to prevent type I error)
-more stringent alpha increases risk of type II error |
|
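The adjustment itself is one division; the count of 10 tests below is an arbitrary example.

```python
# Sketch of the Bonferroni adjustment: divide desired alpha by # of tests.
desired_alpha = 0.05
num_tests = 10
adjusted_alpha = desired_alpha / num_tests  # each test now uses .005

# the familywise error rate drops back under the desired level
familywise = 1 - (1 - adjusted_alpha) ** num_tests
print(adjusted_alpha, round(familywise, 4))
```

The trade-off noted on the card is visible here: each individual test now needs p < .005, which makes type II errors more likely.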
when the number of t-tests is large, to prevent type I errors we prefer to use
|
analysis of variance (ANOVA)
|
|
ANOVA- when used? why? what does it do?
|
-used to analyze data from designs involving more than TWO conditions
-we do this b/c it analyzes differences b/w all condition means in an experiment simultaneously
-determines whether any set of means differs from another using a single test that holds alpha at .05 (rather than doing each t-test separately) |
|
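A one-way ANOVA is one call in SciPy; the three conditions below are made-up illustration data.

```python
# Sketch of a one-way ANOVA across three invented conditions.
from scipy import stats

low = [3, 4, 3, 5, 4]
medium = [5, 6, 5, 7, 6]
high = [8, 9, 7, 9, 8]

# a single F-test compares all three condition means at once,
# holding alpha at .05 instead of running three separate t-tests
f_stat, p_value = stats.f_oneway(low, medium, high)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```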
variance between groups (__) and within groups (__)
|
-variance b/w groups= IV
-variance w/in groups= error variance |
|
if variance between experimental conditions (groups) is markedly greater than variance within groups
|
we conclude the IV is having an effect (IV greater than error variance)
|
|
in a study in which the IV has NO EFFECT on the DV, we can estimate error variance in the data in 2 ways
|
1) look @ variability among participants w/in each condition
2) look @ differences b/w condition means |
|
when IV has no effect, variability among condition means and variability within groups are reflections of
|
error variance
|
|
f-test
|
ratio of the variance between groups to the variance w/in groups (F= between/within)
|
|
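The F ratio on the card can be computed by hand and checked against SciPy; the group scores are invented for illustration.

```python
# Sketch: F = (variance between groups) / (variance within groups),
# computed manually and verified against scipy.stats.f_oneway.
import numpy as np
from scipy import stats

groups = [np.array([3.0, 4, 3, 5, 4]),
          np.array([5.0, 6, 5, 7, 6]),
          np.array([8.0, 9, 7, 9, 8])]
all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()
k, n = len(groups), len(all_scores)

# mean square between groups: systematic + error variance
ms_between = sum(len(g) * (g.mean() - grand_mean) ** 2
                 for g in groups) / (k - 1)
# mean square within groups: error variance only
ms_within = sum(((g - g.mean()) ** 2).sum() for g in groups) / (n - k)

f_manual = ms_between / ms_within
f_scipy, _ = stats.f_oneway(*groups)
print(round(f_manual, 3), round(f_scipy, 3))
```

Because the numerator here is much larger than 1 times the denominator, these data would suggest a real IV effect rather than error variance alone.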
if the IV has no effect, in an f-test....
|
the F should be around 1.00 (numerator and denominator should be estimates of same thing)
|
|
if the IV has an effect, in an f-test...
-what does it depend on? |
we expect the numerator to be larger (b/c it contains systematic variance in addition to error variance)
-how much larger depends on critical value |
|
sum of squares between-groups... if IV has no effect, then
|
we expect the group means to be equal aside from differences due to error variance, and any group mean should equal the mean of all group means (the grand mean)
|
|
in ANOVA we know that... BUT...
|
-that one group significantly differs from another, but not which groups (when we have more than 2 conditions)
-to identify which means differ we conduct post-hoc tests |
|
in addition to main effects, an ANOVA can tell us about... (& how would we interpret that)
|
interactions (look @ simple main effects in order to interpret interactions)
|
|
what does a MANOVA test? what 2 reasons do we use it for?
|
-tests differences between the means of 2 or more conditions on 2 OR MORE dependent variables simultaneously
-used for:
1) conceptually related dependent variables (e.g., 10 different depression test scores)
2) reduction of type I error (more tests => more likely) |
|
a MANOVA works by creating.... (called:)
|
a composite variable that is the weighted sum of the original DVs called the CANONICAL variable
|
|
quasi-experimental design
|
-when we lack control over assignment of participants to conditions and/or do not manipulate the causal variable of interest (in many cases used to indicate the variable is not a true IV manipulated by the researcher but an event that occurred for other reasons)
|
|
quasi-independent variable
|
used to indicate that the variable is not a true IV because it has not been manipulated by the researcher
|
|
benefit and 2 drawbacks of quasi-experimental designs
|
(+): allow us to examine real-world phenomena
(-): 1) no initial equivalence (cannot randomly assign participants to conditions)
2) lack of internal validity (the degree to which the researcher draws accurate conclusions about the effect of the IV) b/c extraneous variables can't be controlled |
|
4 types of quasi-experimental designs
|
-pretest-posttest designs
-time series designs
-longitudinal designs
-program evaluations |
|
how does a one-group pretest-posttest design work? what is wrong with it?
|
pretest-treatment-posttest
-very weak b/c lack of control, threats to internal validity |
|
what are threats to internal validity in pretest-posttest designs?
|
-maturation, history, testing effects, extraneous factors, REGRESSION TOWARDS MEAN (low scores go up, high scores go down)
|
|
what can we do to obtain more internal validity in a pretest-posttest design?
|
a control group is added--> nonequivalent control group design
|
|
2 types of nonequivalent control group designs
|
1) posttest-only design
2) pretest-posttest design |
|
problems with nonequivalent control group design
|
-no initial equivalence b/c of lack of random assignment-> true control impossible
-selection bias (don't know if groups were similar to begin w/) |
|
nonequivalent pretest-posttest design allows for
|
comparison b/w groups before treatment condition (to get baseline)
|
|
time series designs
|
-several pretests, several posttests (done to eliminate some weaknesses of the nonequivalent control group design)
|
|
3 types of time series designs
|
1) simple interrupted: observing participants' behavior several times before the quasi-IV is introduced and several times after
2) interrupted time series w/ reversal: taking several pretest measures before the quasi-IV and then before/after it is removed
3) control group interrupted: measuring more than 1 group on several occasions, only 1 of which receives the quasi-IV |
|
3 weaknesses of interrupted time series with reversal
|
1) researchers may not have power to remove quasi-IV
2) effects of the quasi-IV remain after it's removed
3) removal of the quasi-IV may produce unintended changes |
|
Longitudinal design- what is the quasi-IV? when is it used most often? what does it avoid?
|
-the quasi-IV is TIME ITSELF
-used frequently by developmental psych researchers to study age-related changes
-avoids generational effects |
|
the goal of longitudinal design is to
|
uncover developmental changes that occur w/ age
|
|
3 most common limitations of longitudinal quasi-experimental designs
|
1) difficult to find willing subjects
2) difficult to accurately track participants
3) requires a great deal of time and money |
|
program evaluation
|
uses behavioral research methods to assess the effects of interventions (programs) designed to influence behavior
|
|
to infer causality we must meet 3 criteria
|
1) presumed causal variable preceded the effect in time
2) cause and effect covary
3) all other explanations eliminated via randomization/experimental control |
|
4 problems with designs that study 1 group before and after quasi-IV
|
1)history
2) maturation
3) regression to the mean
4) pretest sensitization |
|
2 problems with designs that compare 2 or more nonequivalent groups
|
1)selection bias- groups differed even before occurrence of quasi-IV
2) local history- extraneous event occurs in 1 group but not the other |