63 Cards in this Set

Research design typically has two categories:
1.    experimental design (now).
2.    correlational design (next).
True Experiments:
1.    state at least 1 hypothesis about IV’s effect on DV(s)
2.    have at least 2 levels of an IV
3.    assign participants to conditions randomly
4.    use specific control procedures for testing hypotheses
5.    use controls for internal validity threats
Variance…
Defined, as before, as SS/df…
Goals for experimental design:
--answer questions by testing causal hypotheses
--control variance to increase internal validity
--be confident that the variation between experimental conditions is due to the manipulated IVs
Variance…
Systematic between-groups variance
necessary for determining causal differences
comes in good and bad types (experimental variance + extraneous variance); both shift participant scores in one direction
Variance
We can be confident about our decision to reject the null hypothesis only if we’ve used proper control procedures,
e.g., automation, double-blind, deception, elimination, constancy, building a variable into the design
Non-systematic within-groups variance
due to random factors (e.g., participant fatigue, equipment fluctuation), individual differences
chance error affects Ss in both directions, so effects tend to cancel each other out (e.g., high and low energy participant effects tend to balance out), but this doesn’t mean that chance error should be ignored
F-Ratio and Variance
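How do the two kinds of variance (systematic between-groups and non-systematic within-groups) relate to each other? When we compare groups in a statistical analysis, we take the ratio of the two, which we call “F” (we’ll use this in ANOVA):
F = between-groups variance / within-groups variance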
We want “F” to be large, due to
maximizing between-groups variance,
controlling extraneous variance, and
minimizing error variance (use large N, matched Ss, reliable tests).
Variance…
Conceptually, “F” is broken into this ratio of variances:
(experimental + extraneous + error) / error
--Error is included in both numerator and denominator because it will always be present (G&R: “Suppose there are no systematic effects. In this case, both the numerator and denominator represent error variance only, and the ratio would be 1.00.” Misleading! F is often BELOW 1.00. Use “1.00” as a conceptual guideline.)
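As a rough illustration of the ratio above, the following Python sketch computes F for a one-way between-subjects layout, using the SS/df definition of variance. The function name `f_ratio` and all scores are hypothetical, made up for illustration only.

```python
from statistics import mean

def f_ratio(groups):
    """Return (MS_between, MS_within, F) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Between-groups SS: each group mean's squared distance from the grand mean,
    # weighted by the number of scores in that group.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups SS: each score's squared distance from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between   # variance = SS / df
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

# Hypothetical scores: one control group and two treatment groups.
control = [24, 26, 28]
treatment_1 = [28, 30, 32]
treatment_2 = [38, 40, 42]
ms_b, ms_w, f = f_ratio([control, treatment_1, treatment_2])
print(f"MS between = {ms_b:.2f}, MS within = {ms_w:.2f}, F = {f:.2f}")

# Two groups with identical means: no systematic effect, and F falls below 1.00.
print(f_ratio([[1, 5], [2, 4]])[2])
```

The second call shows why “1.00” is only a conceptual guideline: with no difference between the group means, the numerator is zero and F is well below 1.00.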
Experimental variance—MAXIMIZE!
use at least 2 IV levels (at least trtmt and ctrl)
use “manipulation checks” (self-report measures of manipulation given—e.g., did you use the strategy procedure we gave you?)
Extraneous variance—CONTROL!
1.    Equate trtmt & ctrl grps before the start of the experiment (i.e., completely equal)
2.    Random assignment (any differences have an equal chance of appearing across groups)
3.    Elimination & constancy (Elimination: remove a particular age group; Constancy: use only 20-yr-olds)
Extraneous variance—CONTROL! cont.
4.    Build the variable into the design
(e.g., if sex differences exist for spatial ability, use “sex” in the experimental design of a spatial study)

5.    Match participants across groups
(e.g., age, IQ, SES)

6.    Use participants as their own control (within-subjects or repeated measures design; use a dependent means t-test if two times of measurement are used)
Error variance---MINIMIZE!
--chance variation due to individual differences, random fluctuations in test instruments, etc.
1.    Use reliable test instruments
consistent tests will minimize error and have a better chance of measuring “something”
Error variance---MINIMIZE!
2.    Use matched-Ss or W-S designs
Ss serve as their own control; persons are more consistent within themselves than across others
Non-experimental designs
These designs DON’T have all the 5 characteristics of a true experiment [EPF -- 1GP --- 1GPP -- PPNC]
examples: ex post facto; 1-grp, post-test only; 1-grp, pre-test post-test; pre-test, post-test, natural control-grp design
Ex post facto -- “after the fact”
Group ---- Event ---- Measurement
Present observation is made
Past info not observed
Causal inferences difficult to make
Problem: No controls for confounding factors
e.g., therapy situations; Do abused persons become abusers? Maybe there is a link, maybe not.
1-group, post-test only
manipulation is made
Group ---- Trtmt ---- Post-test
e.g., Memory training given to a group (no pretest, no ctrl!) Why did scores look good/high?
Problems: Expectancy, Hawthorne Effect, placebo, history, maturation, regression to the mean?
1-group, pretest---posttest
Group ----Pretest ---- Trtmt ----Posttest
Compare Pre- and Post- scores
e.g., better situation for memory training experiment
Problems: maturation, regression to the mean, placebo, history
Pretest – posttest, natural control group
[PPNC]
Grp 1 ---Pretest --- Trtmt -------Posttest
Grp 2 ---Pretest ---no Trtmt ----Posttest
Compare posttest scores
Use naturally occurring groups
e.g., sororities, schools, organizations; no random assignment is used
Pretest – posttest, natural control group…
e.g. Differences in assertiveness due to training
Sorority 1 --- q-aire --- assert training --- Posttest
Sorority 2 --- q-aire --- no training ------- Posttest
Compare posttest scores
Pretest – posttest, natural control group…
Problem: How do we know that pre-existing differences weren’t present in the two groups? More social cohesiveness, more experience with training, etc.
Better to make random assignment within sororities to ensure group equivalence.
This way, even if we’re using pre-existing groups, we can control for other differences between them.
V. Experimental Designs
have the 5 characteristics of a true experiment [RPOC, RPPC, MLRB, Solomon]
A.   Randomized Posttest only Control Group design
R Grp 1 --- Trtmt ------ Posttest
R Grp 2 --- No trtmt --- Posttest
Experimental Designs
Compare posttest scores
Random assignment controls threats to internal validity
Random selection controls threats to external validity

Using control groups protects against regression to the mean, history, maturation, and instrumentation
Experimental Designs
B.   Randomized Pretest-posttest Control group design
R Grp 1 Pre T Post
R Grp 2 Pre NT Post
Compare posttest scores
Better than the natural control group design just mentioned: RANDOMIZATION occurs!
Better than the randomized posttest-only control group design: PRETEST occurs!
***Problems: Receiving the pretest affects Ss somehow; it sensitizes Ss to the study’s purposes (an “interaction effect” of pretest and treatment)
C.   Multilevel, completely randomized, between-subjects design
extension of earlier designs; 3 or more conditions, pretest not always included
e.g., Give different amounts of practice for solving a cognitive task:
0 practice trials for Group 1
5 trials for Group 2
10 trials for Group 3, etc.
Controls for validity threats via random assignment
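A minimal sketch of random assignment for the practice-trials example above, assuming 12 participants and three conditions. The helper `randomly_assign`, the participant labels, and the seed are hypothetical, chosen only for illustration.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle the participants, then deal them into the conditions in turn."""
    rng = random.Random(seed)           # a seed makes the assignment repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        # Dealing in turn keeps the group sizes as equal as possible.
        groups[conditions[i % len(conditions)]].append(person)
    return groups

participants = [f"P{i:02d}" for i in range(1, 13)]    # hypothetical IDs
conditions = ["0 trials", "5 trials", "10 trials"]
groups = randomly_assign(participants, conditions, seed=306)
for condition, members in groups.items():
    print(condition, members)
```

Because the shuffle decides who lands where, any pre-existing difference between participants has an equal chance of ending up in each group.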
D.   Solomon’s 4-group design
controls for interaction of pretest and trtmt effect
very extensive, well-developed
most imp. comparison is between Grp 1 & 2’s posttests
R Grp 1   Pre   T    Post
R Grp 2   Pre   --   Post   (history & maturation)
R Grp 3   --    T    Post   (interaction; vs. Grp 1)
R Grp 4   --    --   Post   (maturation)
e.g., attitudes towards tobacco use; T = videotape
Conditions where exp. designs with a single variable are not appropriate
A.   When randomization is NOT possible
e.g., gender of participants
Conditions where exp. designs with a single variable are not appropriate
B.   Variables rarely affect behavior singly; mostly more than 1 variable affects behavior simultaneously

e.g., speed, attention, memory, motivation, strategy use affect cognitive performance
Conditions where exp. designs with a single variable are not appropriate
C.   Variables may INTERACT with each other
e.g., The difference between the levels of factor 2 at one level of factor 1 may not be the same as that difference at the other level of factor 1. (sound familiar?)
hence, the need for factorial designs…
Experimental research design distinctions
A.   Dependence
1.    Independent groups = Between-subjects designs
(Different persons are in each group)
2.    Correlated groups = Within-subjects designs (Same or matched persons are in each group)
B.   Single/multiple IVs
1.    Single variable designs = univariate (1 IV)
2.    Factorial designs = multivariable (2 or more IV’s) e.g., 2 x 2 factorial design
Single-Variable, Correlated Groups Designs
Correlated groups designs do not use free random assignment to conditions
One type of correlated groups design is the within-subjects, or repeated measures, design
Another type is called matched-subjects designs
An extension of the within-subjects design is a single-subject experimental design; we’ll discuss 3 types of these shortly
Within-subjects designs
Within-subjects designs involve the repeated measurement of the same person
Each person is given all levels of the IV, like this Working Memory task example, where each person is given 3, 4, 5, 6, and 7 words to retain...
Within-subjects designs…
The analysis involves comparing differences between the correlated groups. For this example, one would conduct a dependent means t-test on the number of items recalled when comparing two set sizes (comparing more than two levels at once calls for a repeated measures ANOVA).
To be sure the IV affects the DV, one should counterbalance the order of exposure to conditions.
This protects against sequencing effects such as practice and carryover
Within-subjects designs…
Strengths:
No pre-existing differences between groups, since the groups are the same people; this helps protect internal validity
W-S design eliminates error variance due to individual differences
Saves time; bring in only 1 set of participants
Need fewer N
Weaknesses:
Sequence effects can be problematic; Permanent changes are not reversible (e.g., surgery, knowledge/attitude change)
Two types of sequence effects:
Practice effects; can create positive, enhanced change, or create negative change, such as fatigue; not necessarily due to a particular order of experiences
Carry-over effects; a previous trial influences the next, because of the nature of the previous trial; effects may not be “even” for all orders of presentation
Controls for sequence effects:
Present conditions randomly to participants, to “even out” the confounds across conditions
Systematically present the conditions by complete counterbalancing, so that every possible order of the conditions is used
Latin-square designs; this presentation order involves partially counterbalancing the conditions such that each condition appears only once in a row, and once in a column.
Other controls include:
- holding a variable constant (by training to a criterion before the experiment begins)
- allowing a break to control fatigue
- skipping the W-S design and using a Between-Subjects design instead
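To make the two systematic controls concrete, here is a small sketch that lists the orders for complete counterbalancing and builds one simple Latin square by rotation. The function names and the condition labels A to D are hypothetical; the rotation method is just one easy way to get a Latin square, not necessarily the one used in class.

```python
from itertools import permutations

def complete_counterbalance(conditions):
    """Every possible order of the conditions (k! orders for k conditions)."""
    return [list(order) for order in permutations(conditions)]

def cyclic_latin_square(conditions):
    """k orders built by rotation: each condition occurs once per row and column."""
    k = len(conditions)
    return [[conditions[(row + col) % k] for col in range(k)] for row in range(k)]

conditions = ["A", "B", "C", "D"]       # hypothetical condition labels
all_orders = complete_counterbalance(conditions)
square = cyclic_latin_square(conditions)

print(len(all_orders), "orders for complete counterbalancing")
for row in square:                      # one row = one presentation order
    print(" ".join(row))
```

With 4 conditions, complete counterbalancing needs 24 orders while the Latin square needs only 4. Note that in a rotated square each condition is always followed by the same neighbor, so it does not balance carry-over effects between specific pairs of conditions.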
Matched-Subjects Designs
Matched-subjects designs use different persons in each condition, but these persons are matched on characteristic(s) that matter for the study’s goals
E.g., age, visual acuity, SES, GPA, gender
Matched-Subjects Designs…
Participants each get only one level of the factor
Use a dependent means t-test if only 2 IV levels are used
Use a repeated measures ANOVA if more than 2 IV levels are used
With either analytic procedure, keep track of who is linked to whom!
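A minimal sketch of the dependent means t-test for two matched groups, showing why the links between partners matter: the test works on the difference within each pair. The function name and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def dependent_means_t(scores_a, scores_b):
    """t for paired scores; position i in both lists must be the same matched pair."""
    if len(scores_a) != len(scores_b):
        raise ValueError("every score needs a matched partner")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # t = mean difference / standard error of the mean difference
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1                     # df = number of pairs - 1

# Hypothetical recall scores; index i in each list is one matched pair.
trained = [14, 12, 15, 11, 13]
untrained = [11, 10, 14, 9, 11]
t, df = dependent_means_t(trained, untrained)
print(f"t({df}) = {t:.2f}")
```

If the pairing is scrambled, the difference scores change and so does t, which is the practical reason to keep track of who is linked to whom.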
Matched-Subjects Designs…
Strengths include:
the ability to use a smaller N at a given statistical power level, relative to a Between-Subjects design
no sequence effects
[a smaller N is required, but matching usually requires over-sampling!]
Matched-Subjects Designs…
Weaknesses
Greater effort for the researcher to find an effective matching characteristic, then find the right matches for each participant
Usually requires over-sampling
Single-subject designs
Experimental (an IV is manipulated), used mainly in clinical settings; good for treatment evaluation
-We use these when we want to measure change within a person
-And, to obtain information otherwise lost in group comparisons (e.g., when the average effect cancels out even though some individuals increase)
Single-subject design Types
Types [ABAR -- MB -- SSRTS]
ABA + [reversal]
Multiple baseline
Single-subject, randomized, time-series designs
Single-subject designs
A. ABA + [reversal]
Baseline, Treatment, No treatment, Treatment
*Not good if detrimental to health to stop suddenly.
Single-subject designs
--Multiple baseline
Use if we don’t want to return to a negative baseline (danger, injury, ethics)
Use treatment on different behaviors successively
E.g., Teacher wants to see child do better in math and reading and have less disruptive behavior. Start only reading help, then reading and math help.
Single-subject, randomized, time-series design
Randomly pick the point at which a reinforcement or change is given to the subject
If change occurs, it’s unlikely due to chance, maturation, or history. *Randomly start stimulus change in 6th trial, 2nd, etc.
Introduction to ANOVA
You must use it if you want to compare 3 or more means (rather than running multiple t-tests)
Types of ANOVA
1. Between-subjects ANOVA (like what you’ll be doing in lab)
2. Within-subjects ANOVA
Or repeated measures ANOVA, where we want to know how an individual changes over time, or across conditions.
e.g., give an “Attitude Towards Pres. Bush” scale (ATPBS), at 3 time points
How are the (between) groups defined?
Groups are defined by factors—those independent variables we manipulated (like amount of reward given), or are known to differ (like gender)
We require you to manipulate both factors in your final project, although in practice outside of 306, one might decide to use pre-existing variables as factors, like gender, sorority, school, car ownership, educational level, etc.
A general way to describe these designs is as follows:
Three-way ANOVA---three factors, etc. (likewise, one-way = one factor, two-way = two factors)
A more specific way to describe these designs is as follows:
[Design 'way' defined by number of numbers]
2 x 2 ANOVA -> a two-way design, each factor having 2 levels
3 x 3 x 4 ANOVA -> a three-way design; first & second factors have 3 levels, third has 4 levels
2 x 2 x 3 x 3 ANOVA -> a four-way design; first & second factors have 2 levels, third & fourth factors have 3 levels
An even more specific way to describe these designs is as follows:
One-way between-subjects ANOVA, manipulating 4 levels of stimulus brightness
2 x 2 between-subjects ANOVA, manipulating reward (low, high) and punishment (no, yes)
3 x 3 x 4 between-subjects ANOVA, manipulating reward (low, medium, high), punishment (low, medium, high), and stimulus brightness (very dim, dim, somewhat bright, very bright)
And, a 2 x 2 x 3 x 3 (why do this kind of design??)…
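The “number of numbers” rule can be sketched in a few lines of Python. The helper `describe_design` is hypothetical; “cells” here means the number of distinct conditions, which is the product of the levels.

```python
def describe_design(notation):
    """Turn a string like '3 x 3 x 4' into its ways, levels, and cell count."""
    levels = [int(part) for part in notation.lower().split("x")]
    cells = 1
    for n in levels:
        cells *= n                      # every combination of levels is one cell
    return {"ways": len(levels), "levels": levels, "cells": cells}

for design in ["2 x 2", "3 x 3 x 4", "2 x 2 x 3 x 3"]:
    info = describe_design(design)
    print(f"{design}: {info['ways']}-way design, {info['cells']} cells")
```

In a between-subjects design every cell needs its own group of participants, so the cell count gives one practical answer to the “why do this kind of design??” question: both the 3 x 3 x 4 and the 2 x 2 x 3 x 3 designs already require 36 separate groups.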
Determine the characteristics of the comparison distribution…
F varies with df for the numerator and denominator
F is positively skewed, since the distribution is a distribution of variance ratios…variances can only be positive
comparison distribution cont. (deviations: X – GM is total, X – M is within groups, M – GM is between groups)
All F-tests are one-tailed, but with an omnibus F-test, we don’t express “direction” of expected differences (even if we have one!). The F-test doesn’t indicate which means differ other than the largest and smallest means.
F-tests can only be positive
The “direction” of expected effects must be examined at the multiple comparison step.
Multiple Comparisons
When the omnibus F-test (one-way ANOVA F) is statistically significant, we may want to know which means are statistically different; so far, we know only that at least two means differ.
We can do multiple comparisons, in which several pairs of means are tested
Multiple Comparisons
So, power-wise, it’s best to be able to conduct planned comparisons
Planned, ‘a priori’ comparisons [B&F]
- e.g., Bonferroni
- e.g., Fisher’s LSD
- Easier to achieve a statistically significant result because they’re planned tests
Unplanned, ‘post-hoc’ tests [S&T]
- e.g., Scheffe F-tests (*most conservative)
- e.g., Tukey (*next most conservative)
- the alpha value is cut into many pieces, so it’s more difficult to get a statistically significant result; e.g., alpha of .05 / 5 tests = .01
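The alpha-splitting arithmetic above can be sketched as follows. Both helper names are hypothetical, and the even split shown is the simple divide-by-the-number-of-tests rule from the example, not the exact adjustment every post-hoc test uses.

```python
def pairwise_comparisons(k):
    """Number of distinct pairs that can be formed from k group means."""
    return k * (k - 1) // 2

def split_alpha(alpha, n_tests):
    """Per-test alpha when the overall alpha is divided evenly across tests."""
    return alpha / n_tests

# The example from the notes: alpha of .05 spread over 5 tests.
print(split_alpha(0.05, 5))

# How quickly the per-test alpha shrinks if every pair of means is tested.
for k in (3, 4, 5):
    m = pairwise_comparisons(k)
    print(f"{k} means -> {m} pairs -> per-test alpha {split_alpha(0.05, m):.4f}")
```

With 5 means there are already 10 possible pairs, which is why being selective about comparisons helps power.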
Multiple Comparisons
With Fisher’s test, one can make up to 3 planned comparisons without needing to adjust alpha (no matter how many means you have available).
Therefore, be selective about which means you want to test specifically.
Multiple Comparisons
“As hypothesized, the results of Fisher’s LSD tests showed that both verbal (Mv = 30.00, SDv = 2.00) and verbal-pictorial (Mv&p = 40.00, SDv&p = 2.00) teaching methods were superior to no teaching method (Mzip = 26.00, SDzip = 2.00), F(1,6) = 6.00, p < .05, and F(1,6) = 73.50, p < .01, respectively. In addition, the verbal-pictorial method was found to be statistically significantly better than the verbal method alone, F(1,6) = 37.50, p < .01.”
Multiple Comparisons
With Fisher’s LSD test, we compare 2 means at a time, but the pooled variance is MSW, the variance estimate from all groups

This pooled variance is more stable and powerful than the pooled variance from t-tests
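The F values in the teaching-method write-up can be checked by hand. This is a minimal sketch, assuming equal groups of n = 3 (inferred from df = 6 with three groups; the notes do not state n) and MSW = 4.00 (every group SD is 2.00). The function name `lsd_f` is hypothetical.

```python
def lsd_f(mean_1, mean_2, n_1, n_2, ms_within):
    """F(1, df_within) for one pair of means, using the pooled MS within."""
    return (mean_1 - mean_2) ** 2 / (ms_within * (1 / n_1 + 1 / n_2))

n = 3                    # assumption: df_within = 6 with 3 groups implies 3 per group
ms_within = 2.00 ** 2    # all three SDs are 2.00, so the pooled variance is 4.00
means = {"none": 26.00, "verbal": 30.00, "verbal-pictorial": 40.00}

pairs = [
    ("verbal", "none"),
    ("verbal-pictorial", "none"),
    ("verbal-pictorial", "verbal"),
]
results = {}
for a, b in pairs:
    results[(a, b)] = lsd_f(means[a], means[b], n, n, ms_within)
    print(f"{a} vs. {b}: F(1, 6) = {results[(a, b)]:.2f}")
```

Under these assumptions the three comparisons come out to 6.00, 73.50, and 37.50, matching the reported values. Each comparison uses only two means, but the same MSW from all three groups.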
Tukey – conservative, and good if # of groups is large.

Scheffe – more conservative; F critical is multiplied by (k-1).

Fisher’s LSD – pretty liberal.
Bonferroni – conservative; adjusts alpha for the number of comparisons. Not good to use if # of groups is large.
Sidak – slightly less conservative than Bonferroni.
***Bad that SPSS gives so little in the table; you know nothing about the alpha adjustment, adjusted F, or the computation of the ratio.
Effect Size = f
Cohen’s Effect Size conventions are a bit different for ANOVA:
f
Small .10
Medium .25
Large .40
Effect Size and Power
Effect size = R^2 = Proportion of Variance Explained
This effect size calculation must be used when n is unequal;
It can be used for equal n cases, too
R^2 = SS Btwn / SS T
R^2 Effect Sizes
Small = .01
Medium = .06
Large = .14
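The two sets of conventions line up with each other. A short sketch, using the R^2 formula from the notes plus the standard conversion f = sqrt(R^2 / (1 - R^2)); that conversion is not on the cards and is added here as an assumption. The function names and the SS values are hypothetical.

```python
from math import sqrt

def r_squared(ss_between, ss_total):
    """Proportion of variance explained: SS between / SS total."""
    return ss_between / ss_total

def cohens_f(r2):
    """Cohen's f from R^2, using f = sqrt(R^2 / (1 - R^2))."""
    return sqrt(r2 / (1 - r2))

# The R^2 conventions convert to roughly the f conventions.
for label, r2 in [("small", 0.01), ("medium", 0.06), ("large", 0.14)]:
    print(f"{label}: R^2 = {r2:.2f}, f = {cohens_f(r2):.2f}")

# Hypothetical sums of squares: SS between = 312, SS within = 24.
r2 = r_squared(312, 312 + 24)
print(f"R^2 = {r2:.3f}")
```

So the small, medium, and large R^2 values of .01, .06, and .14 correspond to f values of about .10, .25, and .40.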
Too liberal.
% of the variance in the DV can be explained by the IV;