141 Cards in this Set
- Front
- Back
basic stats lecture
|
..
|
|
population
|
The entire set of things of interest
|
|
parameter
|
property descriptive of the population
ex) pop mean |
|
sample
|
part of the population
typically this provides the data we will look at |
|
estimate
|
property of a sample
ex) sample mean |
|
descriptive stats
|
Summarize/describe the properties of samples (or populations when they are completely known)
|
|
inferential stats
|
Draw conclusions/make inferences about the properties of populations from sample data
|
|
variable
|
-varies
-condition or characteristic that can have different values |
|
types of variables (6)
|
can classify as:nominal, ordinal, interval, ratio
or classify as: dependent, independent |
|
categorical (discrete/qualitative) variables (2)
|
nominal and ordinal
|
|
numerical (cont/quantitative) variables (2)
|
interval
ratio |
|
dependent variables (Y)
|
outcome/response
predicted variables
continuous (normally distributed) |
|
independent variables (X)
|
factors in experiments
predictors/covariates
categorical/continuous |
|
(3) measures of central tendency
|
mean
median
mode |
|
mean is affected by ___values
|
extreme/outliers
|
|
is median affected by outliers
|
no
|
|
is mode affected by outliers
|
no
|
|
can you have no modes
|
yes
|
|
can you have multiple modes
|
yes
|
|
mode used for ___or___data
|
numerical
categorical |
|
(3) measures of variation
|
range
variance
SD |
|
measures of variation give info about __or___
|
spread or variability
|
|
range
|
simplest measure of dispersion
largest value-smallest |
|
average of squared deviations of values from the mean
|
variance
|
|
most commonly used measure of variation
|
SD
|
|
SD shows
|
variation about the mean
has same units as original data |
|
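The measures of central tendency and variation on the cards above can be checked with a short sketch using Python's standard `statistics` module. The data values are hypothetical, chosen so that one value (30) acts as an outlier. Note that the card definition of variance ("average of squared deviations") matches the population form that divides by N, while the sample form divides by N - 1; both are shown.

```python
import statistics

# Hypothetical sample; the value 30 acts as an outlier.
data = [2, 4, 4, 5, 7, 30]

mean = statistics.mean(data)          # pulled upward by the outlier
median = statistics.median(data)      # middle value, barely affected
modes = statistics.multimode(data)    # list of the most frequent value(s)

value_range = max(data) - min(data)   # largest value - smallest value
pop_variance = statistics.pvariance(data)    # divides by N
sample_variance = statistics.variance(data)  # divides by N - 1
sd = statistics.stdev(data)           # same units as the original data

print("mean:", mean, "median:", median, "modes:", modes)
print("range:", value_range, "variance:", sample_variance, "SD:", sd)
```

Because the mean is larger than the median here, this sample is right (positively) skewed.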
shape of distribution
|
describes how data distributed
symmetric or skewed |
|
left skewed (negatively skewed)
|
median>mean
|
|
right skewed (positively skewed)
|
mean > median
|
|
in most experiments Y(dependent variable) assumed to be continuous and
|
normally distributed
|
|
normal distribution
|
-mean=median=mode
-mean (μ) and SD (σ) are sufficient to describe a normal dist
-interval: μ ± 1σ contains 68% of the values
-interval: μ ± 2σ contains 95% of the values
-interval: μ ± 3σ contains 99.7% of the values |
|
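The 68 / 95 / 99.7 percentages for the normal distribution can be reproduced with the standard library's `NormalDist`. This is a minimal sketch; the helper name `coverage` is made up for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def coverage(k):
    """Probability that a normal value lies within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {coverage(k):.4f}")
```

The printed values round to roughly 68%, 95%, and 99.7%. The exact 95% interval uses ±1.96σ rather than ±2σ, which is where the z = ±1.96 cutoff comes from.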
session 2
hypothesis testing: comparing one/two means |
..
|
|
hypothesis=
|
Answer to a research question or assumption made about a population parameter
|
|
step 1 for hypothesis testing
|
make a null hypothesis (Ho): no effect
make alternative hypothesis (H1): some effect |
|
step 2 for hypothesis testing
|
choose alpha (sig level)
use .05 (or .01 if told otherwise) |
|
cutoff sample score for α is called the
|
critical value
|
|
step 3 for hypothesis testing
|
look at the data and compute test statistic:
-if you have one mean and σ is known use z
-if you have one mean and σ is UNknown use t
-if you have 2 means use t
-more than two means use ANOVA |
|
step 4 for hypothesis testing
|
reject or do not reject the null:
-Compare the calculated value of your test statistic to the (tabled) critical value for α. If your value is greater than the critical value, reject Ho; otherwise fail to reject Ho (we never "accept" Ho)
-OR look at the p value of the test statistic, and if p < .05 reject Ho |
|
if Ho is rejected you can conclude that
|
there is a statistically significant effect in
the population |
|
does a significant effect indicate the effect is important or meaningful
|
no
need to calculate effect size to do so |
|
Pearson’s correlation coefficient (r) (effect size correlation)
Omega (ω)
Cohen’s d |
Commonly used measures of effect size
|
|
r = ___ (small effect)
r = ___ (medium effect)
r = ___ (large effect) |
.10
.30
.50 |
|
z-test purpose
|
test whether a sample mean significantly differs from a population mean (μ).
|
|
Prior Requirements/Assumptions for z-test (3)
|
The population is normally distributed.
The mean (μ) and standard deviation (σ) of the population must be known.
The sample must be a simple random sample of the population. |
|
Technically, the z-test for a single mean is equivalent to calculating the ____of your sample mean
|
z score
|
|
can convert sample score to standard score (z) which follows___distribution
|
standard normal (μ=0, σ=1)
|
|
purpose of the z statistic is to
|
transform any normal distribution to the standard normal distribution
*shape does NOT change just units |
|
z=+/-1.96
|
alpha=.05
|
|
z=1.96
z=-1.96 |
2.5%
2.5% |
|
Hypothesis testing about a single mean with Z-test
|
-if the absolute value of the z-score is larger than 1.96, you may reject the null at the .05 significance level
|
|
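A minimal sketch of the one-sample z-test described above, assuming σ is known. The sample scores and the population values (μ = 100, σ = 15) are hypothetical. The z statistic is the z score of the sample mean, using the standard error σ/√n.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_test(sample, mu, sigma):
    """Return (z, two-sided p value) for a one-sample z-test with known sigma."""
    n = len(sample)
    z = (mean(sample) - mu) / (sigma / sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical scores from a simple random sample
sample = [108, 112, 104, 110, 116, 109, 111, 113, 107, 110]
z, p = z_test(sample, mu=100, sigma=15)

reject = abs(z) > 1.96  # critical value for alpha = .05, two-tailed
print(f"z = {z:.3f}, p = {p:.3f}, reject Ho: {reject}")
```

Both decision rules agree: |z| exceeds 1.96 exactly when the two-sided p value is below .05.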
limitations of z test
alternative? |
-Knowing the true value of the standard deviation (σ) of a population is unrealistic (unless entire pop. known)
-t-test |
|
purpose of t-test
|
to test whether a sample mean significantly differs from a population mean (μ).
|
|
Prior Requirements/Assumptions for t-test
|
The population is normally distributed.
The mean (μ) of the population must be known.
The sample must be a simple random sample of the population. |
|
t-statistic is obtained by replacing σ with s (sample SD) this replacement causes what
|
t-stat no longer follows standard normal dist but now follows a t-dist
|
|
t-distribution varies in shape according to
|
-degrees of freedom (DF= N-1)
*as DF increases t-dist approaches standard normal dist (DF=30 is basically standard normal dist) |
|
t approaches z as __increases
|
N
|
|
If your calculated t-value in absolute value is ___ than the critical value from the table, you may reject the null hypothesis with the significance level of .05
OR if p-value/sig level of t-value is____than .05 |
larger
less |
|
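The same comparison with σ unknown can be sketched as a one-sample t-test: σ is replaced by the sample SD s, and the result is compared to a tabled t critical value. The data and μ = 100 are hypothetical; the critical value 2.262 is the two-tailed .05 entry for DF = 9 from a standard t table, hard-coded here because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu):
    """Return (t, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    s = stdev(sample)                      # sample SD (divides by N - 1)
    t = (mean(sample) - mu) / (s / sqrt(n))
    return t, n - 1

# Hypothetical scores from a simple random sample
sample = [108, 112, 104, 110, 116, 109, 111, 113, 107, 110]
t, df = one_sample_t(sample, mu=100)

T_CRIT_DF9 = 2.262  # tabled two-tailed critical value, alpha = .05, DF = 9
reject = abs(t) > T_CRIT_DF9
print(f"t({df}) = {t:.3f}, reject Ho: {reject}")
```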
purpose of t-test for 2 means
2 samples may be either __or___ |
-test whether two unknown population means (μ1 and μ2) are different from each other based on their samples
-independent or correlated |
|
hypothesis for t-test with 2 means
|
Ho: μ1 = μ2
H1: μ1 ≠ μ2 |
|
t-test for two independent samples
Prior Requirements/Assumptions (3) |
Both populations are normally distributed.
The standard deviations (σ1 and σ2) of the populations are the same: homogeneity of variance (σ1 = σ2).
Each sample must be a simple random sample of the population. |
|
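For two independent samples, the homogeneity-of-variance assumption is what justifies pooling the two sample variances. The sketch below assumes the usual pooled-variance form of the t statistic, which the cards do not spell out. The group data are hypothetical, and 2.306 is the tabled two-tailed .05 critical value for DF = 8.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Return (t, DF) for an independent-samples t-test with pooled variance."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Weighted average of the two sample variances
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df

group1 = [4, 5, 6, 5, 5]  # hypothetical scores, condition 1
group2 = [7, 8, 9, 8, 8]  # hypothetical scores, condition 2
t, df = pooled_t(group1, group2)

T_CRIT_DF8 = 2.306  # tabled two-tailed critical value, alpha = .05, DF = 8
reject = abs(t) > T_CRIT_DF8
print(f"t({df}) = {t:.3f}, reject Ho (mu1 = mu2): {reject}")
```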
limitation of t-test
alternative? |
-only used for hypothesis testing one or 2 means
-ANOVA |
|
one-way ANOVA
|
...
|
|
factor=
|
Independent variable
|
|
different values or categories of the independent variable/factor=
|
levels
|
|
what experiment involves a single IV with two or more levels
|
single factor
|
|
(2) single factor experiments
difference between them |
One-way independent-groups design--> each group has different subject
One-way design with repeated measurements--> each group has same subjects |
|
factorial experiments
|
Involve more than one independent variable with two or more levels.
|
|
Two-way independent-groups designs:
if experiment has 2 factors with 2 levels each= |
2x2 experiment
|
|
purpose of one way anova
|
to test whether the means of k (≥ 2) populations significantly differ
|
|
hypothesis of one way anova
|
Ho : μ1 = μ2 · · ·= μk
H1 : Not all μ’s are the same (at least one of the means is different) |
|
one way anova
Prior Requirements/Assumptions (3) |
The population distribution of the dependent variable is normal within each group.
The variances of the population distributions are equal (homogeneity of variance).
Independence of observations. |
|
2 sources of variance in anova
|
Vb
Vw |
|
variance due to different treatments/levels of a factor across groups=
|
variance between groups (Vb)
|
|
random fluctuations of subjects within each group=
|
variance within groups (Vw)
|
|
F statistic/ratio
|
-Vb/Vw
-When Ho is true, this ratio is expected to be close to 1
-When H1 is true, this ratio is expected to be greater than 1 |
|
F stat follows___dist
this dist varies in shape according to (2) |
F
DF (b) and DF(w) |
|
the F dist is a _____skewed distribution used most commonly in ANOVA
|
right
|
|
if your calculated F value is greater than the critical value you may reject the null at .05 OR look at p value of calculated F value and if ___than .05 reject the null
|
less
|
|
With only two groups, either a _____ or an F test can be used for testing the significance of the difference between means.
|
t test
*both lead to the same conclusions
In fact, when k = 2, t = ±√F |
|
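The relation t = ±√F for k = 2 can be checked numerically. This sketch computes a pooled-variance t and a one-way ANOVA F on the same two hypothetical groups, where F is the between-group variance divided by the within-group variance.

```python
from math import sqrt
from statistics import mean, variance

g1 = [1, 2, 3]  # hypothetical group 1
g2 = [5, 6, 7]  # hypothetical group 2
n1, n2 = len(g1), len(g2)
df_w = n1 + n2 - 2

# Pooled-variance t statistic
sp2 = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / df_w
t = (mean(g1) - mean(g2)) / sqrt(sp2 * (1 / n1 + 1 / n2))

# One-way ANOVA F on the same two groups: Vb / Vw
grand = mean(g1 + g2)
ss_b = n1 * (mean(g1) - grand) ** 2 + n2 * (mean(g2) - grand) ** 2
ss_w = sum((x - mean(g1)) ** 2 for x in g1) + sum((x - mean(g2)) ** 2 for x in g2)
f = (ss_b / 1) / (ss_w / df_w)  # DF(B) = k - 1 = 1

print(f"t = {t:.3f}, t squared = {t ** 2:.3f}, F = {f:.3f}")
```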
in ANOVA
V= |
SS(sum of squares)/DF
|
|
Total SS=
|
variation
SS(T)= SS(B)+SS(W) |
|
Vt, Vb and Vw are often called the total, between group, and within-group Mean Squares, abbreviated
by |
MS(T), MS(B) and MS(W)
|
|
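Putting the last few cards together, a one-way independent-groups ANOVA can be sketched by hand: partition SS(T) into SS(B) and SS(W), divide each by its DF to get the mean squares, and form F = MS(B)/MS(W). The data are hypothetical, and 5.14 is the tabled .05 critical value of F for DF(B) = 2 and DF(W) = 6.

```python
from statistics import mean

def one_way_anova(groups):
    """Return sums of squares, mean squares, and F for independent groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n_total = len(groups), len(all_values)

    ss_b = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_w = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ss_t = sum((x - grand_mean) ** 2 for x in all_values)

    ms_b = ss_b / (k - 1)        # V = SS / DF, with DF(B) = k - 1
    ms_w = ss_w / (n_total - k)  # DF(W) = N - k
    return {"SS(B)": ss_b, "SS(W)": ss_w, "SS(T)": ss_t,
            "MS(B)": ms_b, "MS(W)": ms_w, "F": ms_b / ms_w}

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical data, 3 levels
result = one_way_anova(groups)

F_CRIT_2_6 = 5.14  # tabled critical value, alpha = .05, DF = (2, 6)
reject = result["F"] > F_CRIT_2_6
print(result, "reject Ho:", reject)
```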
The (overall) ANOVA test doesn’t tell which means are different so we perform _____test
|
post hoc comparison
|
|
____test gives a global effect of the independent variable (factor) on the dependent variable (omnibus or overall test)
|
F-test
|
|
Post hoc (a posteriori/unplanned) comparisons
done _____the experiment used if |
after
3+means compared |
|
2 post hoc tests
|
Scheffé
Tukey HSD |
|
Scheffé test
|
use when groups have different sizes
most conservative test (unlikely to reject Ho) |
|
Tukey's HSD test
|
used if groups have equal sizes
|
|
The minimum absolute difference
between two means required for a significant difference. |
HSD
|
|
Tukey-Kramer test
|
when sample sizes are unequal
|
|
For Tukey:
the observed Q is compared against the critical value of Q (CQ) for .05; reject Ho when ___ OR ___ |
-the observed Q value is greater than the critical value
-the absolute mean difference is greater than HSD |
|
steps for Tukey HSD
|
step 1: ANOVA
step 2: calculate differences in means
step 3: find CQ
step 4: calculate HSD
*when comparing HSD to mean differences, use the absolute values of the mean differences
reject the null if HSD is less than the mean difference between 2 groups |
|
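The Tukey HSD steps can be sketched as follows for equal group sizes. The formula HSD = CQ × √(MS(W)/n) is the standard one and is an assumption here, since the cards do not state it. The group means and MS(W) are hypothetical, and CQ = 4.34 is the tabled studentized-range value for α = .05, 3 groups, and DF(W) = 6.

```python
from itertools import combinations
from math import sqrt

# Step 1 (omnibus ANOVA) is assumed done; these are its hypothetical results.
group_means = {"A": 2.0, "B": 3.0, "C": 6.0}
ms_w = 1.0       # MS(W) from the ANOVA
n_per_group = 3  # equal group sizes

cq = 4.34        # step 3: tabled critical Q (alpha = .05, k = 3, DF(W) = 6)
hsd = cq * sqrt(ms_w / n_per_group)  # step 4: minimum significant difference

# Step 2 and decision: compare each absolute mean difference with HSD
significant = {}
for a, b in combinations(group_means, 2):
    diff = abs(group_means[a] - group_means[b])
    significant[(a, b)] = diff > hsd
    print(f"{a} vs {b}: |diff| = {diff:.2f}, HSD = {hsd:.2f}, "
          f"significant: {significant[(a, b)]}")
```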
(4) ways to assess normality
|
-look at descriptive stats ie skewness
-construct charts/histograms/normal quantile plot -K-S test -Shapiro-Wilk test |
|
limitation of normality tests
|
With a large sample, these tests very easily give significant results, even for small departures from normality
|
|
(2) ways to assess homogeneity of variance
|
Fmax test of Hartley
Levene's test |
|
Serious violation of homogeneity of variance tends to
inflate the observed value of the _____ |
F statistic, i.e., too many rejections of Ho (high Type I error)
|
|
Fmax test of Hartley (3 steps)
|
1) Calculate the sample variance for each group, and find the largest and smallest variances
2) Fmax = max V / min V
3) The observed Fmax value is compared against a critical value of this statistic (if the observed Fmax value exceeds the critical value, we may reject the null hypothesis that the variances are identical across groups) |
|
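Steps 1 and 2 of Hartley's Fmax test are easy to sketch in code. The groups below are hypothetical. The critical value in step 3 has to come from an Fmax table and is left out rather than guessed.

```python
from statistics import variance

def f_max(groups):
    """Ratio of the largest to the smallest sample variance across groups."""
    variances = [variance(g) for g in groups]  # step 1
    return max(variances) / min(variances)     # step 2

groups = [[1, 2, 3], [2, 4, 6], [5, 6, 7]]  # hypothetical data
observed = f_max(groups)
print("observed Fmax:", observed)
# Step 3: compare `observed` with the tabled critical Fmax for the
# number of groups and the per-group DF.
```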
Tests the null hypothesis that the population
variances are equal |
Levene’s test
|
|
If Levene’s test is significant (p < .05), then we may conclude that the variances are
|
significantly different
|
|
The F-test is robust against violation of the homogeneity of variance assumption when samples are
|
of equal size
|
|
knowing the value of one observation gives no clue as to that of other observation=
|
Independent observations
|
|
the most crucial assumption underlying the F test
|
independence of observations
|
|
experiments should be carefully designed to avoid non-independent observations by using random sampling and random assignment, why?
|
-no easy way to fix F test when assumption of independence violated
-no easy test for non-independence |
|
if 3 assumptions for ANOVA not met a _____may be useful
|
data transformation
|
|
data transformation makes data less____
makes heterogeneous variances more___ |
skewed
homogeneous |
|
If data transformation doesn’t work, consider ____tests, i.e. Kruskal-Wallis ANOVA.
|
nonparametric
|
|
session 5 two way anova
|
...
|
|
in Two Way factorial experiments assume (2)
|
-subjects serve in only one of the treatment conditions (independent-groups design)
-sample sizes are equal in each condition (balanced design) |
|
Two Way factorial experiments
|
-2 IV's (Row, Column)
-more than 2 means |
|
Two Way factorial experiments: when each factor has 2 levels we call it a _________design
|
2 x 2 factorial design
|
|
two-way factorial experiment contains information about (2)
|
two main effects
interaction effect |
|
main effects
|
-effect of one factor when the other factor is ignored (by averaging the means over all levels of the other factor) ie) 2 IV's effect on DV
-differences among marginal means for a factor |
|
interaction effect
|
-extent which the effect of one factor depends on the level of the other factor ie) interaction between different levels of the 2 IV's
|
|
An ____is present when the effects of one factor on DV change at the different levels of the other factor
|
interaction
|
|
presence of an interaction indicates that the main effects
|
alone do not fully describe the outcome of a factorial experiment
|
|
Two way ANOVA
Prior requirements/assumptions (3) |
The distribution of observations on the dependent variable is normal within each group (normality).
The variances of observations are equal (homogeneity of variance).
Independence of observations. |
|
hypothesis for main effects
|
-row main effect
Ho: μR1 = μR2 = · · · = μRr (equal row marginal means)
H1R: Not all μR are the same
-column main effect
Ho: μC1 = μC2 = · · · = μCc (equal column marginal means)
H1C: Not all μC are the same |
|
hypothesis for interaction effect
|
-HoRC: The interaction between R and C is equal to zero (e.g., RC = 0)
-H1RC: The interaction between R and C is not zero (e.g., RC ≠ 0) |
|
in Two Way ANOVA you partition SS(B) into (3)
|
SS(R)=variation between row means
SS(C)=variation between column means
SS(RC)=variation between cell means not explained by the row and column effects (interaction) |
|
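The partition of SS(B) in a balanced two-way design can be sketched as below. The 2 x 2 data with n = 2 per cell are hypothetical, and SS(RC) is obtained as the remainder SS(B) - SS(R) - SS(C), which is an assumption consistent with the partition described on the card.

```python
from statistics import mean

# cells[row][col] holds the observations for one treatment condition
cells = [
    [[1, 3], [3, 5]],     # R1: C1, C2
    [[5, 7], [11, 13]],   # R2: C1, C2
]
r, c, n = len(cells), len(cells[0]), len(cells[0][0])

grand = mean([x for row in cells for cell in row for x in cell])
cell_means = [[mean(cell) for cell in row] for row in cells]
row_means = [mean([x for cell in row for x in cell]) for row in cells]
col_means = [mean([x for row in cells for x in row[j]]) for j in range(c)]

# Between-cell variation and its three components
ss_b = n * sum((m - grand) ** 2 for row in cell_means for m in row)
ss_r = n * c * sum((m - grand) ** 2 for m in row_means)   # row main effect
ss_c = n * r * sum((m - grand) ** 2 for m in col_means)   # column main effect
ss_rc = ss_b - ss_r - ss_c                                # interaction

print("SS(B):", ss_b, "SS(R):", ss_r, "SS(C):", ss_c, "SS(RC):", ss_rc)
```

A nonzero SS(RC) here reflects that the column effect is larger in R2 (6 vs 12) than in R1 (2 vs 4), which is what an interaction means.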
when 2 way ANOVA results significant use ___tests to find significant main effect and ____analysis to determine significant interaction effect
|
post hoc
simple effect |
|
simple effect analysis
|
effect of one factor at each level of the other factor
-effects of rows at C1
-effects of rows at C2
-effects of columns at R1
-effects of columns at R2 |
|
calculating simple effects
|
-we can apply a one-way ANOVA for one factor repeatedly at each level of the other factor
-then when calculating the F ratio, use MS(W) from the two-way ANOVA in the denominator |
|
session 7 one way repeated measures ANOVA
|
..
|
|
repeated measures design
|
-measurements on a single DV repeated a number of times within the same subject
-ex) 2 conditions, before and after treatment
-N subjects measured on a single DV under K conditions/levels of a single IV |
|
One-way repeated measures ANOVA used for (2)
|
-examine effect of treatment (IV) (between group effects, Ho:μ1 = μ2 · · · = μk )
-test the effect of subjects (between subject effect, Ho: V=0) |
|
are we interested in effect of subjects or subject level variability
|
-not really
-if the effect is significant it just means subjects differ, but it has nothing to do with the IV |
|
prior assumptions (3) for one way repeated measures ANOVA
|
-distribution of observations on the DV is normal within each level of the treatment factor
-variances of observations are equal (homogeneity of variance) at each level of the treatment factor
-population covariance between any pair of repeated measurements is the same (homogeneous covariance) |
|
hypothesis in one way repeated measures ANOVA
for between group effect |
Ho: μ1 = μ2 · · · = μk
H1: Not all μ are the same |
|
hypothesis in one way repeated measures ANOVA for subject effect
|
Ho: V=0
H1: V≠ 0 |
|
in repeated measure designs is independence likely to hold
|
nope
|
|
when both homogeneity of variance (equal variance) and homogeneous covariance (equal covariance) assumptions are met we describe this as ____symmetry
|
compound
|
|
in one way repeated measures
k= n= |
number of conditions/levels of IV
number of subjects |
|
One-way repeated measures ANOVA:
SS(B)=
SS(S)=
SS(T)=
SS(BS)= |
-Variation between group means (treatment)
-Variation between row means (= subject means)
-total sum of the squared differences between all observations and the grand mean
-SS(T) - SS(B) - SS(S) |
|
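The four sums of squares for a one-way repeated measures design can be sketched as below, with rows as subjects and columns as conditions. The scores are hypothetical. Forming the treatment F as MS(B)/MS(BS), with DF(B) = k - 1 and DF(BS) = (k - 1)(n - 1), is the usual approach and is an assumption here.

```python
from statistics import mean

# scores[i][j]: subject i measured under condition j (hypothetical data)
scores = [
    [1, 2, 3],
    [2, 4, 6],
    [3, 6, 9],
]
n, k = len(scores), len(scores[0])  # n subjects, k conditions

grand = mean([x for row in scores for x in row])
cond_means = [mean([row[j] for row in scores]) for j in range(k)]
subj_means = [mean(row) for row in scores]

ss_b = n * sum((m - grand) ** 2 for m in cond_means)       # treatment
ss_s = k * sum((m - grand) ** 2 for m in subj_means)       # subjects
ss_t = sum((x - grand) ** 2 for row in scores for x in row)
ss_bs = ss_t - ss_b - ss_s                                 # residual

ms_b = ss_b / (k - 1)
ms_bs = ss_bs / ((k - 1) * (n - 1))
f_treatment = ms_b / ms_bs

print("SS(B):", ss_b, "SS(S):", ss_s, "SS(T):", ss_t, "SS(BS):", ss_bs)
print("F for treatment:", f_treatment)
```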
_____is a more general condition of CS
|
Sphericity
|
|
Mauchly’s W test
|
determines if sphericity/CS has been violated
if p<.05 CS is violated |
|
When CS assumption is violated, the omnibus F tests in one-way repeated measures ANOVA tend to be inflated, leading to more
|
false rejections of Ho
|
|
how to deal with CS violations
|
-use a conservative critical value based on the possible violation of CS
-inflation of the F test can be adjusted by evaluating it against a greater critical value, obtained by reducing the degrees of freedom
*known as the conservative F-test |
|
DF(B) = E(k-1) and DF(BS) = E(k-1)(n-1)
E measures ___
when CS holds E = ___
when CS is violated E is ___ than 1 |
extent to which CS is violated
1
less |
|
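The effect of E on the degrees of freedom can be illustrated with a small helper. The function name `adjusted_df` and the example values of k, n, and E are hypothetical; in practice E would come from software output such as the estimates named on the next card.

```python
def adjusted_df(k, n, epsilon):
    """Return (DF(B), DF(BS)) after multiplying both by epsilon (E)."""
    if not 0 < epsilon <= 1:
        raise ValueError("epsilon must be in (0, 1]")
    df_b = epsilon * (k - 1)
    df_bs = epsilon * (k - 1) * (n - 1)
    return df_b, df_bs

# k = 3 conditions, n = 10 subjects
print(adjusted_df(3, 10, 1.0))  # CS holds: DF unchanged
print(adjusted_df(3, 10, 0.5))  # CS violated: both DF are halved
```

Smaller degrees of freedom give a larger critical F, which is what makes the adjusted test more conservative.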
Greenhouse-Geisser & Huynh-Feldt estimates
which is smallest (more conservative) |
SPSS estimates of E
Greenhouse-Geisser |
|
DF (B) and DF (BS) are reduced when
|
CS is violated
|
|
when CS is violated get a ___critical value for F
|
larger
|