194 Cards in this Set
statistics is
|
a set of mathematical procedures used to summarize or draw conclusions from data.
|
|
data are
|
values representing the measurement
of one or more variables. |
|
a variable is
|
anything that varies, or changes.
|
|
a score is
|
a particular person’s value on a variable.
|
|
descriptive statistics are
|
procedures used to summarize and present data
|
|
inferential statistics are
|
procedures used to draw conclusions about a
population based on data from a sample |
|
a population is
|
the complete set of all individuals
of interest in a study; usually too large to measure in its entirety. |
|
a sample is
|
a manageable subset of the population
|
|
a parameter is
|
a value that summarizes a
characteristic of the entire population. |
|
a parameter is
|
constant
• Because all the data of interest are used to calculate the parameter, there is no additional data that can change its value.
– But parameters are typically unknown. |
|
a statistic is
|
a value that summarizes a
characteristic of a sample. – It is typically used to estimate the parameter of a real or hypothetical population. |
|
a statistic is
|
variable
• Because different samples may contain different scores from the population, statistics of the same variable may differ between samples. |
|
two types of categorical scales
|
nominal and ordinal
|
|
nominal scales are
|
• values label or identify observations (e.g., male/female).
• mathematical operations: = and ≠ |
|
ordinal scales are
|
• values rank order observations (e.g., letter grades)
• mathematical operations: =, ≠, <, and >
• intervals are not necessarily equal |
|
two types of quantitative scales
|
interval and ratio
|
|
interval scales are
|
• values order observations into equally spaced intervals (e.g., degrees Fahrenheit).
• mathematical operations: =, ≠, <, >, +, and −
• × and ÷ are not meaningful because there is no true zero point |
|
ratio scales are
|
• have a true zero point (e.g., inches).
• mathematical operations: =, ≠, <, >, +, −, ×, and ÷ |
|
discrete data are
|
– there are not an infinite number of meaningful values between any two neighboring values
– all categorical scales produce discrete data
– quantitative scales may produce discrete or continuous data |
|
continuous data are
|
– there are an infinite number of meaningful values between any two neighboring values
– possible values are limited only by the precision of the measurement instrument |
|
a real limit is
|
– the range of values within which the true value of a continuous measurement exists
upper real limit = value + ½(measurement unit)
lower real limit = value − ½(measurement unit) |
|
a measurement unit is
|
the lowest measurable value greater than zero
|
|
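A minimal Python sketch of the real-limit formulas above (the 68.5-inch value and 0.1-inch measurement unit are hypothetical):

    def real_limits(value, unit):
        # real limits of a continuous value: value ± half the measurement unit
        return value - unit / 2, value + unit / 2

    lower, upper = real_limits(68.5, 0.1)  # height measured to the nearest 0.1 inch
    print(lower, upper)                    # 68.45 68.55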
X and Y are
|
the scores on each variable
|
|
N is
|
total number of participants in an
entire study or population |
|
n is
|
number of participants in a condition
of a study or a single sample |
|
sigma (Σ) is
|
summation (add up all the scores)
|
|
rules of priority
|
• all operations
– inside parentheses
– in the numerator and denominator
– under a square root
• exponentiation
• negation
• multiplication and division
• addition and subtraction |
|
frequency (f)
|
– the number of times a value of a particular
variable occurs in the data |
|
frequency distribution
|
– the complete set of frequencies for each value
of a variable |
|
making a frequency table
|
If the data are nominal,
the order of the categories is up to you. Otherwise, the values of the variable should be listed from highest (top) to lowest (bottom). |
|
cumulative frequency (cf)
|
the frequency of scores less
than or equal to a given value |
|
proportion (p)
|
p = f/N
|
|
percentage (%)
|
% = 100(p)
|
|
cumulative percentage (c%)
|
– the percentage of scores less than or equal to a given value
c% = 100(cf/N) |
|
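A short Python sketch tying together f, cf, p, %, and c% for a small made-up data set:

    from collections import Counter

    scores = [3, 1, 3, 2, 3, 2, 1, 3]   # hypothetical scores
    N = len(scores)
    freq = Counter(scores)              # frequency (f) of each value
    cf = 0
    for value in sorted(freq):          # lowest to highest
        cf += freq[value]               # cumulative frequency (cf)
        p = freq[value] / N             # proportion p = f/N
        print(value, freq[value], cf, 100 * p, 100 * cf / N)  # f, cf, %, c%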
graphs
|
– X axis (abscissa) = categories or values of the X variable
– Y axis (ordinate) = the statistic being presented |
|
bar graphs
|
– for categorical (nominal or ordinal) data
|
|
histograms
|
– for quantitative (interval or ratio) data
|
|
frequency polygons
|
– for quantitative (interval or ratio) data
|
|
normal distribution
|
– a symmetrical “bell-shaped” distribution in which
frequency steadily increases as values approach the midpoint of the distribution |
|
skew
|
the degree to which the distribution is unsymmetrical
|
|
positive skew
|
• most scores are on the low
end of the distribution |
|
negative skew
|
• most scores are on the
high end of the distribution |
|
modes
|
– the number of distinct “peaks”
– one distinct peak = unimodal
– two distinct peaks = bimodal
– more than two = multimodal |
|
central tendency
|
– the typical score
|
|
mean
|
• arithmetic average of the scores
|
|
median
|
• middle score
|
|
mode
|
the most frequent score
|
|
notation for the mean
|
– μ (mu) for the population mean (usually unknown)
– M for a sample mean |
|
mean is
|
the balancing point of
a distribution |
|
a deviation
|
is each score minus the mean (X − M).
• The sum of the deviations always equals zero. |
|
The mean is sensitive to outliers
|
– extreme scores that skew the distribution
– the mean is a poor measure of central tendency when there are outliers |
|
the median is not affected by
|
outliers
|
|
what do you do when the mean is sensitive to outliers?
|
find the median
|
|
the grand mean
|
the mean of several
sample means. |
|
when to use the median instead of the mean
|
– if the data include outliers
– if the data include undetermined values
– if the distribution is open-ended
– if the data are ordinal |
|
the mode is
|
the only measure of central tendency for nominal data
• Otherwise, not a good measure, because distributions can have more than one distinct mode or no mode at all |
|
range
|
– upper real limit of the highest score − lower real limit of the lowest score
– for whole numbers: highest score − lowest score + 1 |
|
interquartile range
|
– the difference between the first and third
quartiles |
|
quartiles are
|
values that divide a distribution of scores into four equal parts
• first quartile (Q1) divides bottom 25% from top 75%
• second quartile (Q2) divides bottom 50% from top 50%; same as the median
• third quartile (Q3) divides bottom 75% from top 25% |
|
semi-interquartile range
|
– half the interquartile range
SIQR = (Q3 – Q1)/2 |
|
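A quick Python sketch of range, IQR, and SIQR for hypothetical whole-number data (exact quartile values depend on the interpolation method; statistics.quantiles uses the exclusive method by default):

    import statistics

    scores = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15]       # hypothetical data
    rng = max(scores) - min(scores) + 1              # range for whole numbers
    q1, q2, q3 = statistics.quantiles(scores, n=4)   # quartiles Q1, Q2, Q3
    iqr = q3 - q1                                    # interquartile range
    siqr = iqr / 2                                   # semi-interquartile range
    print(rng, iqr, siqr)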
variance (σ²)
|
the average squared deviation
|
|
standard deviation (σ)
|
– the square root of the variance
|
|
populations vs. samples
|
A sample tends to have less variability than the population from
which it was drawn. |
|
how is a statistic biased?
|
A statistic is biased if, on average, it underestimates
or overestimates its corresponding parameter. |
|
how is a statistic unbiased?
|
Dividing by n − 1 results in an unbiased estimate of the population variance and standard deviation.
– A statistic is unbiased if, on average, it equals its corresponding parameter. |
|
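A minimal Python sketch contrasting the biased (divide by N) and unbiased (divide by n − 1) variance formulas; the data are made up:

    def variance(scores, sample=True):
        m = sum(scores) / len(scores)
        ss = sum((x - m) ** 2 for x in scores)        # sum of squared deviations
        # divide by n - 1 for an unbiased sample estimate, by N for a population
        return ss / (len(scores) - 1) if sample else ss / len(scores)

    data = [4, 6, 8, 10]
    print(variance(data, sample=False))   # population variance σ² = 5.0
    print(variance(data, sample=True))    # unbiased sample variance s² ≈ 6.67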
degrees of freedom (df)
|
– number of values that are free to vary in a sample
– for a single sample with one variable, df = n – 1 |
|
what do the mean and standard deviation tell us?
|
The mean (μ) and standard deviation (σ) tell us the position of a score relative to other scores in the population.
|
|
raw scores
|
are the data directly observed from
the measurements |
|
standardized scores
|
indicate the relative position
of the raw scores in a distribution |
|
standardizing changes the scale how?
|
Standardizing changes the scale into units of
variability (i.e., standard deviations) |
|
z-scores are what type of data?
|
interval data
|
|
problems with z-scores
|
– large z-scores still appear small
– a z-score of zero is average
– negative values are confusing
• Solution: convert z-scores by changing the mean and standard deviation |
|
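A small Python sketch of the conversion described above; the IQ-style scale (mean 100, SD 15) and the sample numbers are illustrative:

    def z_score(x, mu, sigma):
        return (x - mu) / sigma            # z = (X - μ) / σ

    def rescale(z, new_mean, new_sd):
        # convert a z-score to a new scale to avoid small/negative values
        return new_mean + z * new_sd

    z = z_score(130, 100, 15)              # z = 2.0
    print(rescale(z, 50, 10))              # 70.0 on a scale with mean 50, SD 10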
scatterplot
|
– a two-dimensional plot of each subject based on
their scores on two variables |
|
correlation
|
the linear relationship between two variables
|
|
positive correlation
|
as one variable increases, the other variable tends
to increase |
|
negative correlation
|
as one variable increases, the other variable tends
to decrease |
|
correlation direction
|
Whether a correlation is positive or negative
is called its direction |
|
correlation magnitude
|
The accuracy of one variable’s prediction of
the other variable is called the magnitude. |
|
Pearson product moment correlation
coefficient (r) |
– measures the degree to which the scatterplot forms a straight line
– ranges from −1 (perfect negative correlation) to 0 (no correlation) to 1 (perfect positive correlation)
– the sign indicates direction
– the number indicates magnitude |
|
definitional formula of the pearson r
|
– the average cross-product of the z-scores
|
|
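A sketch of the definitional formula in Python, treating r as the average cross-product of z-scores (using population standard deviations, i.e., dividing by N; the data are invented):

    def pearson_r(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        sx = (sum((x - mx) ** 2 for x in xs) / n) ** 0.5
        sy = (sum((y - my) ** 2 for y in ys) / n) ** 0.5
        # average cross-product of the z-scores
        return sum(((x - mx) / sx) * ((y - my) / sy)
                   for x, y in zip(xs, ys)) / n

    print(pearson_r([1, 2, 3, 4], [2, 4, 5, 9]))   # ≈ .96 for these scores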
r² = coefficient of determination
|
the proportion of variance in one variable (Y) that
can be predicted from the other variable (X). |
|
1 − r² = coefficient of nondetermination
|
the proportion of variance in one variable (Y) that
cannot be predicted from the other variable (X). |
|
very strong pearson r means:
|
r ≥ .70
r² ≥ .49 |
|
strong pearson r means:
|
r = .50 to .69
r² = .25 to .48 |
|
moderate pearson r means:
|
r = .30 to .49
r² = .09 to .24 |
|
weak pearson r means:
|
r = .10 to .29
r² = .01 to .08 |
|
very weak pearson r means:
|
r ≤ .10
r² ≤ .01 |
|
spurious correlations
|
are correlation coefficients
that are artificially high or low |
|
outliers
|
a score more than 3 standard deviations from the mean; such scores may be removed from the data, but the removal must be reported
|
|
restricted range
|
when the range of one or both variables is limited (e.g., Harvard admitting only applicants with SAT scores above 1300), which can distort the correlation
|
|
curvilinearity
|
when the relation between two variables is better described as a curve
|
|
sample size
|
the larger the sample, the more reliable the correlation
|
|
n should be greater than or equal to
|
30
|
|
Given any correlation, there are three
possible causal relationships. |
X → Y
Y → X
a third factor → both X and Y |
|
in a standard normal distribution:
|
Approx. 68% of the distribution is within 1 σ of μ.
Approx. 95% of the distribution is within 2 σ of μ.
Approx. 99.7% of the distribution is within 3 σ of μ. |
|
percentile rank (PR)
|
– the percentage (%) of scores less than or equal to a
given score (X) rounded to the nearest whole number. |
|
unit normal table
|
lists the proportion
of the standard normal distribution less than and greater than any given z-score |
|
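Rather than looking up a printed unit normal table, the body and tail proportions for any z can be computed; a stdlib-only Python sketch using the error function:

    import math

    def body_and_tail(z):
        below = 0.5 * (1 + math.erf(z / math.sqrt(2)))   # proportion below z
        above = 1 - below                                # proportion above z
        return max(below, above), min(below, above)      # body p, tail p

    print(body_and_tail(1.0))   # ≈ (0.8413, 0.1587)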
Body p
|
is the larger proportion of the
distribution associated with a given z-score. |
|
Tail p
|
is the smaller proportion of the
distribution associated with a given z-score. |
|
percentile
|
– the score (X) at which a given percentage (%) of
the distribution is equal to or less than that score. |
|
probability (p)
|
– the frequency (f) of an outcome relative to all possible outcomes (N): p = f/N
– probability = proportion
– probability ranges from .00 to 1.00
– assumes random sampling |
|
random sampling
|
each outcome or individual has an equal chance of
occurring or being selected from the population |
|
sampling error
|
the discrepancy between a sample statistic and the population parameter
– sampling error causes variability in the sample means |
|
sampling distribution (of the mean)
|
– the theoretical distribution of sample means if
all possible samples of equal size were selected |
|
Mean of the sampling distribution
|
– the expected value of any given sample mean
– on average, the sample mean will equal the mean of the sampling distribution |
|
standard error
|
– the standard deviation of the sampling distribution
– standard error is a measure of sampling error |
|
as the standard error decreases
|
so does the average discrepancy
between the sample mean and the population mean |
|
central limit theorem rule 1
|
1. the mean of the sampling distribution equals the population mean
– thus, on average, the sample mean equals the population mean |
|
central limit theorem rule 2
|
2. the standard error equals the population standard deviation divided by the square root of the sample size: σM = σ/√n
– standard error is an index of sampling error
– thus, as sample size (n) increases, sampling error decreases, which leads to the law of large numbers |
|
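A one-line Python sketch of rule 2 (the σ and n values are hypothetical):

    import math

    def standard_error(sigma, n):
        return sigma / math.sqrt(n)        # σM = σ / √n

    print(standard_error(15, 25))          # 3.0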
law of large numbers
|
The larger the sample, the more likely the sample
mean will equal the population mean |
|
central limit theorem rule 3
|
3. the shape of the sampling distribution approaches normal as sample size (n) increases
– the sampling distribution is practically normal when…
• the population distribution is normal, or
• the sample size (n) is greater than 30 |
|
research hypotheses
|
– a statement that outlines the predicted relationship
between the variables of interest – e.g., rewards for good grades affect students’ IQ |
|
statistical hypotheses
|
– two mutually exclusive mathematical propositions
about the parameter(s) of an unknown population |
|
Null hypothesis (H0)
|
– states the expected value of a parameter (e.g., μ) if the treatment has no effect (i.e., the research hypothesis is false)
– e.g., “rewards do not affect students’ IQ” would be written as H0: μ = 100, where μ is the hypothesized mean of the unknown population |
|
Alternative hypothesis (H1)
|
– states the opposite of the null hypothesis
– e.g., “rewards do affect students’ IQ” would be written as H1: μ ≠ 100 |
|
hypothesis test
|
– a procedure for estimating the probability that a sample was drawn from a population if the null hypothesis is true
– tests whether the null is false, not whether the alternative is true |
|
deductive falsification
|
• a procedure in which evidence is sought leading to the
conclusion that a hypothesis is incorrect • based on the logic of modus tollens |
|
Modus tollens (denying the consequent)
|
– If A, then B. (If there is fire, then there is oxygen.)
– Not B. (There is no oxygen.)
– Therefore, not A. (Therefore, there is no fire.)
• Affirming the consequent is not valid:
– If A, then B. (If there is fire, then there is oxygen.)
– B. (There is oxygen.)
– Therefore, A. (Therefore, there is fire.) |
|
Modus tollens applied to hypothesis testing
|
– If the hypothesis is true, then a prediction follows (If H, then P).
– If the prediction is not true (not P), then the hypothesis is not true (therefore, not H).
– If the prediction is true (P), concluding that the hypothesis is true (therefore, H) is not valid.
• Because we can only falsify a hypothesis, we either reject the null or fail to reject the null |
|
critical region
|
– the extreme values that are unlikely to be obtained if the null is true |
|
Alpha
|
– the probability of obtaining a
sample from the critical region. – in the behavioral sciences, alpha typically equals .05 |
|
the p-value
|
– the probability of selecting a
sample more extreme than the current sample if the null hypothesis is true. |
|
if p < α
|
• then the sample is in the critical region
• reject the null hypothesis • the result is significant |
|
if p > α
|
• then the sample is not in the critical region
• fail to reject the null hypothesis • the result is not significant |
|
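The two decision rules above amount to a simple comparison; a minimal Python sketch (α = .05 is the conventional behavioral-science default):

    def decide(p, alpha=0.05):
        # reject the null only when the sample falls in the critical region
        if p < alpha:
            return "reject the null hypothesis (significant)"
        return "fail to reject the null hypothesis (not significant)"

    print(decide(0.03))   # reject the null hypothesis (significant)
    print(decide(0.20))   # fail to reject the null hypothesis (not significant)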
two-tailed tests
|
– p and α are divided between the two tails of the
distribution – Used for nondirectional hypotheses |
|
non-directional hypotheses
|
research hypotheses that do not specify the direction
of the relationship between the variables |
|
format for statistical hypotheses for two-tailed tests
|
H0: μ = value and H1: μ ≠ value
|
|
one-tailed tests
|
– p and α are in one tail of the distribution
– Used for directional hypotheses |
|
directional hypotheses
|
research hypotheses that specify the direction of the
relationship between the variables • keywords: greater, more, higher, better, increases, less, lower, worse, decreases |
|
format for statistical hypotheses for one-tailed tests
|
H0: μ ≤ value and H1: μ > value
H0: μ ≥ value and H1: μ < value |
|
research hypothesis for one-tailed test
|
Research hypothesis:
– rewards increase students’ IQ
• Statistical hypotheses: H0: μ ≤ 100; H1: μ > 100
• α and p are in the right tail |
|
research hypothesis for one-tailed test
|
• Research hypothesis:
– rewards decrease students’ IQ
• Statistical hypotheses: H0: μ ≥ 100; H1: μ < 100
• α and p are in the left tail |
|
Type I error
|
– rejecting the null hypothesis when it is true
– alpha (α) is the probability of a Type I error |
|
Type II error
|
– failing to reject the null hypothesis when it is
false – beta (β) is the probability of committing a Type II error |
|
power
|
1 − β
• the probability of rejecting the null hypothesis if it is false
• power should be > .80 |
|
understanding power
|
look at notes
|
|
power increases as
|
the effect size increases.
|
|
effect size
|
is the difference between the population mean
based on H0 (μ0) and the actual population mean (μ1). |
|
power increases as
|
the standard error decreases
|
|
power increases as
|
the standard error decreases, which happens when…
– the standard deviation decreases
– the sample size increases |
|
power increases as
|
α increases
|
|
power increases as
|
α increases
– but increasing alpha also increases the probability of a Type I error |
|
when sample sizes are small
|
When sample sizes are small (n < 30), the
sampling distribution may not be normal |
|
when standard deviation is unknown
|
When σ is unknown, the standard error
cannot be calculated |
|
student's t
|
Uses the sample standard deviation (s) to
estimate the standard error |
|
using student's t
|
When you use the estimated standard error
to standardize the sample mean, the statistic is called t instead of z. |
|
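A minimal Python sketch of the single-sample t statistic, standardizing M with the estimated standard error s/√n (the numbers are made up):

    import math

    def t_statistic(m, mu, s, n):
        return (m - mu) / (s / math.sqrt(n))   # t = (M - μ) / (s / √n)

    print(t_statistic(104, 100, 12, 36))       # 2.0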
t distribution
|
a symmetrical distribution with a mean of zero that gets wider and flatter as df decrease
– when df = ∞, the distribution is perfectly normal and t = z |
|
if |t| > tc then p < α
|
then reject the null hypothesis
|
|
Significance does not indicate ...
|
Significance does not indicate effect size.
– Remember, effect size is the difference between the actual population mean and the population mean based on the null hypothesis.
– Small effects can be significant if the sample size is large enough, because larger samples have less sampling error and thus more power. |
|
estimating effect size
|
If, on average, the sample mean equals the
actual population mean, then M – μ is an estimate of the effect size (where μ is based on the null hypothesis) |
|
cohen's d
|
is the standardized estimate of
effect size |
|
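A sketch of one common single-sample form of Cohen's d, d = (M − μ)/s; the values are hypothetical:

    def cohens_d(m, mu, s):
        # estimated effect size in standard-deviation units
        return (m - mu) / s

    print(cohens_d(104, 100, 12))   # ≈ 0.33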
large effect size (cohen's d)
|
greater than .80
|
|
medium effect size (cohen's d)
|
.20 - .80
|
|
small effect size (cohen's d)
|
less than .20
|
|
related samples t test
|
A variation of the single-sample t test used for:
– within-subjects designs
– matched-subjects designs
• The null hypothesis states that, on average, the difference between the scores is zero. |
|
within-subjects (repeated measures) design
|
• a research design in which the individuals of a single
sample are measured more than once |
|
matched-subjects design
|
• a research design in which individuals from one sample
are paired with individuals from another sample based on one or more variables that the researcher wants to control |
|
independent samples t test
|
Used for between-subjects designs
• The samples are drawn from two theoretical populations.
• The null hypothesis states that the difference between the means of the two populations is zero. |
|
between-subjects design
|
a research design in which the means of different
unmatched samples are compared (e.g., an experimental sample and a control sample) |
|
Sampling distribution of the difference
(between the means) |
– the distribution of all possible differences between the means of all possible samples of a given size
|
|
Sampling distribution of the difference
(between the means) |
– the mean of this distribution = μ1 − μ2
• if the null hypothesis is true, then the mean of this distribution is equal to zero
– the standard deviation of this distribution is called the standard error of the difference |
|
Pairwise error rate (α)
|
– the probability of committing a Type I error
when comparing two means |
|
Experimentwise (familywise) error rate
|
– the total probability of committing a Type I error when performing more than one pairwise comparison
• equal to 1 − (1 − α)^c, where c = the number of comparisons
• the probability of committing at least one Type I error increases as the number of comparisons increases |
|
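A quick Python check of the experimentwise error formula (α = .05 and c = 3 comparisons are illustrative):

    def experimentwise_error(alpha, c):
        # probability of at least one Type I error across c comparisons
        return 1 - (1 - alpha) ** c

    print(experimentwise_error(0.05, 3))   # ≈ 0.143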
Analysis of variance (ANOVA)
|
– determines whether the variance between groups significantly exceeds the variance within groups due to error |
|
Within-group variance (σ²w)
|
– the total variance within each group
– variability in subjects’ responses due to error
• extraneous variables other than the treatment or independent variable that affect subjects’ scores |
|
Between-group variance (σ²b)
|
– the variance among the means of each group
|
|
treatment variance (σ²t)
|
refers to the variability
in people’s responses due to the treatment or independent variable only |
|
ANOVA
|
If treatment variance = 0, then F = 1
• We reject the null if F is significantly greater than 1. |
|
F distribution
|
an unsymmetrical distribution that becomes more
and more positively skewed as the degrees of freedom within and the degrees of freedom between decrease |
|
total sum of squares (SST)
|
is the SS of all the scores
across all groups |
|
total degrees of freedom
|
(dfT) is the df of all the
scores across all groups |
|
Factorial design
|
is a research design in which
the effects of more than one factor on the dependent variable are examined. |
|
factor
|
A factor is the same as the treatment or independent
variable, or any categorical variable (e.g., gender) that may be related to the dependent variable. |
|
ANOVA is used to
|
ANOVA is used to analyze factorial designs
– a two-way ANOVA analyzes two factors, a three-way ANOVA analyzes three factors, etc. |
|
main effect of B
|
– What is the effect of B ignoring A?
– Does the mean time following the “persist” instructions differ from the mean time following the “do not persist” instructions? |
|
Main effect of A
|
– What is the effect of A ignoring B?
– Does the mean time for low SE people differ from the mean time for high SE people? |
|
A x B interaction
|
– Does the effect of B depend on A?
– Does mean time difference between instructions for low SE people differ from the mean time difference between instructions for high SE people? |
|
Non-parallel lines usually
indicate ... |
Non-parallel lines usually
indicate an interaction. |
|
Parallel lines indicate...
|
Parallel lines indicate no
interaction |
|
graphing factorial designs
|
look over in notes
|
|
Two-way ANOVA
|
Two-way analysis of variance
• also called a kA x kB ANOVA, where…
– kA is the number of levels of factor A
– kB is the number of levels of factor B
– kA·kB = the number of groups/conditions in the experiment
• for example:
– a 2 x 2 ANOVA has 4 groups
– a 2 x 3 ANOVA has 6 groups
– a 3 x 3 ANOVA has 9 groups
– a 3 x 4 ANOVA has 12 groups, etc. |
|
When reporting the results of an two-way ANOVA...
|
When reporting the results of a two-way ANOVA, state whether each main effect and the interaction were significant or not.
• Example: The 2 x 3 ANOVA revealed a significant main effect for anger, F(1, 54) = 5.00, p < .05. The main effect for cues was not significant, F(2, 54) = 3.00, p > .05. The Anger x Cues interaction was significant, F(2, 54) = 6.00, p < .05. |
|
Parametric tests
|
– inferential statistics that assume certain characteristics of the population are true
– t tests and ANOVAs are parametric |
|
Nonparametric tests
|
– inferential statistics that make few or no assumptions about the characteristics of the population |
|
Assumptions of parametric tests
|
– normality
• the population distribution is normal
– homogeneity of variance
• when comparing two or more groups, the population variances of each group are equal
– data are quantitative (interval or ratio) |
|
Why use parametric tests?
|
– Parametric tests have more power than
nonparametric tests. – Generally, if sample sizes are large (N > 30), t tests and ANOVAs are robust against violations of normality and homogeneity of variance. |
|
Robust
|
means that the statistical tests still perform well,
even though certain assumptions may not be true. |
|
Why use nonparametric tests?
|
– If parametric assumptions have been violated and
sample sizes are small (N < 30), then parametric tests may lead to an inflated probability of a Type I error. – If the data are categorical (nominal or ordinal), parametric tests should not be used. |
|
Chi-square Goodness of Fit Test
|
• nonparametric test of whether the observed
frequencies in each category conform to (i.e., “fit”) the frequencies expected based on the null hypothesis. |
|
fo
|
is the observed frequency
|
|
fe
|
is the expected frequency
|
|
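The goodness-of-fit statistic combines fo and fe as χ² = Σ (fo − fe)²/fe; a minimal Python sketch with invented frequencies:

    def chi_square(observed, expected):
        return sum((fo - fe) ** 2 / fe
                   for fo, fe in zip(observed, expected))

    print(chi_square([25, 15], [20, 20]))   # 2.5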
Is the χ² significant?
|
the distribution is
positively skewed, and becomes flatter and longer as degrees of freedom increase. |
|
Chi-square Test for Independence
|
• nonparametric test of whether the observed
frequencies associated with two variables are independent (i.e., unrelated). |
|
Writing in APA Style
for chi-square test for independence |
Extroverts were significantly more likely to conform than introverts, χ²(1, N = 60) = 4.15, p < .05. |