102 Cards in this Set
- Front
- Back
What are the 2 types of statistics?
|
Descriptive
Inferential |
|
Define Descriptive statistics.
|
Allows us to 'describe.' Involves:
measures of central tendency
measures of variability
crosstabs/Chi Square
Correlations |
|
What are the measures of central tendency? Define each.
|
Mean - average
median - the 'middle' element when the data is lined up in order of magnitude
mode - the data element that occurs most frequently |
|
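A minimal sketch of the three measures, using Python's standard `statistics` module. The `scores` list is made-up illustration data, not from the card set.

```python
import statistics

# Hypothetical scores; the single large value (21) pulls the mean upward.
scores = [2, 3, 3, 5, 7, 8, 21]

mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle element after sorting
mode_score = statistics.mode(scores)      # most frequently occurring element

print("mean:", mean_score)
print("median:", median_score)
print("mode:", mode_score)
```

In this made-up data the mean is larger than the median, which is larger than the mode, the pattern expected of a positively skewed distribution.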
What is the best measure of central tendency?
|
mean
|
|
What are the Levels of Data? Define them.
|
nominal - has no order; only gives names or labels to categories; e.g. sex being male or female
ordinal - has order but the interval b/w measurements has no meaning; e.g. Likert scale
interval - meaningful intervals b/w measurements but no true zero
ratio - highest level of measurement; has a starting point (zero) with meaningful intervals b/w measurements |
|
What is probability?
|
the chance that a particular event or outcome will occur
values range from 0 to 1
percentages range from 0 to 100% |
|
What is standard deviation (SD)?
|
a descriptive statistic measuring the degree of variability in a set of scores
square root of the variance |
|
Define Standard Error of the Mean (SEM)?
|
SEM = SD/square root of N
where N=sample size
when SEM is large there is more variability in the sample means |
|
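A short sketch of the SD and SEM formulas. The `scores` list is hypothetical, and the sketch assumes the sample SD (dividing by N - 1), which is what `statistics.stdev` computes; `statistics.pstdev` would give the population version (dividing by N).

```python
import math
import statistics

# Hypothetical sample of 8 scores.
scores = [4, 8, 6, 5, 3, 7, 9, 6]
n = len(scores)

variance = statistics.variance(scores)  # sample variance (divides by n - 1)
sd = math.sqrt(variance)                # SD is the square root of the variance
sem = sd / math.sqrt(n)                 # SEM = SD / square root of N

print("SD:", sd)
print("SEM:", round(sem, 4))
```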
What is a Z-score?
|
the number of SDs a score of interest lies from the mean
Z = (x-xbar)/SD
x=score
xbar=mean |
|
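The Z-score formula can be sketched as a one-line function. The function name `z_score` and the example numbers (a score of 130 on a scale with mean 100 and SD 15) are illustrative assumptions.

```python
def z_score(x, mean, sd):
    """Return how many SDs the score x lies from the mean."""
    return (x - mean) / sd

# Hypothetical example: score 130, mean 100, SD 15.
z = z_score(130, 100, 15)
print("Z =", z)
```

A positive Z means the score lies above the mean; a negative Z means it lies below.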
What does p<.05 mean?
|
if there were truly no difference (the null hypothesis is true), a result this extreme would occur by chance fewer than 5 times out of 100
it is a significance level |
|
What is a type I error?
|
known as alpha (usually .05)
reject the null hypothesis when it is true
it is a false positive |
|
What is a type II error?
|
beta (usually .2)
accept the null hypothesis when it is false
it is a false negative |
|
What are inferential statistics & give types of tests used.
|
explores the relationship b/w variables & the generalizability from sample to population
tests:
t-tests
ANOVA & ANCOVA
Linear & logistic regression |
|
characteristics of a positively skewed distribution
|
mean>median>mode
|
|
characteristics of a negatively skewed distribution
|
mode>median>mean
|
|
define parametric tests
|
DV is I/R
IV is nominal or I/R
uses T-test, ANOVA, ANCOVA or regression |
|
define non-parametric tests
|
Measurement level of variables is nominal or ordinal
uses Chi-Square Test or crosstabs |
|
True or False
In a normal distribution, the mean, median, and mode are equal. |
True
|
|
What is variability?
|
measures how dispersed the scores are in a distribution
the range alone is not a reliable measure of variability, as extreme scores can distort it |
|
What is variance?
|
average of the squared differences from the mean
|
|
what is standard deviation?
|
the square root of the variance (where variance is the average of the squared differences from the mean)
SD is a descriptive statistic that measures the degree of variability in a set of scores |
|
What is 1SD from the mean?
|
34%
if you want the range + or - 1, then it's 68% |
|
What is 2SD from the mean?
|
47.5%
if you want the range + or - 2, then it's 95% |
|
What is 3SD from the mean?
|
49.85%
if you want the range + or - 3, then it's 99.7% |
|
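The areas under a normal curve can be checked with `statistics.NormalDist` from the Python standard library. This is a small sketch; the helper name `area_within` is an assumption made for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1.
std_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Proportion of scores lying within + or - k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    both_sides = area_within(k) * 100  # range + or - k SD
    one_side = both_sides / 2          # from the mean out to k SD on one side
    print(f"{k} SD: one side {one_side:.2f}%, both sides {both_sides:.2f}%")
```

The exact areas are close to the rounded figures usually memorized: roughly 68%, 95%, and 99.7% for + or - 1, 2, and 3 SD.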
What level of data are these:
weight
sex: male/female
Likert scale |
weight - ratio
sex: male/female - nominal
Likert scale - ordinal |
|
When do we use the Chi Square test? What kind of data does it use?
|
We use Chi Squared when we want to know if the expected number differs significantly from the observed number.
uses nominal or ordinal data |
|
Characteristics of Chi squared
|
non-parametric test that measures if the expected number differs significantly from the observed
characterized by one parameter called degrees of freedom
measures are independent of each other
always positive |
|
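A hedged sketch of how the Chi Square statistic compares observed and expected counts in a crosstab. The 2x2 `observed` table is invented for illustration, and the expected counts are computed under the usual assumption of independence (row total x column total / N).

```python
# Hypothetical 2x2 crosstab of observed counts (rows = groups, columns = outcomes).
observed = [[30, 20],
            [10, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if the row and column variables were independent.
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (obs - expected) ** 2 / expected

# Degrees of freedom for a crosstab: (rows - 1) x (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print("chi-square:", round(chi_square, 3), "df:", df)
```

Because every term is a squared difference divided by a positive expected count, the statistic can never be negative.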
Define simple correlation
|
Pearson's R
relationship between two variables
ranges between -1 and +1
the absolute value of the coefficient reflects the strength of the correlation |
|
What does r=.8 mean?
|
strong positive correlation
|
|
What does r=.1 mean?
|
nothing or very weak correlation
|
|
What is shared variance?
|
the square of the correlation coefficient
it is a measure of the amount of variance shared by two variables
e.g. r=.20, so the shared variance is r squared, which = .04 or 4%
another way to say it is the IV accounts for 4% of the variance of the DV |
|
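A minimal sketch of Pearson's r and the shared variance (r squared), computed by hand so each step is visible. The paired `x` and `y` values are made up for illustration.

```python
import math

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# Sum of cross-products and sums of squared deviations from each mean.
sum_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sum_xx = sum((xi - x_bar) ** 2 for xi in x)
sum_yy = sum((yi - y_bar) ** 2 for yi in y)

r = sum_xy / math.sqrt(sum_xx * sum_yy)  # Pearson's r, between -1 and +1
shared_variance = r ** 2                 # proportion of variance shared

print("r:", round(r, 3))
print("shared variance:", round(shared_variance, 3))
```

Squaring removes the sign, so r = .702 and r = -.702 share the same amount of variance.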
assumptions of correlations
|
can be calculated with all levels of data
sample must represent the population
relationship of X & Y must be linear
assume homoscedasticity |
|
What is homoscedasticity?
|
variance homogeneity
for every value of X, the distribution of Y scores must have approximately equal variability |
|
What are the Rules of Thumb for strength of correlations?
e.g. 0.4-0.6 means what? |
0.8 to 1.0 (very strong relationship)
0.6 to 0.8 (strong relationship)
0.4 to 0.6 (moderate relationship)
0.2 to 0.4 (weak relationship)
0.0 to 0.2 (no to weak relationship) |
|
Which group of people are we talking about when we use the following terms?
statistic
parameter |
statistic ---> sample
parameter ---> population |
|
Define non-parametric.
|
categorical or non-normal data
Measurement level of variables is nominal or ordinal
Chi-Square Test
Crosstabs |
|
Define parametric.
|
based on normality of data
DV is I/R
IV is nominal or I/R
T-test, ANOVA and ANCOVA
Regression |
|
If your correlation statistic is -.702, what is the strength of the correlation?
|
strong negative correlation
|
|
How much variance is shared between 2 variables with an r= .702?
|
r2 = .702 *.702 = .493 or 49.3%
so the IV accounts for 49.3% of the variance of the DV, OR the variance shared between the 2 variables is 49.3% |
|
Z=2. What does this mean?
|
This is a Z-score, the number of SDs the score lies away from the mean.
Z=2 means the score of interest is 2SD away from the mean |
|
another name for significance.
|
P-value
|
|
What type of error is this or is it the correct decision?
The null hypothesis is true but the test rejects it |
Type I error
you are rejecting the null hypothesis when it's true (false positive) |
|
What type of error is this or is it the correct decision?
The null hypothesis is false but the test accepts it |
Type II error
you are accepting the null hypothesis when it's false (false negative) |
|
Why use a T-test? What are some characteristics of it?
|
to find the difference between 2 means in 2 groups
IV is nominal
DV (or outcome) is continuous (ordinal, I/R)
assume 2 mutually exclusive groups
equal variances are assumed |
|
What is the purpose of ANOVA?
|
to find if a difference between the means of 2 or more groups exists
|
|
What is the purpose of a Post-Hoc test?
|
to define the difference between groups
|
|
How many degrees of freedom in a T-test with 24 people in each group?
|
46
df = (24 people x 2 groups) - 2 |
|
For a T-test, what is the scale of measurement for the dependent variable?
|
ordinal or I/R --> treated as continuous
|
|
in a T-test, are the distributions of the 2 groups homogeneous?
|
yes, homogeneity of variance
|
|
what are the 2 types of T-tests?
|
independent - 2 independent groups that are mutually exclusive
paired - one group with 2 measures over time |
|
A Levene's test has the following results:
F=1.153
Sig.=.286
What does this mean? |
the result is not significant (p > .05), so there is no significant difference between the group variances; therefore we assume equal variances.
|
|
A Levene's test has the following results:
F=6.228
Sig.=.002
What does this mean? |
the result is significant (p < .05), so the group variances differ; therefore equal variances cannot be assumed
|
|
In an ANOVA test, what type of data is used for the IV?
|
nominal (categorical)
|
|
In an ANOVA test, what type of data is used for the DV?
|
ordinal, I/R (continuous)
|
|
What is the difference between a One-way and a Two-way ANOVA?
|
A one-way ANOVA has one IV and one DV.
A two way has two IV and one DV |
|
What is the purpose of a T-test?
|
It assesses the difference between the means of 2 groups.
|
|
In a T-test, what kind of variable is the IV? the DV?
|
IV is nominal
DV is ordinal, interval or ratio (continuous) |
|
Name and define the 2 types of T-tests.
|
independent T-test - looking at the difference in means of 2 mutually exclusive groups
paired T-test - looking at the difference of the means of one group with 2 measures over time. (e.g. pre & post test) |
|
What are the assumptions of a T-test?
|
IV is categorical (nominal)
2 mutually exclusive groups
normal distribution of the DV
homogeneity of variance (equal variances are assumed; checked with Levene's test) |
|
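A sketch of an independent T-test computed by hand, assuming equal variances (the pooled-variance form). The two groups are invented data, and the variable names are illustrative only.

```python
import math
import statistics

# Hypothetical scores for 2 mutually exclusive groups.
group_a = [5, 7, 8, 6, 9]
group_b = [3, 4, 6, 5, 2]

n_a, n_b = len(group_a), len(group_b)
mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

# Degrees of freedom: total number of people minus the 2 groups.
df = n_a + n_b - 2

# Pooled variance, valid only under the equal-variance assumption.
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df

# Standard error of the difference between the two means.
se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

t_stat = (mean_a - mean_b) / se_diff
print("t:", round(t_stat, 3), "df:", df)
```

With 24 people in each group, the same df formula gives (24 + 24) - 2 = 46.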
Define degrees of freedom.
|
the number of categories or classes being tested minus 1
|
|
What is the purpose of ANOVA?
|
tests the difference of means among more than 2 groups
|
|
What level of data is the IV in an ANOVA study? DV?
|
IV is nominal
DV is continuous (I/R) |
|
What are the 'assumptions' of an ANOVA?
|
same as the T-test
groups are mutually exclusive
normal distribution of DV
homogeneity of variance |
|
Define mean square.
|
the average amount of variance per degree of freedom
|
|
The F-statistic correlates with what test? The t-statistic correlates with what test?
|
F-statistic is the result of ANOVA
t-statistic is the result of a T-test |
|
What does the F-statistic mean in an ANOVA?
|
in order to reject the null hypothesis, the F-statistic must meet or exceed the critical value (that is looked up on a chart in the back of Munro)
|
|
True or False
The F-statistic in an ANOVA can either be positive or negative. |
FALSE
The F-statistic is always positive b/c we are only testing to see if there is a difference b/w groups, not the direction of the difference. |
|
True or False
the ANOVA produces only one critical F-value and it is always positive. |
TRUE
|
|
True or False
An ANOVA tells you there are differences b/w groups and where they are. |
FALSE
ANOVA only tells you there are differences. A post-hoc test tells you where those differences are. |
|
How is F derived in an ANOVA?
|
F = MS(between)/MS(within)
where MS=mean square |
|
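The F ratio can be sketched by hand for a one-way ANOVA. The three groups are made-up data; each mean square is the sum of squares divided by its degrees of freedom.

```python
import statistics

# Hypothetical scores for 3 mutually exclusive groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

all_scores = [score for group in groups for score in group]
grand_mean = statistics.mean(all_scores)

# Between-groups sum of squares: group means vs. the grand mean.
ss_between = sum(
    len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
)

# Within-groups sum of squares: each score vs. its own group mean.
ss_within = sum(
    (score - statistics.mean(group)) ** 2 for group in groups for score in group
)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

ms_between = ss_between / df_between  # variance per degree of freedom
ms_within = ss_within / df_within

f_stat = ms_between / ms_within
print("F:", round(f_stat, 3), "df:", df_between, df_within)
```

Both mean squares are built from squared deviations, which is why F cannot be negative.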
What is an ANCOVA test?
|
it is an ANOVA that controls for an extraneous variable (the covariate).
|
|
What are the assumptions of an ANCOVA?
|
the same as an ANOVA:
groups should be mutually exclusive
homogeneity of variances
DV should be normally distributed
the covariate should be continuous (e.g. age)
2 additional assumptions are:
the covariate & DV must show a linear relationship
direction & strength of the relationship b/w DV and covariate must be similar in each group --> homogeneity of regression |
|
What are the types of regression?
|
simple - one IV used to predict a DV
multiple - multiple IVs used to predict a DV
logistic - used when DV is categorical (nominal data) in nature |
|
Define the components of this equation:
Y = B0 + B1X1 + E |
Y = the predicted score or the DV (outcome)
B0 = constant (Y-intercept on a line graph)
B1 = regression coefficient, representing the amount Y changes when the IV (X) changes by one unit
E = error |
|
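A minimal sketch of estimating B0 and B1 for a simple regression by ordinary least squares. The data and the helper name `predict` are assumptions for illustration; the error term E shows up as the gap between each observed Y and its predicted value.

```python
# Hypothetical paired data: x is the IV, y is the DV.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# B1: change in Y for a one-unit change in X.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)

# B0: the constant (Y-intercept), chosen so the line passes through (x_bar, y_bar).
b0 = y_bar - b1 * x_bar

def predict(x_new):
    """Predicted Y for a given X on the fitted line."""
    return b0 + b1 * x_new

# Errors (residuals): observed Y minus predicted Y.
errors = [yi - predict(xi) for xi, yi in zip(x, y)]

print("B0:", round(b0, 3), "B1:", round(b1, 3))
print("prediction at x=6:", round(predict(6), 3))
```

For a least squares line the errors sum to zero, which is one sense in which the line runs through the center of the data.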
When do we use regression?
|
It's used to make predictions:
when the relationship b/w 2 variables is perfectly linear, knowledge of the value of one variable allows you to predict the value of the other variable with accuracy |
|
The regression line is also known as what?
|
The "line of best fit."
|
|
True or False
Regression is the line of best fit therefore it can curve through the scatterplot to create the best fit. |
FALSE
It is the best linear representation of the data |
|
Characteristics of the regression line.
|
It is the line of best fit.
The line passes through the exact center of the data on a scatterplot.
The distance b/w the point (value) & the line is the measurement error.
The regression line is the best line with the least amount of error. |
|
True or False
The regression line is the best line with the least amount of error. |
TRUE
|
|
In a regression, define the following:
R squared
F-statistic
T-statistic |
R squared is shared variance (it is the correlation coefficient squared)
F-statistic is the same as ANOVA, the overall significance of the model.
T-statistic is the significance of each IV |
|
In a regression equation, what does the y-intercept represent? the Y?
|
Y-intercept is the constant
Y is what you are solving for, the DV or the outcome |
|
What is the purpose of Logit regression?
|
It uses MLE (maximum likelihood estimation) to transform the probability of an event occurring into its odds
|
|
Define Odds Ratio.
|
ratio of 2 odds, where each odds is the probability of the event occurring versus the probability that it will not occur
|
|
What is a cohort study?
What is a case control study? |
both are epidemiological studies.
The cohort studies look at relative risk, and work from treatment to outcome. The case control studies look at the odds ratio and work from outcome to treatment. |
|
Which study does not obtain relative risk directly, cohort or case control studies?
|
case control studies
case control studies obtain the odds ratio which is then used to estimate the relative risk. It tends to overestimate it. |
|
True or False
Logistic regression is only used in case control studies to produce an odds ratio. |
FALSE
It is used in both cohort and case control studies. It produces an odds ratio which is often interpreted as relative risk. |
|
Define odds.
|
another way of presenting probability
the probability of occurrence over the probability of non-occurrence |
|
Define odds ratio
|
comparing the odds of 2 groups
e.g. the odds of rolling a six if female
group 1 - odds of rolling a 6
group 2 - odds of being female |
|
Define probability, odds, and odds ratio.
|
probability - measure of likelihood of an event happening
odds - the probability of occurrence over the probability of non-occurrence
odds ratio - comparing the odds of 2 groups |
|
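The three definitions can be sketched in a few lines. The probabilities for the two groups are hypothetical, and the function name `odds` is an assumption for illustration.

```python
def odds(p):
    """Convert a probability into odds: occurrence over non-occurrence."""
    return p / (1 - p)

# Hypothetical probabilities of the event in 2 groups.
p_group_1 = 0.5
p_group_2 = 0.2

odds_1 = odds(p_group_1)
odds_2 = odds(p_group_2)

# Odds ratio: the odds in group 1 compared with the odds in group 2.
odds_ratio = odds_1 / odds_2

print("odds group 1:", odds_1)
print("odds group 2:", odds_2)
print("odds ratio:", odds_ratio)
```

An odds ratio of 1.00 would mean the two groups have equal odds.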
Which study uses odds ratio, cohort or case control?
|
case control
|
|
why use odds ratio?
|
provides an estimate of the relationship b/w binary variables (1 & 0, or nominal data)
|
|
What is relative risk?
|
the risk given one condition versus the risk given the other condition.
A more direct method of calculating 'odds' (for lack of a better word) |
|
True or False
the odds ratio is an accurate estimate of relative risk. |
FALSE
it is at least equivalent to but often overestimates relative risk. |
|
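A small sketch comparing relative risk and the odds ratio on the same 2x2 table. The counts are invented, and the labels "exposed" and "unexposed" are assumptions used only to name the two conditions.

```python
# Hypothetical counts: people with the outcome and total people per condition.
exposed_events, exposed_total = 30, 100
unexposed_events, unexposed_total = 10, 100

# Risk = probability of the outcome under each condition.
risk_exposed = exposed_events / exposed_total
risk_unexposed = unexposed_events / unexposed_total
relative_risk = risk_exposed / risk_unexposed

# Odds = events over non-events under each condition.
odds_exposed = exposed_events / (exposed_total - exposed_events)
odds_unexposed = unexposed_events / (unexposed_total - unexposed_events)
odds_ratio = odds_exposed / odds_unexposed

print("relative risk:", round(relative_risk, 3))
print("odds ratio:", round(odds_ratio, 3))
```

In this made-up table the odds ratio lies further from 1.00 than the relative risk does; the gap shrinks as the outcome becomes rarer.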
Interpret this.
OR= 2.53, 95% CI: 1.66 - 3.55 |
positive (95% CI: 1.66 - 3.55)
two and a half times the odds of having the outcome
significant b/c the CI doesn't include 1.00 (1.00 = equal odds) |
|
Interpret this.
OR= 0.60, 95% CI: 0.26 - 0.92 |
negative (95% CI: 0.26 - 0.92)
40% lower odds of having the outcome (1 - 0.60)
significant b/c the CI doesn't include 1.00 (1.00 = equal odds) |
|
In logit, what does the Hosmer and Lemeshow Test tell us?
|
That is the 'goodness of fit' test. If the test is not significant (p>.05), then the data fits the model.
|
|
True or False
A raw score tends to overestimate an adjusted score. |
TRUE
|
|
What is ethnography?
|
study of culture
Describes & analyzes aspects of ways of life of a particular culture, subculture, or subculture groups |
|
4 types of qualitative research designs.
|
phenomenology
grounded theory
ethnography
historical research |
|
What is phenomenology?
|
to describe the 'lived experience' of study participants
|
|
What is grounded theory?
|
how people deal with a phenomenon over time
explores social processes with the goal of developing a theory |
|
What is historical research?
|
examines events of the past
|