What are the 2 types of statistics?

Descriptive
Inferential 

Define Descriptive statistics.

Allows us to 'describe.' Involves:
measures of central tendency
measures of variability
crosstabs/Chi-Square
correlations

What are the measures of central tendencies? Define each.

mean - average
median - the 'middle' element when the data is lined up in order of magnitude
mode - the data element that occurs most frequently
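A minimal sketch of the three measures, using Python's standard `statistics` module on made-up scores (the data is an assumption for illustration only):

```python
import statistics

scores = [2, 3, 3, 5, 7, 10]  # made-up example data

mean = statistics.mean(scores)      # sum of scores / number of scores
median = statistics.median(scores)  # middle value; averages the two middle values when n is even
mode = statistics.mode(scores)      # most frequently occurring value

print(mean, median, mode)
```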

What is the best measure of central tendency?

mean


What are the Levels of Data? Define them.

nominal - has no order; only gives names or labels to categories; e.g. sex being male or female
ordinal - has order but the interval b/w measurements has no meaning; e.g. Likert scale
interval - meaningful intervals b/w measurements but no true zero
ratio - highest level of measurement; has a starting point (zero) with meaningful intervals b/w measurements

What is probability?

the chance that a particular event or outcome will occur
values range from 0 to 1
percentages range from 0 to 100%

What is standard deviation (SD)?

a descriptive statistic measuring the degree of variability in a set of scores
square root of the variance 

Define Standard Error of the Mean (SEM)?

SEM = SD / square root of N
where N = sample size
when SEM is large there is more variability in the sample means
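A short sketch of the SEM formula on a made-up sample. It assumes the sample SD (dividing by N - 1), which is what `statistics.stdev` computes:

```python
import math
import statistics

sample = [4, 8, 6, 5, 3, 7, 9, 6]  # made-up example data

n = len(sample)
sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
sem = sd / math.sqrt(n)        # SEM = SD / square root of N

print(sd, sem)
```

Because N is in the denominator, a larger sample with the same SD gives a smaller SEM.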

What is a Z-score?

the number of SDs a score of interest lies from the mean
Z = (x - xbar)/SD
x = score
xbar = mean
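The Z-score formula as a small function. The mean of 100 and SD of 15 in the usage lines are made-up values:

```python
def z_score(x, mean, sd):
    """Number of SDs that the score x lies from the mean."""
    return (x - mean) / sd

print(z_score(130, 100, 15))  # score above the mean gives a positive Z
print(z_score(85, 100, 15))   # score below the mean gives a negative Z
```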

What does p<.05 mean?

if the null hypothesis is true, a difference at least this large would occur by chance less than 5 times out of 100
.05 is the significance level

What is a type I error?

known as alpha (usually .05)
reject the null hypothesis when it is true
it is a false positive

What is a type II error?

known as beta (usually .2)
accept the null hypothesis when it is false
it is a false negative

What are inferential statistics & give types of tests used.

explores the relationship b/w variables & the generalizability from sample to population
tests:
t-tests
ANOVA & ANCOVA
linear & logistic regression

characteristics of a positively skewed distribution

mean>median>mode


characteristics of a negatively skewed distribution

mode>median>mean


define parametric tests

DV is I/R
IV is nominal or I/R
uses T-test, ANOVA, ANCOVA or regression

define nonparametric tests

Measurement level of variables is nominal or ordinal
uses Chi-Square Test or crosstabs

True or False
In a normal distribution, the mean, median, and mode are equal. 
True


What is variability?

measures how dispersed the scores are in a distribution
the range is not a reliable measure of it, as extreme scores can distort the range

What is variance?

average of the squared differences from the mean


what is standard deviation?

the square root of the variance (where variance is the average of the squared differences from the mean)
SD is a descriptive statistic that measures the degree of variability in a set of scores 
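A step-by-step sketch of variance and SD on made-up scores. It follows the definition above literally and divides by N (the population form); a sample variance would divide by N - 1 instead:

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example data

mean = sum(scores) / len(scores)
squared_diffs = [(x - mean) ** 2 for x in scores]  # squared differences from the mean
variance = sum(squared_diffs) / len(scores)        # average of the squared differences
sd = math.sqrt(variance)                           # SD is the square root of the variance

print(mean, variance, sd)
```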

What is 1SD from the mean?

34%
if you want the range + or - 1 SD, then it's 68%

What is 2SD from the mean?

47.5%
if you want the range + or - 2 SD, then it's 95%

What is 3SD from the mean?

49.85%
if you want the range + or - 3 SD, then it's 99.7%
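The percentages for 1, 2 and 3 SD can be checked against the normal distribution. This sketch uses the standard identity that the share of a normal distribution within k SDs of the mean is erf(k / sqrt(2)); the function name is made up for illustration:

```python
import math

def within_k_sd(k):
    """Proportion of a normal distribution lying within + or - k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    both_sides = within_k_sd(k) * 100  # + or - k SD
    one_side = both_sides / 2          # mean to k SD on one side only
    print(k, round(both_sides, 2), round(one_side, 2))
```

The memorized figures are rounded: + or - 2 SD is closer to 95.45%, and exactly 95% corresponds to about 1.96 SD.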

What level of data are these:
weight; sex: male/female; Likert scale
weight - ratio
sex: male/female - nominal
Likert scale - ordinal

When do we use the Chi Square test? What kind of data does it use?

We use Chi Squared when we want to know if the expected number differs significantly from the observed number.
uses nominal or ordinal data 

Characteristics of Chi squared

nonparametric test that measures if the expected number differs significantly from the observed
characterized by one parameter called degrees of freedom
measures are independent of each other
always positive
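A sketch of how the Chi-Square statistic compares observed with expected counts. The counts are made up, and the example assumes a simple goodness-of-fit setup with four categories expected to be equally frequent:

```python
observed = [18, 22, 20, 40]  # made-up observed counts per category
expected = [25, 25, 25, 25]  # counts expected if all categories were equally likely

# Sum of (observed - expected)^2 / expected over all categories.
# Every term is a square divided by a positive count, so the total is never negative.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1  # degrees of freedom: number of categories minus 1

print(chi_square, df)
```

The statistic is then compared with the critical value for that number of degrees of freedom.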

Define simple correlation

Pearson's r
relationship between two variables
ranges between -1 and +1
the absolute value of the coefficient reflects the strength of the correlation
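A minimal sketch of Pearson's r computed from its textbook definition (co-variation of X and Y divided by the product of their spreads). The function name and data are made up for illustration:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_variation / (spread_x * spread_y)

r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])    # Y rises with X
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])  # Y falls as X rises
print(r_up, r_down, r_down ** 2)                # r squared is the shared variance
```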

What does r=.8 mean?

strong positive correlation


What does r=.1 mean?

no correlation or a very weak positive one


What is shared variance?

the square of the correlation coefficient
it is a measure of the amount of variance shared by two variables
e.g. r=.20, so the shared variance is r squared, which = .04 or 4%
another way to say it is the IV accounts for 4% of the variance of the DV

assumptions of correlations

can be calculated with all levels of data
sample must represent the population
relationship of X & Y must be linear
assume homoscedasticity

What is homoscedasticity?

variance homogeneity
for every value of X, the distribution of Y scores must have approximately equal variability 

What are the Rules of Thumb for strength of correlations?
e.g. 0.4-0.6 means what?
0.8 to 1.0 (very strong relationship)
0.6 to 0.8 (strong relationship)
0.4 to 0.6 (moderate relationship)
0.2 to 0.4 (weak relationship)
0.0 to 0.2 (no to weak relationship)

Which group of people are we talking about when we use the following terms?
statistic, parameter
statistic -> sample
parameter -> population

Define nonparametric.

categorical or non-normal data
measurement level of variables is nominal or ordinal
Chi-Square Test
Crosstabs

Define parametric.

based on normality of data
DV is I/R
IV is nominal or I/R
T-test, ANOVA and ANCOVA
Regression

If your correlation statistic is -.702, what is the strength of the correlation?

strong negative correlation


How much variance is shared between 2 variables with an r = -.702?

r2 = (-.702) * (-.702) = .493 or 49.3%
so the IV accounts for 49.3% of the variance of the DV, OR the variance shared between the 2 variables is 49.3%

Z=2. What does this mean?

This is a Z-score, the number of SDs the score lies away from the mean.
Z=2 means the score of interest is 2 SD away from the mean

another name for significance.

p-value


What type of error is this or is it the correct decision?
The null hypothesis is true but the test rejects it 
Type I error
you are rejecting the null hypothesis when it's true (false positive)

What type of error is this or is it the correct decision?
The null hypothesis is false but the test accepts it 
Type II error
you are accepting the null hypothesis when it's false (false negative) 

Why use a T-test? What are some characteristics of it?

to find the difference between 2 means in 2 groups
IV is nominal
DV (or outcome) is continuous (ordinal, I/R)
assume 2 mutually exclusive groups
equal variances are assumed

What is the purpose of ANOVA?

to find if a difference between the means of 2 or more groups exists


What is the purpose of a Post-Hoc test?

to identify which groups differ from each other


How many degrees of freedom in a T-test with 24 people in each group?

46
df = (24 people x 2 groups) - 2
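A sketch of an independent T-test worked by hand, assuming equal variances (the pooled-variance form). The function name and data are made up; the degrees of freedom follow the same rule as above, total people minus 2:

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Return (t, df) for an independent T-test with pooled (equal) variances."""
    n1, n2 = len(group_a), len(group_b)
    var1 = statistics.variance(group_a)  # sample variance of group 1
    var2 = statistics.variance(group_b)  # sample variance of group 2
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / std_error
    return t, df

t, df = independent_t([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])
print(t, df)
```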

For a T-test, what is the scale of measurement for the dependent variable?

ordinal or I/R -> treated as continuous


in a T-test, are the distributions of the 2 groups homogeneous?

yes, homogeneity of variance


what are the 2 types of T-tests?

independent - 2 independent groups that are mutually exclusive
paired - one group with 2 measures over time

A Levene's test has the following results:
F=1.153, Sig.=.286. What does this mean?
the results are not significant (Sig. > .05): the variances of the groups do not differ significantly, therefore we assume equal variances.


A Levene's test has the following results:
F=6.228, Sig.=.002. What does this mean?
significant results (Sig. < .05): the variances of the groups differ, therefore equal variances cannot be assumed


In an ANOVA test, what type of data is used for the IV?

nominal (categorical)


In an ANOVA test, what type of data is used for the DV?

ordinal, I/R (continuous)


What is the difference between a One-way and a Two-way ANOVA?

A one-way ANOVA has one IV and one DV.
A two-way ANOVA has two IVs and one DV

What is the purpose of a T-test?

It assesses the difference between the means of 2 groups.


In a T-test, what kind of variable is the IV? the DV?

IV is nominal
DV is ordinal, interval or ratio (continuous) 

Name and define the 2 types of T-tests.

independent T-test - looking at the difference in means of 2 mutually exclusive groups
paired T-test - looking at the difference of the means of one group with 2 measures over time (e.g. pre & post test)

What are the assumptions of a T-test?

IV is categorical (nominal)
2 mutually exclusive groups
normal distribution of the DV
homogeneity of variance (equal variances are assumed, i.e. Levene's test)

Define degrees of freedom.

the number of categories or classes being tested minus 1


What is the purpose of ANOVA?

tests the difference of means among more than 2 groups


What level of data is the IV in an ANOVA study? DV?

IV is nominal
DV is continuous (I/R) 

What are the 'assumptions' of an ANOVA?

same as the T-test
groups are mutually exclusive
normal distribution of DV
homogeneity of variance

Define mean square.

the average amount of variance per degree of freedom


The F-statistic corresponds to what test? The t-statistic corresponds to what test?

F-statistic is the result of ANOVA
t-statistic is the result of a T-test

What does the F-statistic mean in an ANOVA?

in order to reject the null hypothesis, the F-statistic must meet or exceed the critical value (looked up in the table at the back of Munro)


True or False
The F-statistic in an ANOVA can either be positive or negative.
FALSE
The F-statistic is always positive b/c it is a ratio of two mean squares (variances); we are only testing to see if there is a difference b/w groups, not the direction of the difference.

True or False
the ANOVA produces only one critical F-value and it is always positive.
TRUE


True or False
An ANOVA tells you there are differences b/w groups and where they are. 
FALSE
ANOVA only tells you there are differences. A post-hoc test tells you where those differences are.

How is F derived in an ANOVA?

F = MS(between)/MS(within)
where MS=mean square 
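A sketch of how F is built up for a one-way ANOVA: sums of squares are divided by their degrees of freedom to give mean squares, and F is their ratio. The function name and the three groups are made up for illustration:

```python
def one_way_f(groups):
    """Return F = MS(between) / MS(within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the scores around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)

    ms_between = ss_between / df_between  # mean square = sum of squares per degree of freedom
    ms_within = ss_within / df_within
    return ms_between / ms_within

f_stat = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat)
```

Both mean squares are built from squared differences, which is why F cannot be negative.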

What is an ANCOVA test?

it is an ANOVA that controls for an extraneous variable (the covariate).


What are the assumptions of an ANCOVA?

the same as an ANOVA:
groups should be mutually exclusive
homogeneity of variances
DV should be normally distributed
the covariate should be continuous (e.g. age)
2 additional assumptions are:
the covariate & DV must show a linear relationship
direction & strength of the relationship b/w DV and covariate must be similar in each group -> homogeneity of regression

What are the types of regression?

simple - one IV used to predict a DV
multiple - multiple IVs used to predict a DV
logistic - used when DV is categorical (nominal data) in nature

Define the components of this equation:
Y = B0 + B1X1 + E
Y = the predicted score or the DV (outcome)
B0 = constant (Y-intercept on a line graph)
B1 = regression coefficient, representing the amt Y changes when the IV (X) changes by one unit
E = error
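A sketch of how B0 and B1 can be estimated for simple regression using ordinary least squares, the usual way the line of best fit is found. The function name and data are made up; the data here is perfectly linear, so the error term is zero:

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line Y = b0 + b1 * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how much Y changes when X changes by one unit
    b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    # Intercept: chosen so the line passes through (mean of X, mean of Y)
    b0 = mean_y - b1 * mean_x
    return b0, b1

b0, b1 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = b0 + b1 * 5  # predicted Y for a new X of 5
print(b0, b1, predicted)
```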

When do we use regression?

It's used to make predictions:
when the relationship b/w 2 variables is perfectly linear, knowledge of the value of one variable allows you to predict the value of the other variable with accuracy

The regression line is also known as what?

The "line of best fit."


True or False
Regression is the line of best fit therefore it can curve through the scatterplot to create the best fit. 
FALSE
It is the best linear representation of the data 

Characteristics of the regression line.

It is the line of best fit.
The line passes through the exact center of the data on a scatterplot.
The distance b/w the point (value) & the line is the measurement error.
The regression line is the best line with the least amount of error.

True or False
The regression line is the best line with the least amount of error. 
TRUE


In a regression, define the following:
R squared, F-statistic, T-statistic
R squared is shared variance (it is the correlation coefficient squared)
F-statistic is the same as ANOVA, the overall significance of the model.
T-statistic is the significance of each IV

In a regression equation, what does the Y-intercept represent? the Y?

Y-intercept is the constant
Y is what you are solving for, the DV or the outcome

What is the purpose of Logit regression?

It uses MLE (maximum likelihood estimation) to transform the probability of an event occurring into its odds


Define Odds Ratio.

ratio of 2 odds; the odds of the event in one group versus the odds of the event in another group (odds = the probability of the event occurring versus the probability that it will not occur)


What is a cohort study?
What is a case control study? 
both are epidemiological studies.
The cohort studies look at relative risk, and work from treatment to outcome.
The case control studies look at the odds ratio and work from outcome to treatment.

Which study does not obtain relative risk directly, cohort or case control studies?

case control studies
case control studies obtain the odds ratio which is then used to estimate the relative risk. It tends to overestimate it. 

True or False
Logistic regression is only used in case control studies to produce an odds ratio.
FALSE
It is used in both cohort and case control studies. It produces an odds ratio which is often interpreted as relative risk. 

Define odds.

another way of presenting probability
the probability of occurrence over the probability of nonoccurrence 

Define odds ratio

comparing the odds of 2 groups
e.g. the odds of rolling a six if female
group 1 - odds of rolling a 6
group 2 - odds of being female

Define probability, odds, and odds ratio.

probability  measure of likelihood of an event happening
odds - the probability of occurrence over the probability of nonoccurrence
odds ratio - comparing the odds of 2 groups

Which study uses odds ratio, cohort or case control?

case control


why use odds ratio?

provides an estimate of the relationship b/w binary variables (coded 1 & 0, i.e. nominal data)


What is relative risk?

the risk given one condition versus the risk given the other condition.
A more direct method of calculating 'odds' (for lack of a better word) 

True or False
the odds ratio is an accurate estimate of relative risk. 
FALSE
it is at least equivalent to but often overestimates relative risk. 
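A sketch comparing relative risk with the odds ratio on the same made-up counts (30 of 100 exposed people and 10 of 100 unexposed people have the outcome; all numbers and names are hypothetical):

```python
def odds(probability):
    """Probability of occurrence over probability of nonoccurrence."""
    return probability / (1 - probability)

# Hypothetical counts
exposed_events, exposed_total = 30, 100
unexposed_events, unexposed_total = 10, 100

risk_exposed = exposed_events / exposed_total        # probability of outcome if exposed
risk_unexposed = unexposed_events / unexposed_total  # probability of outcome if unexposed

relative_risk = risk_exposed / risk_unexposed            # ratio of the two risks
odds_ratio = odds(risk_exposed) / odds(risk_unexposed)   # ratio of the two odds

print(relative_risk, odds_ratio)
```

With these counts the odds ratio comes out larger than the relative risk, which illustrates the overestimate. The two get closer when the outcome is rare in both groups.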

Interpret this.
OR = 2.53, 95% CI: 1.66 - 3.55
positive (95% CI: 1.66 - 3.55)
about two and a half times more likely to have the outcome
significant b/c the CI doesn't include 1.00 (1.00 would mean equal odds)

Interpret this.
OR = 0.60, 95% CI: 0.26 - 0.92
negative (95% CI: 0.26 - 0.92)
40% less likely to have the outcome (1 - 0.60)
significant b/c the CI doesn't include 1.00 (1.00 would mean equal odds)
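The rule used in both interpretations can be written as a small helper. This is only a sketch of the decision rule; the function name and labels are made up:

```python
def interpret_odds_ratio(odds_ratio, ci_low, ci_high):
    """Return (direction, significant) for an odds ratio and its 95% CI."""
    # An OR of 1.00 means equal odds, so a CI that contains 1.00 is not significant
    significant = not (ci_low <= 1.0 <= ci_high)
    if odds_ratio > 1:
        direction = "higher odds"
    elif odds_ratio < 1:
        direction = "lower odds"
    else:
        direction = "equal odds"
    return direction, significant

print(interpret_odds_ratio(2.53, 1.66, 3.55))
print(interpret_odds_ratio(0.60, 0.26, 0.92))
```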

In logit, what does the Hosmer and Lemeshow Test tell us?

That is the 'goodness of fit' test. If the test is not significant (p>.05), then the data fits the model.


True or False
A raw score tends to overestimate an adjusted score. 
TRUE


What is ethnography?

study of culture
Describes & analyzes aspects of the ways of life of a particular culture, subculture, or subcultural group

4 types of qualitative research designs.

phenomenology
grounded theory
ethnography
historical research

What is phenomenology?

to describe the 'lived experience' of study participants


What is grounded theory?

how people deal with a phenomenon over time
explores social processes with the goal of developing a theory 

What is historical research?

examines events of the past
