53 Cards in this Set
- Front
- Back
how are conditional distributions calculated |
cell count divided by its column total (or by its row total, when conditioning on the row variable) |
|
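A minimal sketch of the card above, assuming the two-way table is stored as a list of rows and that we condition on the column variable. The function name and the counts are made up for illustration.

```python
def conditional_distribution_by_column(table):
    """Divide each cell count by the total of the column it sits in."""
    col_totals = [sum(col) for col in zip(*table)]
    return [
        [cell / total for cell, total in zip(row, col_totals)]
        for row in table
    ]


# Hypothetical counts: rows are response categories, columns are groups.
counts = [[10, 30],
          [30, 30]]
dist = conditional_distribution_by_column(counts)
# Column 1: 10/40 and 30/40; column 2: 30/60 and 30/60.
# Each column of `dist` sums to 1.
```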
what do you need to do the chi squared test |
a random sample; data classified into categories; at least 80% of expected counts are >= 5 and none are < 1 |
|
how do you find expected counts |
(row total x column total)/Grand Total |
|
for a 2x2 table, what do all expected counts have to be |
at least 5 (>= 5) |
|
how do you calculate the chi squared test statistic |
X^2 = sum over all cells of ((observed - expected)^2 / expected) |
|
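The two cards above (expected counts, then the chi squared sum) can be sketched together as follows. This is only an illustration: the function names and the 2x2 table are assumptions, not part of the card set.

```python
def expected_counts(observed):
    """Expected count per cell: (row total x column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared(observed):
    """Sum of (observed - expected)^2 / expected over every cell."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 2x2 table of observed counts (made-up numbers).
table = [[30, 20],
         [20, 30]]
expected = expected_counts(table)  # every cell: 50 * 50 / 100 = 25.0
statistic = chi_squared(table)     # four cells, each 5^2 / 25 = 1.0
```

In this example every expected count is 25, so the 2x2 condition on expected counts is met.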
cautions of chi squared test |
poor sampling makes the results unreliable; the test cannot confirm cause and effect |
|
what are the three ways that chi squared test can collect data |
single population; multiple populations; experiment |
|
what are Ho and Ha of the single population chi squared test |
Ho: no association; Ha: there is an association |
|
what does the chi squared test measure |
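how far the observed counts are from the counts expected if there were no association between the two variables (evidence that an association exists, not how strong it is) |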
|
what are Ho and Ha of the multiple populations chi squared test |
Ho: no association (the distribution is the same in all populations); Ha: at least one population differs |
|
what are Ho and Ha of the experiment chi squared test |
Ho: treatment distributions are the same; Ha: at least one differs |
|
does the chi squared calculation differ between single population, multiple populations and experiments |
no, the calculation is the same; only the hypotheses are worded differently |
|
how to calculate degrees of freedom |
(r-1)(c-1), where r = number of rows and c = number of columns |
|
what are the three principles of experiment design |
control (of lurking variables); randomization; replication (repeatable) |
|
what to look for when interpreting scatterplots |
form: linear, curved, cluster, no pattern; direction: +ve, -ve, no direction; strength; outliers |
|
what is a correlation coefficient |
a number r that describes the direction and strength of the linear relation between two quantitative variables |
|
between what values do correlation coefficients lie |
-1 = perfect negative linear relation; 1 = perfect positive linear relation; 0 = no linear association |
|
what condition must be met to use a regression line |
the relationship must be linear |
|
what is a regression line |
best fit line |
|
what form does a regression line take |
y(hat) = a + bx |
|
what is a in the regression line |
a = y-intercept; a = y(bar) - b x(bar) |
|
what is b in the regression line |
b = slope; b = r (Sy / Sx) |
|
what point does the regression line always pass through |
(x(bar), y(bar)) |
|
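As a rough sketch of the slope, intercept and "always passes through" cards above: given summary statistics, the slope is r times the ratio of standard deviations, and the intercept is whatever makes the line pass through the point of means. The function name and all the numbers below are hypothetical.

```python
def regression_from_summary(r, s_x, s_y, x_bar, y_bar):
    """Return (a, b) for the line y = a + b*x from summary statistics."""
    b = r * (s_y / s_x)    # slope
    a = y_bar - b * x_bar  # intercept, so the line passes through the means
    return a, b


# Hypothetical summary statistics (not taken from any real data set).
a, b = regression_from_summary(r=0.5, s_x=2.0, s_y=4.0, x_bar=10.0, y_bar=20.0)
prediction_at_mean = a + b * 10.0  # predicted y at x = x(bar) equals y(bar)
```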
what is the regression line used for |
to predict y from x, and to see the difference between what we expect (the prediction) and what we obtained (the observation) |
|
what is the symbol for coefficient of determination |
r^2 |
|
How do you test for statistical significance |
1) come up with null/alternative hypotheses; 2) calculate r from the data; 3) compare r to the decision point (critical value): if abs(r) > decision point, reject Ho |
|
What is Ho |
null hypothesis aka no relation |
|
how to calculate r for statistical significance |
r = (1/(n-1)) x sum of ((Xi - x(bar))/Sx) x ((Yi - y(bar))/Sy), where Sx and Sy = standard deviations of x and y |
|
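The formula on the card above can be written out directly. This is a small sketch using Python's standard `statistics` module; the function name and the data are made up for illustration.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """r = 1/(n-1) * sum of (standardized x) * (standardized y)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    return sum(
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    ) / (n - 1)


# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)

# A perfectly linear increasing relation gives r = 1 (up to rounding).
r_perfect = correlation([1, 2, 3, 4], [3, 5, 7, 9])
```

The resulting abs(r) would then be compared with a decision point taken from a table for the chosen significance level and sample size.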
what to look for in a regression analysis |
outliers; linearity - must be linear; homoscedasticity - residual plot shows constant spread with no pattern; normality - residuals follow a normal distribution |
|
what are residuals |
the vertical distance between each point and the regression line (observed y - predicted y) |
|
what do residuals sum to |
0 |
|
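A quick numerical check of the card above, assuming the line is fitted by least squares (the usual best fit line). The slope here is computed from sums of deviations, which is algebraically the same as r (Sy / Sx). The function name and data are hypothetical.

```python
from statistics import mean


def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    a = y_bar - b * x_bar
    return a, b


# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares(xs, ys)

# Residual = observed y minus predicted y for each point.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
total = sum(residuals)  # zero, apart from floating-point rounding
```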
what is the purpose of a residual plot |
to magnify the errors and check for patterns (a good plot shows no pattern); it is only used for checking |
|
what are some qualities of side by side barcharts |
each set of bars = 100%; x = explanatory variable; y = response variable |
|
conditions to evaluate whether an association is causal |
temporality - cause has to come before effect; plausibility - has to make sense; consistency; strength - association is strong; dose-response - higher doses associated with stronger responses |
|
what are the methods of assigning probabilities |
classical method; relative frequency (empirical) method; subjective method |
|
what is the classical method of probability |
based on assumption of equally likely outcomes |
|
what is the relative frequency (empirical) method |
based on experiment or historical data |
|
what is the subjective method |
based on judgement |
|
what is the sample space |
list of all possible outcomes |
|
what must the probabilities of all outcomes in the sample space sum to |
1 |
|
what is a continuous random variable |
can take any value in an interval on the real line, e.g. time, mass |
|
what is a discrete random variable |
takes only countable, separate values (typically whole numbers) |
|
what is central limit theorem |
with a large enough sample size n, the sampling distribution of x(bar) is approximately normal and centered at the population mean, whatever the shape of the population |
|
what does a larger sample size help do |
allows the central limit theorem to work; helps overcome extreme skewness |
|
what happens as the sample size increases |
the sampling distribution becomes more normal |
|
how to find sample proportion |
p(hat) = count of successes in sample / total number of observations (n); a success = an observation that meets the desired criterion |
|
how to find standard error of mean |
sd / root(n), where sd = sample standard deviation and n = sample size |
|
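A short sketch of the standard error card above: the sample standard deviation divided by root(n). The function name and the sample values are invented for illustration.

```python
from math import sqrt
from statistics import stdev


def standard_error_of_mean(sample):
    """Sample standard deviation divided by the square root of n."""
    return stdev(sample) / sqrt(len(sample))


# Hypothetical sample of 8 measurements.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
se = standard_error_of_mean(sample)
# The sample standard deviation is sqrt(32/7), so se = sqrt(32/7) / sqrt(8).
```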
how many successes and failures must the sample contain to make a confidence interval for p |
at least 15 of each |
|
what is the theoretical sampling distribution of p(hat) |
distribution of all possible sample proportions from samples of the same size taken from the same population |
|
what is the approximate sampling distribution of p(hat) |
approximate distribution of the sample proportions obtained by repeating the sampling many times |
|
what happens to sampling distribution as n goes up |
sd goes down but center does not change |
|
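The card above can be illustrated with a small simulation. This is only a sketch: the population proportion, sample sizes, number of repetitions and function name are all arbitrary assumptions.

```python
import random
from statistics import mean, stdev


def simulate_sample_proportions(p, n, repetitions, rng):
    """Draw `repetitions` samples of size n and return each sample's p(hat)."""
    return [
        sum(rng.random() < p for _ in range(n)) / n
        for _ in range(repetitions)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
p = 0.3                 # hypothetical population proportion
small = simulate_sample_proportions(p, n=25, repetitions=2000, rng=rng)
large = simulate_sample_proportions(p, n=400, repetitions=2000, rng=rng)

# The center stays close to p for both sample sizes...
center_small, center_large = mean(small), mean(large)
# ...while the spread (sd) of p(hat) shrinks as n grows.
spread_small, spread_large = stdev(small), stdev(large)
```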
what happens to the shape of the sampling distribution as sample size goes up |
it becomes more normal |
|
what conditions must be met to use the central limit theorem for proportions |
np must be greater than or equal to 10 and n(1 - p) must be greater than or equal to 10 |
|
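The condition on the last card is easy to check in code. A minimal sketch, with a hypothetical function name and made-up values of n and p:

```python
def clt_ok_for_proportion(n, p, threshold=10):
    """True when both n*p and n*(1 - p) are at least the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold


# Hypothetical cases.
large_enough = clt_ok_for_proportion(n=100, p=0.5)  # 50 and 50: condition met
too_small = clt_ok_for_proportion(n=40, p=0.125)    # n*p = 5: condition fails
```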