60 Cards in this Set
 Front
 Back
regression predicts what?

causation

correlation shows what?

association, which is not causation

Regression formulas
SST (total sum of squares) = ?

SST = SSR + SSE = sum of (y - ybar)^2

Regression
sum of squares 
S = sum of all squared vertical deviations from the proposed line
S = e1^2 + e2^2 + e3^2 + ...
Regression
the best straight line is the one that...?
minimizes S
the smallest S gives the best line
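The idea on this card can be sketched in a few lines of Python (the data points and candidate lines below are made up purely for illustration):

```python
# Sum of squared errors S for a candidate line y = a + b*x.
def sum_squared_errors(a, b, xs, ys):
    # S = e1^2 + e2^2 + ... where each e_i is the vertical deviation y_i - (a + b*x_i)
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data points, lying roughly on the line y = 2x
xs = [1, 2, 3, 4]
ys = [2.1, 3.9, 6.2, 7.8]

s_good = sum_squared_errors(0.0, 2.0, xs, ys)  # line y = 2x hugs the data
s_bad = sum_squared_errors(0.0, 1.0, xs, ys)   # line y = x misses badly
# least squares picks the line with the smaller S
```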
A least squares regression selects the line with?

the lowest sum of squared errors

Regression
The coefficient of determination is? The higher its value, the more...?
R^2
higher value = more accurate
R^2 measures what?

strength of the relationship between the IDV (independent variables) and the DV (dependent variable)

Regression
Standard Error equation 
SE = square root of (SSE / (N - K))
N = # of observations in the sample, K = # of independent variables (IDV)
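A minimal Python sketch of the card's formula (the SSE, N, and K values are hypothetical; note some texts divide by N - K - 1 to also count the intercept):

```python
import math

# SE = sqrt(SSE / (N - K)) per the card; N = # observations, K = # IDV.
def standard_error(sse, n, k):
    return math.sqrt(sse / (n - k))

se = standard_error(50.0, 27, 2)  # hypothetical: sqrt(50 / 25) = sqrt(2)
```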
Regression
Interpolation 
prediction using a value of the IDV within the observed range
uncontroversial
Regression
Extrapolation 
prediction using a value of the IDV outside the observed range
should be avoided, if possible
Simple Linear Regression
IDV AKA? DV AKA?
IDV = predictor variable
DV = response variable
Sig Testings in Regression
explain the different tests involved
F-test: judges whether the explanatory (independent) variables adequately describe the outcome variable
t-test: applies to an individual IDV (explanatory variable); says whether that particular variable has an effect on the outcome, holding the others the same
R^2: measures the strength of the relationship between the IDV and DV
in simple linear regression
R^2 = ?
r^2
r = correlation coefficient; R^2 = coefficient of determination
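A pure-Python sketch showing that in simple linear regression R^2 is the square of Pearson's r (the data are made up and perfectly linear, so r comes out to 1):

```python
# Pearson correlation coefficient r, computed from scratch.
def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]      # perfectly linear, so r = 1
r = pearson_r(xs, ys)
r_squared = r ** 2         # = R^2 in simple linear regression
```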
Multiple Linear Regression
E = ? B0, B1, ..., Bn = ?
E = statistical error in the population; residual in the sample
B's = unknown parameters
Multiple Regression
3 ways to do it, what are they? 
1) Backward multiple regression
2) Forward multiple regression
3) Mix of backward/forward multiple regression
backward multiple regression
AKA? How to do it? How to evaluate the individual relationship?
AKA backward elimination
drop the least significant variables one at a time, until left with only significant variables; a t-test evaluates the individual relationship
forward multiple regression

pick the IDV that explains the most variation in the DV, then the next, one at a time, until no remaining variable significantly explains variation

mixed of backwards and forward

do forward selection first, but drop variables that become no longer significant after the introduction of new variables
not used much
Multiple Regression
Dummy variable AKA? What does it do?
AKA indicator variable
introduces qualitative information by giving values of 0 or 1 to indicate absence or presence of categorical info; e.g., female = 1 when the patient is female, female = 0 when the patient is male
Assumptions for Multiple Linear Regresions

normal distribution
the variance about the regression line is the same for all values of the explanatory variables
the explanatory variables (IDV) are not correlated
Nonlinear functions can be fit as?

regressions
e.g., logarithmic, exponential
what is multicollinearity

a problem in the interpretation of regression coefficients when the IDV are correlated

what is collinearity

exists when IDV are correlated

detecting multicollinearity

checking the correlation coefficient matrix
F-test significant with many insignificant t-tests
VIF > 5
what is VIF

variance inflation factor
quantifies the severity of multicollinearity in ordinary least squares regression; gives how much the variance of an estimated regression coefficient is increased because of collinearity
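The standard VIF relationship can be sketched directly; the R^2_j values below are hypothetical:

```python
# VIF = 1 / (1 - R^2_j), where R^2_j is obtained by regressing predictor j
# on all the other predictors.
def vif(r_squared_j):
    return 1.0 / (1.0 - r_squared_j)

high = vif(0.9)   # a predictor 90% explained by the others -> VIF = 10, above the 5 cutoff
low = vif(0.2)    # little collinearity -> VIF = 1.25
```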
how to correct for multicollinearity

pick IDV with low collinearity
use stepwise regression, where you put the most correlated variable in the equation first, then the next most; the order of entry of variables matters here
what is adjusted R^2

adjusts for inflation in R^2 caused by the number of variables in the equation
as sample size increases (>20 cases per variable), the adjustment is less needed; basically, go with adjusted R^2 unless sample size > IDV x 20
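A sketch of one common adjusted-R^2 formula, 1 - (1 - R^2)(N - 1)/(N - K - 1); this is a standard adjustment, not necessarily the exact one the card's source used, and the numbers are made up:

```python
# Adjusted R^2 penalizes adding predictors: n = sample size, k = # of IDV.
def adjusted_r_squared(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

adj = adjusted_r_squared(0.80, 25, 4)  # 1 - 0.2 * 24 / 20 = 0.76, below the raw 0.80
```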
logistic regression
define AKA 
multivariate regression that uses maximum likelihood estimation to see the relationship between a CATEGORICAL dependent variable (Y) and multiple IDV (X)
AKA logit analysis
Logistic transformation

event occurrence (No, Yes) -> probability P (0 to 1) -> odds (0 to +inf)
odds = P / (1 - P) for each value; then take log(odds), which ranges from -inf to +inf
types of logistic regression

simple
multiple
multinomial
simple logistic regression
application? DV? IDV?
relationship between a single IDV (continuous or categorical) and a single DV, usually a binary variable
example: DV: yes/no; IDV: yes/no
Multiple Logistic Regression
application? DV? IDV?
relationship between 2 or more IDV (continuous or categorical) and a single DV, usually binary
DV: stroke (yes/no); IDV: age, HTN, diabetes, gender
Multinominal Logistic Regression
application? IDV? DV?
relationship between 2 or more IDV (continuous or categorical) and a single CATEGORICAL DV with more than 2 possible choices
DV: HTN (bad/mod/ok); IDV: age, race, meds, gender
AKA polytomous logistic regression
define
Odds Odds ratio(OR) 
Odds: ratio of the probability of success to the probability of failure = P / (1 - P)
Odds ratio (OR): ratio of the odds that an event occurs in one group to the odds in the other group; OR = Odds_grp1 / Odds_grp2
What do each of these mean
OR=1 OR>1 OR<1 
OR = 1: no association
OR > 1: positive association of grp1 and grp2
OR < 1: negative association of grp1 and grp2
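A quick sketch of computing an OR from two group probabilities (the probabilities below are hypothetical):

```python
# OR = odds_grp1 / odds_grp2, with odds = P / (1 - P).
def odds(p):
    return p / (1 - p)

def odds_ratio(p1, p2):
    return odds(p1) / odds(p2)

or_pos = odds_ratio(0.4, 0.2)   # > 1: positive association
or_none = odds_ratio(0.3, 0.3)  # = 1: no association
```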
Relative Risk
define AKA 
ratio of the probability that a member of the exposed group will develop the disease to the probability that a member of the unexposed group develops the same disease
RR = P(disease | exposed) / P(disease | unexposed)
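The card's RR formula with made-up cohort numbers:

```python
# RR = P(disease | exposed) / P(disease | unexposed).
def relative_risk(p_exposed, p_unexposed):
    return p_exposed / p_unexposed

# Hypothetical cohort: 30% of exposed and 10% of unexposed develop the disease.
rr = relative_risk(0.30, 0.10)  # RR = 3: the exposed group has 3x the risk
```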
parametric stats
what class of stats? assumptions? 
it's inferential
assumptions: normally distributed; estimation of at least 1 parameter; at least interval measures; mean = mode = median
example of parametric stats

Pearson correlation coefficient (r)
unpaired/paired t-test
ANOVA
regression
Non parametric stats
assumptions? 
distribution-free stats
second class of inferential stats; usually nominal or ordinal data, but CAN be continuous (interval + ratio)
nonparametric stats
examples? 
chi-square
sign test
Fisher exact test
sign test
define 
tests the equality of the medians of 2 comparative groups
the simplest nonparametric test; used for a quick look at data before parametric tests
Sign Test
data? assumption? computation?
paired data (DV)
assumption: each paired difference is meaningful
computation: 1) take the difference of each paired data point; 2) discard any zeros; 3) apply the sign test; 4) hypothesis testing
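The first steps of the computation can be sketched as follows (the paired data are made up):

```python
# Sign-test preprocessing: take paired differences, discard zeros,
# then count positive vs. negative signs for the test itself.
def sign_counts(pairs):
    diffs = [x - y for x, y in pairs]      # 1) difference of each pair
    diffs = [d for d in diffs if d != 0]   # 2) discard any zeros
    plus = sum(1 for d in diffs if d > 0)
    minus = sum(1 for d in diffs if d < 0)
    return plus, minus                     # 3) these counts feed the sign test

pairs = [(5, 3), (4, 4), (2, 6), (7, 1), (3, 2)]
plus, minus = sign_counts(pairs)  # diffs: +2, (0 dropped), -4, +6, +1
```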
chi square stats
define 
nonparametric test to see if a relationship exists between categorical variables
e.g., gender, insurance type, satisfaction
chi square tests?

"goodness of fit" between observation and a theoretical distribution
values range from 0 to infinity
chi square hypothesis

H0: data follow the specified distribution
Ha: data do not follow the specified distribution
chi square
data? assumption? computation? 
DV: nominal or ordinal
assumption: data are independent random samples from the population
computation: construct an R x C contingency table; compute the expected values; apply the formula; hypothesis testing
Chi Square
fe= 
fe = (fr * fc) / N
fr = row total, fc = column total, N = # of subjects
df = (C - 1) * (R - 1)
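The expected-frequency and df formulas applied to a small contingency table (the 2x2 counts are hypothetical):

```python
# fe = (row total * column total) / N for each cell; df = (C - 1) * (R - 1).
table = [[10, 20],
         [30, 40]]

n = sum(sum(row) for row in table)               # N = total # of subjects
row_totals = [sum(row) for row in table]         # [30, 70]
col_totals = [sum(col) for col in zip(*table)]   # [40, 60]

expected = [[fr * fc / n for fc in col_totals] for fr in row_totals]
df = (len(col_totals) - 1) * (len(table) - 1)    # (2 - 1) * (2 - 1) = 1
```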
main diff in descriptive and inferential stats?

inferential stats include hypothesis testing

define correlation

interrelationship between 2 CONTINUOUS variables (interval and ratio)

Assumptions of correlation

1) normality: normal distribution
2) linearity: linear relationship
3) homoscedasticity: error (residual) variances in the model are identically distributed
what statistic to test homoscedascity?

F-test

correlation coefficient
what does it indicate ranges from? 
r
indicates the direction and strength of the correlation; ranges from -1.0 to +1.0
pearson correlation refers to
spearman correlation refers to 
Pearson: simple correlation (continuous variables)
Spearman: alternative to Pearson for continuous variables without a normal distribution (nonparametric)
what is type 3 error

correctly rejecting the null, but for the wrong reason

correlation
null hypothesis? alt? how to test 
H0: the correlation between the two is 0 (uncorrelated)
Ha: the correlation is nonzero
to test whether r is significantly different from 0, we use a t-test for Pearson's r
correlation
how to calc df 
degrees of freedom is N - 2
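The t-test for Pearson's r can be sketched with the standard formula t = r * sqrt(N - 2) / sqrt(1 - r^2), which uses exactly these df = N - 2 (the r and N values are hypothetical):

```python
import math

# t statistic for testing whether Pearson's r differs from 0, with df = N - 2.
def t_for_r(r, n):
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

t = t_for_r(0.6, 27)   # df = 25; t = 0.6 * 5 / 0.8 = 3.75
```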

factors that influence the correlation

1) correlation coefficient (r): the closer r is to -1 or +1, the greater the chance of significance
2) sample size: larger samples, greater chance of significance
3) linearity: correlation only exists in a linear relationship
correlation CANNOT be equated with?

causation

correlation coefficient ranges

0 to 0.2: very low
0.2 to 0.4: low
0.4 to 0.6: moderate
0.6 to 0.8: highly moderate
0.8 to 1: very high