60 Cards in this Set

  • Front
  • Back
regression predicts what?
causation
correlation shows what?
association, which is not causation
Regression formulas
SST (total sum of squares) = ?
SST = SSR + SSE = sum of (y − ȳ)²
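As a quick sketch of the card above, the decomposition can be checked in plain Python. The data are hypothetical; the fitted values come from an ordinary least-squares line, since the identity SST = SSR + SSE only holds for a least-squares fit.

```python
# Check SST = SSR + SSE on hypothetical data.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 7.0]

x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# Least-squares slope and intercept
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
     / sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                # total
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)            # regression (explained)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))   # error (residual)

print(sst, ssr, sse)  # SST equals SSR + SSE
```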
Regression
sum of squares
S = sum of all vertical deviations from the proposed line, squared
S = e1^2 + e2^2 + ... + en^2
Regression
the best straight line is the one that?
minimizes S
the smallest S gives the best line
A least squares regression selects line with?
the lowest sum of squared errors
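A minimal sketch of the criterion on the cards above, with hypothetical data: score a few candidate lines by their sum of squared errors S, and the least-squares line comes out with the smallest S.

```python
# The least-squares criterion: among candidate lines, the best one
# minimizes S, the sum of squared vertical deviations (hypothetical data).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 7.0]

def s(b0, b1):
    """Sum of squared errors S = e1^2 + e2^2 + ... for the line y = b0 + b1*x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# A few candidate (intercept, slope) pairs; the last is the least-squares fit
candidates = [(0.0, 2.0), (1.0, 1.0), (0.5, 1.6)]
scores = {c: s(*c) for c in candidates}
best = min(scores, key=scores.get)
print(best, scores[best])  # the least-squares line has the smallest S
```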
Regression
The coefficient of determination is?

the higher its value the more?
R^2

higher value=more accurate
R^2 measures what?
relationship between the IDVs and the DV
Regression
Standard Error equation
√(SSE / (N − K))

N=# obs in sample
K=number of INDV
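The card's formula as a one-liner, with hypothetical values for SSE, N, and K. Note some texts divide by N − K − 1 to account for the intercept; the sketch follows the card's N − K.

```python
import math

# Standard error of the regression, per the card:
# SE = sqrt(SSE / (N - K)), N = # observations, K = # independent variables.
sse = 0.2   # residual sum of squares from a fitted model (hypothetical)
n = 4       # observations
k = 1       # one independent variable (simple regression)

se = math.sqrt(sse / (n - k))
print(round(se, 4))
```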
Regression
Interpolation
prediction using value of IDV within observed range

uncontroversial
Regression
Extrapolation
prediction using a value of the IDV outside the observed range

should be avoided, if possible
Simple Linear Regression

IDV AKA

DV AKA
IDV = predictor variable

DV = response variable
Sig Testings in Regression

explain the diff tests involved
F-test: judges whether the explanatory variables (IDVs) adequately describe the outcome variable

T-test: applies to each individual IDV (explanatory variable); says whether that particular variable has an effect on the outcome, holding the others constant

R²: measures the strength of the relationship between the IDVs and the DV
in simple linear regression

R^2=?
r^2

r = correlation coefficient

R^2-coefficient of determination
Multiple Linear Regression
E=

B0, B1, ..., Bn =
statistical error in the population

residual in the sample

B's = unknown parameters
Multiple Regression
3 ways to do it, what are they?
1)Backward multiple regression
2)Forward multiple regression
3)Mix of back/forward multiple regression
backward multiple regression
AKA?
How to do it?

how to evaluate individual relationship
AKA backward elimination

drop the least significant variables one at a time, until left with only significant variables

the t-test evaluates the individual relationship
forward multiple regression
pick the IDV that explains the most variation in the DV, then the next one, one at a time, until no remaining variable significantly explains variation
mix of backward and forward
do forward selection first, but drop variables that are no longer significant after the introduction of new variables

not used much
Multiple Regression
Dummy Variable

aka?
what does it do?
indicator variable

introduces qualitative information
- gives values of 0 or 1 to indicate the absence or presence of categorical info

female = 1 when pt is female
female = 0 when pt is male
Assumptions for Multiple Linear Regressions
-normal distribution
-variance of regression line is same for all values of explanatory variables
-explanatory variables (IDV) are not correlated
Nonlinear functions can be fit as?
regressions
-logarithmic, exponential
what is multicollinearity
problem in interpretation of regression coefficients when IDV are correlated
what is collinearity
exists when IDV are correlated
detecting multicollinearity
checking the correlation coefficient matrix

F-test significant with many insignificant t-tests

VIF > 5
what is VIF
variance inflation factor

quantifies the severity of multicollinearity in ordinary least squares regression

gives how much the variance of an estimated regression coefficient is increased because of collinearity
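A sketch of the usual VIF formula, VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing one IDV on the others (the R² value below is hypothetical):

```python
# VIF from the R^2 of regressing one independent variable on the rest:
# VIF_j = 1 / (1 - R_j^2). R_j^2 here is a hypothetical value.
def vif(r_squared_j):
    return 1.0 / (1.0 - r_squared_j)

print(round(vif(0.85), 2))  # exceeds the card's VIF > 5 cutoff
```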
how to correct for multicollinearity
pick IDVs with low collinearity

use stepwise regression, where you put the most correlated variable in the equation first, then the next most

order of entry of variables matters here
what is adjusted R^2
adjusts for inflation in R² caused by the number of variables in the equation

as sample size increases (>20 cases per variable), the adjustment is less needed

basically go with adjusted R² unless sample size > IDVs × 20
logistic regression
define

AKA
multivariate regression that uses maximum likelihood estimation to see the relationship between a CATEGORICAL dependent variable (Y) and multiple IDVs (X)

AKA logit analysis
Logistic transformation
event occurrence (NO, YES)
---> PROB (0 ... 1)
-------> ODDS (0 ... +INF)

do P/(1 − P) for each value = odds
then take log(odds), mapping (0, +INF)
to (−INF, +INF)
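The transformation chain on the card, probability → odds → log-odds, sketched with a few hypothetical probabilities:

```python
import math

# probability (0..1) -> odds (0..+inf) -> log-odds (-inf..+inf)
def odds(p):
    return p / (1.0 - p)

def log_odds(p):
    return math.log(odds(p))

for p in (0.1, 0.5, 0.9):
    print(p, round(odds(p), 3), round(log_odds(p), 3))
# p = 0.5 gives odds 1 and log-odds 0; p < 0.5 gives negative log-odds
```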
types of logistic regression
simple
multiple
multinomial
simple logistic regression
application
DV
ADV?
relationship between a single IDV (continuous or categorical) and a single DV, usually a binary variable

example: DV: yes/no
IDV: yes/no
Multiple Logistic Regression
application
DV:
IDV
relationship between 2 or more IDVs (continuous or categorical) and a single DV, usually binary

DV: stroke (yes/no)
IDV: age, HTN, diabetes, gender
Multinomial Logistic Regression
application
IDV
DV
relationship between 2 or more IDVs (continuous or categorical) and a single CATEGORICAL DV with more than 2 possible choices

DV: HTN (bad/mod/ok)
IDV: age, race, meds, gender

AKA polytomous logistic regression
define
Odds

Odds ratio(OR)
Odds: ratio of the probability of success to the probability of failure
= P/(1 − P)

Odds ratio (OR): ratio of the odds of an event occurring in one group to the odds in the other
OR = Odds(grp1)/Odds(grp2)
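A sketch of the two definitions on the card, using hypothetical event probabilities for each group:

```python
# Odds ratio: OR = odds(group 1) / odds(group 2), where odds = P / (1 - P).
def odds(p):
    return p / (1.0 - p)

p_group1 = 0.40   # hypothetical event probability in group 1
p_group2 = 0.25   # hypothetical event probability in group 2

odds_ratio = odds(p_group1) / odds(p_group2)
print(round(odds_ratio, 2))  # OR = 2.0 -> positive association
```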
What do each of these mean
OR=1
OR>1
OR<1
OR = 1: no association

OR > 1: positive association between grp1 and grp2

OR < 1: negative association between grp1 and grp2
Relative Risk
define

AKA
probability that a member of the exposed group will develop the disease relative to the probability that a member of the unexposed group develops the same disease

RR = P(Dis|exp)/P(Dis|unexp)

AKA risk ratio
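The RR formula on the card, computed from a hypothetical exposed/unexposed table:

```python
# Relative risk: RR = P(disease | exposed) / P(disease | unexposed).
# Counts below are hypothetical 2x2-table data.
exposed_cases, exposed_total = 30, 100
unexposed_cases, unexposed_total = 10, 100

rr = (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)
print(round(rr, 2))  # the exposed group is ~3 times as likely to develop disease
```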
parametric stats

what class of stats?
assumptions?
it's inferential

- normally distributed
- estimation of at least 1 parameter
- at least 1 interval measure

mean = mode = median
example of parametric stats
Pearson correlation coefficient (r)

unpaired/paired t test
ANOVA
Regression
Non parametric stats
assumptions?
distribution-free stats
second class of inferential stats

usually nominal or ordinal
but CAN be continuous (interval + ratio)
nonparametric stats
examples?
chi-square
sign test
fisher exact test
sign test
define
tests the equality of the medians of 2 comparative groups

simplest nonparametric test

used for a quick look at data before parametric tests
Sign Test
data?
assumption

computation?
paired data (DV)

assumption: each paired difference is meaningful

computation:
1- take the difference of each paired data point
2- discard any zeros
3- apply the sign test
4- hypothesis testing
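The computation steps on the card, sketched over hypothetical paired before/after scores:

```python
# Sign-test steps: difference each pair, drop zeros, count signs.
before = [140, 135, 150, 142, 138, 145]  # hypothetical paired data
after  = [135, 135, 144, 140, 139, 141]

diffs = [b - a for b, a in zip(before, after)]   # 1) paired differences
diffs = [d for d in diffs if d != 0]             # 2) discard zeros
plus  = sum(1 for d in diffs if d > 0)           # 3) count the signs
minus = sum(1 for d in diffs if d < 0)
print(plus, minus)  # 4) these counts feed a test of H0: median difference = 0
```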
chi square stats
define
nonparametric test to see if a relationship exists between categorical variables

e.g., gender, insurance type, satisfaction
chi square tests?
"goodness of fit" btwn observation and theorretical distribution

values of 0 to infinity
chi square hypothesis
H0-data follow specified distribution

Ha: does not follow
chi square
data?

assumption?

computation?
DV: nominal or ordinal

assumption: data are independent random samples from the population

computation: construct an R×C contingency table
- compute expected values
- apply the formula
- hypothesis testing
Chi Square
fe=
(fr × fc)/N
fr = row total, fc = column total
N = # of subjects

df = (C − 1) × (R − 1)
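The fe and df formulas on the card, worked through for a hypothetical 2×2 contingency table:

```python
# Expected cell frequency fe = (row total * column total) / N,
# plus the chi-square statistic and df = (C-1)*(R-1). Hypothetical counts.
table = [[20, 30],
         [30, 20]]

row_totals = [sum(row) for row in table]          # totals per row
col_totals = [sum(col) for col in zip(*table)]    # totals per column
n = sum(row_totals)                               # total # of subjects

expected = [[r * c / n for c in col_totals] for r in row_totals]
chi2 = sum((table[i][j] - expected[i][j]) ** 2 / expected[i][j]
           for i in range(2) for j in range(2))
df = (len(col_totals) - 1) * (len(row_totals) - 1)
print(expected, chi2, df)
```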
main diff in descriptive and inferential stats?
inferential has hypothesis testing
define correlation
interrelationship between 2 CONTINUOUS variables (interval and ratio)
Assumptions of correlation
1- Normality: normal distribution
2- Linearity: linear relationship
3- Homoscedasticity: error (residual) variances in the model are identically distributed
what statistic to test homoscedasticity?
F-test
correlation coefficient

what does it indicate

ranges from?
r

-direction + strength of correlation

-1.0 to 1.0
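A plain-Python sketch of r for the card above (hypothetical data); the result always falls in the stated −1.0 to 1.0 range:

```python
import math

# Pearson correlation coefficient r = cov(x, y) / (sd_x * sd_y).
def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)

print(round(pearson_r([1, 2, 3, 4], [2, 4, 5, 7]), 3))
```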
pearson correlation refers to

spearman correlation refers to
simple correlation (continuous variables)

Spearman: alternative to Pearson; continuous variables but not normally distributed (nonparametric)
what is type 3 error
correctly reject the null, but for the wrong reason
correlation
null hypothesis?
alt?

how to test
H0: the correlation between the two is 0 (uncorrelated)

Ha: it is nonzero

to test if r is significantly different from 0, we use a t-test for Pearson's r
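The t-test for Pearson's r mentioned on the card can be sketched as t = r·√(N − 2) / √(1 − r²), with N − 2 degrees of freedom; the r and N values below are hypothetical.

```python
import math

# Significance test for Pearson's r.
def t_for_r(r, n):
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r, n = 0.6, 27  # hypothetical correlation and sample size
t = t_for_r(r, n)
df = n - 2
print(round(t, 3), df)  # compare t against the t-distribution with these df
```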
correlation

how to calc df
degrees of freedom is N − 2
factors that influence the correlation
1- correlation coefficient (r): the closer r is to −1 or +1, the greater the chance of significance

2- sample size: larger samples, greater chance of significance

3- linearity: correlation only exists in a linear relationship
correlation CANNOT be equated with?
causation
correlation co-efficient ranges
0 - 0.2: very low
0.2 - 0.4: low
0.4 - 0.6: moderate
0.6 - 0.8: highly moderate
0.8 - 1.0: very high