Descriptive statistics

describe, organize, summarize data
Ex: avg. cholesterol values


Inferential statistics

make inferences based on data
Ex: sample cholesterol values from RANDOM # students 

a population

observations or measurements of the ENTIRE set of subjects


a sample

a subset of the population
measurements of PARTIALLY selected subjects


another word for observation

element (X)


simple random sample

every element has equal prob. of being included
Ex: drawing from a deck of 52 playing cards

How do you eliminate biased samples?

True randomization; stratified random samples


stratified random sample

population divided into homogeneous groups or strata (age, ethnicity, gender)


cluster sample

based on geographic areas
used when too expensive to draw simple random sample 

systematic sample

ex: select every 5th student
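
A minimal sketch of two of the sampling methods above, using only Python's standard library. The population of 20 student IDs and the seed are hypothetical, chosen only so the example is repeatable.

```python
import random

# Hypothetical population: 20 student IDs (illustrative data only).
population = list(range(1, 21))

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Simple random sample: every element has an equal chance of inclusion.
simple_sample = rng.sample(population, 4)

# Systematic sample: select every 5th student (IDs 5, 10, 15, 20).
systematic_sample = population[4::5]

print(simple_sample)
print(systematic_sample)
```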


Facts about probability

cannot be negative
expressed as decimals that lie between 0 and 1
1 - p = probability the event won't occur

When do you use the addition rule?

when the events are mutually exclusive


When do you use the multiplication rule?

When 2 or more events are independent and both could happen at the same time
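
The addition and multiplication rules can be sketched with exact fractions. The card and coin events are illustrative assumptions, not part of the original cards.

```python
from fractions import Fraction

# Addition rule (mutually exclusive events): P(A or B) = P(A) + P(B)
p_ace = Fraction(4, 52)
p_king = Fraction(4, 52)
p_ace_or_king = p_ace + p_king  # 2/13

# Multiplication rule (independent events): P(A and B) = P(A) * P(B)
p_heads = Fraction(1, 2)
p_two_heads = p_heads * p_heads  # 1/4

# Complement: probability the event won't occur is 1 - p
p_not_ace = 1 - p_ace  # 12/13
```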


Binomial distribution is normally...

used to describe inheritance of genetic disease. Gives the probability of outcomes with ONLY 2 possibilities
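
A small worked example of a binomial probability. The scenario is an assumption for illustration: each child of two carrier parents has a 1/4 chance of being affected, and we ask how likely it is that exactly 1 of 3 children is affected.

```python
from math import comb

def binomial_probability(k, n, p):
    """Probability of exactly k 'successes' in n independent trials,
    each with success probability p (only 2 possible outcomes)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical scenario: P(affected) = 0.25 per child, 3 children.
p_exactly_one = binomial_probability(k=1, n=3, p=0.25)
print(p_exactly_one)  # 0.421875
```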


Nominal data

organized into qualitative groups: male/female


Ordinal data

data organized in ranking order
DOES NOT provide info on size of INTERVAL b/n data 

Interval data

data organized into meaningful order with meaningful intervals in b/n
DOES NOT have ABSOLUTE ZERO
Ex: centigrade scale

Ratio data

data organized with meaningful intervals
DOES have ABSOLUTE ZERO
Ex: Kelvin scale, seconds, days, pulses/min

cumulative frequency distribution

% of elements lying within and below each class interval
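
A short sketch of building a cumulative frequency distribution. The class-interval counts are made-up numbers for illustration.

```python
from itertools import accumulate

# Hypothetical counts of elements in 5 consecutive class intervals.
class_counts = [2, 5, 8, 4, 1]

# Running total: elements lying within and below each class interval.
cumulative_counts = list(accumulate(class_counts))  # [2, 7, 15, 19, 20]

total = cumulative_counts[-1]
cumulative_percent = [100 * c / total for c in cumulative_counts]
print(cumulative_percent)  # [10.0, 35.0, 75.0, 95.0, 100.0]
```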


Ogive curve

S-shaped curve for cumulative frequency


What do centile ranks tell us?

% of observations that fall below any particular score


What is the shape of the normal distribution curve?

symmetrical, bell-shaped
Gaussian distribution 

How can you tell whether a skewed distribution is positive or negative?

by the location of the tail of the curve
tail to the right = positive skew; tail to the left = negative skew


Ex of central tendencies

mean, median, mode


Central tendency that occurs with the greatest frequency

Mode


What is the median when you have an odd # of elements?

middle number


What is the median when you have an even # of elements?

avg of two middle scores


Mean

avg
VERY sensitive to extreme scores 
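
The three central tendencies can be checked with Python's `statistics` module. The scores below are hypothetical; the second list swaps in one extreme score to show how the mean moves while the median does not.

```python
import statistics

scores = [2, 3, 3, 5, 7]          # hypothetical scores
with_outlier = [2, 3, 3, 5, 70]   # same data with one extreme score

print(statistics.mean(scores))    # 4
print(statistics.median(scores))  # 3 (odd count: the middle number)
print(statistics.mode(scores))    # 3 (most frequent value)

# Even count: median is the avg of the two middle scores.
print(statistics.median([1, 3, 5, 7]))  # 4.0

# The mean is pulled toward the extreme score; the median is unchanged.
print(statistics.mean(with_outlier))    # 16.6
print(statistics.median(with_outlier))  # 3
```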

3 measures of variability

range, variance, standard deviation


Difference b/n lowest and highest scores

range


How do you determine the deviation score?

difference between elements and the mean
* the sum of the deviation scores for all elements is 0 

Can you use deviation scores to differentiate b/n 2 different normal distributions?

NO


What is the variance of a distribution

the mean of the squares of all the deviation scores.
1. find deviation scores
2. square each deviation score
3. obtain the mean of the squared scores

How do you find the standard deviation?

it is the square root of the variance.
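
The steps for variance and standard deviation, written out on a hypothetical data set. This follows the definition on the cards (mean of the squared deviations, i.e. dividing by n), which matches `statistics.pvariance`.

```python
import statistics
from math import sqrt

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical elements

mean = sum(data) / len(data)                       # 5.0
deviations = [x - mean for x in data]              # 1. deviation scores
squared = [d**2 for d in deviations]               # 2. square each one
variance = sum(squared) / len(squared)             # 3. mean of the squares
std_dev = sqrt(variance)                           # square root of variance

print(sum(deviations))  # 0.0 (deviation scores always sum to 0)
print(variance)         # 4.0
print(std_dev)          # 2.0
```

Note that `statistics.variance` divides by n - 1 instead (the sample estimate), so it would give a slightly larger value on the same data.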


What % of the distribution falls within +/- 1 s.d. of the mean?

68%


What % of the distribution falls within +/- 2 s.d. of the mean?

95%


What % of the distribution falls within +/- 3 s.d. of the mean?

99.7%
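
The 68 / 95 / 99.7 figures can be checked against the standard normal curve with `statistics.NormalDist`. The helper name `proportion_within` is just an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of a normal distribution within +/- k s.d. of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within = {k: proportion_within(k) for k in (1, 2, 3)}
for k, proportion in within.items():
    print(k, round(100 * proportion, 1))
```

The rounded figures on the cards are approximations; the exact proportion within 2 s.d. is a little above 95%.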


if element lies above the mean, it will have a ____ z score?

positive


if element lies below the mean, it will have a ____ z score?

negative


how to find z score

subtract mean from element and divide by s.d.


What is used for finding the probability that a random element will have a score above or below a mean of the distribution?

z score
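
A brief sketch of computing a z score and turning it into a probability. The mean, s.d., and element value are hypothetical, and the distribution is assumed to be normal.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Subtract the mean from the element and divide by the s.d."""
    return (x - mean) / sd

# Hypothetical distribution: mean 100, s.d. 15; element scores 130.
z = z_score(130, mean=100, sd=15)
print(z)  # 2.0 (positive: the element lies above the mean)

# Probability that a random element scores below / above this element.
p_below = NormalDist().cdf(z)
p_above = 1 - p_below
print(round(p_above, 4))  # 0.0228
```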


What type of statistics makes conclusions about a population?

inferential


When plotted, what type of distribution will you see when plotting the random sampling distribution of means?

normal distribution, even if the shape of the pop. distribution is not normal (e.g. rectangular)
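
A small simulation sketch of this idea: draw many samples from a rectangular (uniform) population and look at the distribution of their means. The sample size, number of samples, and seed are arbitrary assumptions.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

SAMPLE_SIZE = 25
NUM_SAMPLES = 2000

# Rectangular (uniform) population on [0, 1): true mean is 0.5.
sample_means = [
    statistics.fmean([rng.random() for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

# Theory: s.d. of the sample means = population s.d. / sqrt(n),
# where a uniform [0, 1) population has s.d. sqrt(1/12).
expected_sd = (1 / 12) ** 0.5 / SAMPLE_SIZE ** 0.5

print(mean_of_means, sd_of_means, expected_sd)
```

The sample means cluster tightly and symmetrically around 0.5, even though the individual observations are spread evenly across the whole range.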


Are confidence limits 1 tail or 2 tails?

ALWAYS 2 tails b/c you need a max and min value


What is CI (confidence interval)?

the range b/n the upper and lower confidence limits


CI decreases in proportion to what?

the square root of the sample size (to halve the CI, the sample size must be increased 4 fold)


How is precision proportional to the sample size?

precision is proportional to the square root of n
OR (precision)^2 is proportional to n

As you increase sample size, what happens to the width of the CI?

width narrows


To double precision, how must you change the sample size?

multiply sample size by 4 to double precision
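
The "4x the sample size halves the CI" relationship can be sketched as follows. This assumes a known population s.d. (so z is used rather than t); the s.d. of 10 and the sample sizes are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sd, n, confidence=0.95):
    """Width of a two-tailed confidence interval for a mean,
    assuming the population s.d. is known (z-based)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return 2 * z * sd / sqrt(n)

width_25 = ci_width(sd=10, n=25)
width_100 = ci_width(sd=10, n=100)  # sample size multiplied by 4

print(round(width_25, 2))   # 7.84
print(round(width_100, 2))  # 3.92 (half as wide)
```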


When something is precise, is it scattered or clustered?

clustered


When something is inaccurate, is it biased or unbiased?

biased


When are statistics precise?

when they are immune from random variation


precision is shown by the ____ while accuracy is shown by the ______ b/n the mean of the random sampling distributions of means and the true pop. mean.

precision is shown by the width of the distributions (inversely)
accuracy is shown by the distance (inversely) 

Do you use the z score or t score when making inferences about means that are based on estimates of pop. parameters?

t score


For any given proportion of the distribution, is z constant? t?

z is constant while t is NOT constant b/c it depends on the size of the sample


When are z and t similar?

when the sample size is larger than 100


What do t values depend on?

degrees of freedom (df = n - 1)


when do you use 1 tail? 2 tails?

1 tail is directional (improves, impairs)
2 tails is NONdirectional (affects) 

What is alpha in hypothesis testing?

decision criterion/significance level: the probability cutoff for deciding whether the difference b/n the sample mean and the hypothesized population mean is attributed to chance or to a real effect


What is the Central Limit Theorem?

States that the random sampling distribution of means tends toward a normal distribution as sample size increases, and that its mean will be very close to the true pop. mean


What is the conventional level of alpha?

0.05


If the probability that the sample mean could have come from the hypothesized pop is less than or equal to 0.05 what happens to the null hyp?

it is rejected


What is the range of acceptance?

the middle 95% of the distribution (when alpha = 0.05)


What are the limits of the area of rejection defined by?

the critical values


What does it mean in terms of significance when the null hyp is rejected?

that the diff. b/n the sample mean and the hypothesized mean is statistically significant b/c it was rejected at the 0.05 level
(SIGNIFICANT) 

What does it mean in terms of significance when the null hyp is accepted?

that the diff. b/n the sample mean and the hyp population mean failed to reach statistical significance (NOT significant)


What does significant at p<0.05 mean?

if the null hyp were true, a result this extreme would occur by chance less than 5% of the time, so the diff. is treated as significant (real)


What is a type I error?

rejecting the null when it is true (false positive)


What is a type I error also referred to as?

alpha error


What is a type II error?

accepting the null hyp when it is actually false (false negative)


A type II error is also known as?

beta error


1 - beta is what?

power of the test


what does power of the test tell us?

the probability that a false null hyp will be rejected


A test is required to have a power of at least what to be acceptable?

0.8


What are the 2 errors of hypothesis testing?

alpha error: rejecting a null that is actually true
beta error: accepting a null hyp that is actually false

what is the correlation b/n power of test and alpha?

power increases as alpha increases


are 1 or 2 tailed tests more powerful?

1-tailed tests are more powerful b/c the entire area of rejection lies in one tail, so the critical value is less extreme


What is the chi square test for?

a test of proportions: testing hypothesis for nominal scale data
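
A hand-rolled sketch of the chi-square statistic for a 2 x 2 table of nominal data. The observed counts are hypothetical, and 3.841 is the standard critical value for df = 1 at alpha = 0.05.

```python
# Hypothetical observed counts: rows = group, columns = outcome.
observed = [[30, 20],
            [20, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if the proportions were the same in both groups.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
reject_null = chi_square > CRITICAL_VALUE_DF1_ALPHA_05

print(chi_square)   # 4.0
print(reject_null)  # True
```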


What's the difference b/n experimental studies and nonexperimental?

experimental: give drug to experimental group and compare effect w/ control group that hasn't taken drug
nonexperimental: observe the effect of drugs by comparing 2 groups who have ALREADY taken the drugs and who have not taken any
No ethical issues

Purpose of clinical trials?

used to evaluate the effects of treatments and to isolate one factor by holding all other factors constant


What is the best method of assignment in RCCT trials?

randomization to reduce the selection bias so that any difference that appears b/n 2 groups at the end of the study can be attributed to TX


What is the diff. b/n double-blind studies and single-blind studies?

double-blind: neither subjects nor investigators know
single-blind: only the subject is unaware; isn't as effective b/c investigators can't fully hide their expectations or emotions

In RCCT, how are the effects of confounding variables reduced?

by matching
Ex: compare 48 yr old male vs. 42 yr old male to see effect of drug (not 48 vs. 21)
need to eliminate confounding variables: age and gender


Explain crossover designs.

a repeated measures design b/c the measurements are repeated w/n each patient at different times
Ex: patient A: drug 1st month, washout 2nd month, placebo 3rd month 

What study is the first method used to study a particular, new disease (also called exploratory studies)?

Descriptive studies: most powerful method to study new disease


What studies fall under the category: nonexperimental studies?

descriptive studies
analytical studies
cohort studies
case-control studies
case-series studies
prevalence surveys

What do analytical studies do?

aim to test hypotheses or to provide explanations about a disease after observing it


What happens in a cohort study?

group of young, healthy people selected and observed for an extended period (15+yrs) and followed forward from a particular point in time
Cohort= prospective 

Advantages to cohort study?

no ethical problems
lifestyle and health management are the choice of individuals, not investigators
establish absolute risk
NO recall bias (exposure is recorded before the outcome)

Disadvantages to cohort study?

time-consuming
expensive
impractical for RARE diseases

Describe a case control study.

comparison is made b/n the cases (who have disease) and the controls (who do not)
Retrospective b/c they start w/ the outcome and then look back into the past 

Advantages of case-control study?

quick and cheap
important for RARE diseases
need few subjects
can establish multiple potential causes

Disadvantages of case-control study?

highest degree of recall bias
loss in info of risk factors if people die
cannot prove a cause-effect relationship

Describe case-series studies.

Reports or presentations of a disease.
NO controls
DO NOT follow up
used to present new info about patients with RARE disease and develop new hypotheses

Describe prevalence surveys "community surveys"

surveys of a WHOLE population
SINGLE examination of pop at a PARTICULAR point in time
DO NOT follow up
AKA: cross-sectional studies

Advantages of prevalence surveys?

b/c of community study, info on wide range of disease and characteristics can be gathered and used for hypothesis


Disadvantages of prevalence study?

not usable for ACUTE diseases
loss of subjects leaving from the community 

Eqn for incidence rate

# new cases of disease / total # people at risk, per unit time


eqn for prevalence rate

# existing cases of disease (new + old) / total # people at risk at a PARTICULAR time


When can you use prevalence rate?

for stable chronic condition: hypertension, diabetes
NOT for ACUTE DISEASE: appendicitis, pulm embolism 

eqn for prevalence?

annual incidence rate x average duration (yrs)
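
A minimal sketch of the incidence and prevalence equations. The function names and the numbers are hypothetical; the prevalence relationship assumes a stable (steady-state) chronic condition.

```python
def incidence_rate(new_cases, population_at_risk, years=1):
    """New cases per person at risk per year."""
    return new_cases / (population_at_risk * years)

def prevalence_from_incidence(annual_incidence, avg_duration_years):
    """Prevalence = annual incidence rate x average duration (yrs)."""
    return annual_incidence * avg_duration_years

# Hypothetical: 20 new cases in 1 year among 10,000 people at risk,
# with an average disease duration of 10 years.
annual_incidence = incidence_rate(new_cases=20, population_at_risk=10_000)
prevalence = prevalence_from_incidence(annual_incidence, avg_duration_years=10)

print(annual_incidence)  # 0.002 (2 per 1,000 per year)
print(round(prevalence, 4))  # 0.02 (2% of the population)
```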


eqn for mortality rate?

total # deaths / total # of people at risk, per unit time


What happens to prevalence if you increase incidence?

bathtub analogy: increasing incidence increases prevalence


eqn for case fatality rate?

# deaths due to disease/total # cases
DOES NOT depend on time 

eqn for attack rate?

# people contracting disease/ total # people at risk
(at 1 time, 1 incidence) 

absolute risk =

incidence rate of disease


Which study is the best for determining the effect of a risk factor?

cohort study: identify the risk factor of the disease


relative risk =

incidence among those EXPOSED to risk factor / incidence among those NOT exposed to risk factor


absolute risk reduction:

control incidence - experimental incidence


NNT=

1 / absolute risk reduction (# patients who must be treated to save 1 life)


cost estimate eqn=

cost per month x months x NNT
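
A worked sketch tying together absolute risk reduction, NNT, and the cost estimate. The incidences and the monthly cost are hypothetical; exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

def absolute_risk_reduction(control_incidence, experimental_incidence):
    return control_incidence - experimental_incidence

def number_needed_to_treat(arr):
    return 1 / arr

# Hypothetical trial: 10% incidence in controls, 6% with treatment.
arr = absolute_risk_reduction(Fraction(10, 100), Fraction(6, 100))
nnt = number_needed_to_treat(arr)

# Hypothetical cost: 50 per month for 12 months, for each patient treated.
total_cost = 50 * 12 * nnt

print(arr)         # 1/25
print(nnt)         # 25
print(total_cost)  # 15000
```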


attributable risk =

incidence exposed - incidence nonexposed
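
Relative risk and attributable risk computed side by side from the same two incidences. The cohort counts are hypothetical.

```python
from fractions import Fraction

# Hypothetical cohort: 30 of 100 exposed and 10 of 100 nonexposed
# people develop the disease during follow-up.
incidence_exposed = Fraction(30, 100)
incidence_nonexposed = Fraction(10, 100)

# Relative risk: ratio of the two incidences.
relative_risk = incidence_exposed / incidence_nonexposed

# Attributable risk: difference b/n the two incidences.
attributable_risk = incidence_exposed - incidence_nonexposed

print(relative_risk)      # 3
print(attributable_risk)  # 1/5
```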


What kind of study will be used to find the odds ratio?

case control


eqn for odds ratio?

(A x D) / (B x C)


What does it mean when the odds ratio = 1

risk factor is NOT related to disease


What does it mean when the odds ratio is LESS than 1?

the risk factor may be a protective factor against disease
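
A sketch of the odds ratio from a case-control 2 x 2 table. The cell labels are an assumption, since the cards do not define them: A = exposed cases, B = exposed controls, C = unexposed cases, D = unexposed controls. The counts are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """(A x D) / (B x C), with A = exposed cases, B = exposed controls,
    C = unexposed cases, D = unexposed controls (assumed layout)."""
    return (a * d) / (b * c)

def interpret(or_value):
    if or_value > 1:
        return "risk factor is associated with disease"
    if or_value < 1:
        return "risk factor may be protective"
    return "risk factor is NOT related to disease"

# Hypothetical study counts.
result = odds_ratio(a=40, b=20, c=10, d=30)
print(result)             # 6.0
print(interpret(result))  # risk factor is associated with disease
```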
