44 Cards in this Set

  • Front
  • Back

Population

a group of items or events that are of interest to a study or experiment

Sample

A sub-set of the population that will represent the population

Parameter

an unknown fixed quantity; fixed value that indexes a distribution




Parameters are estimated from a sample.

Variable

random quantity, has a distribution

probability

chance to see a certain event with many repetitions of experiment

Distribution

relative occurrence or frequency of a random variable

statistic

a function of data, categorized as test statistic or estimator

statistical inference

process of drawing conclusions from data subject to random variation

Estimate vs. Estimator

estimator: a rule (a function of the sample data), either point or interval, that is used to make inferences about a population parameter


ex. sample mean, sample median, confidence intervals


estimate: the specific value (a point or an interval) obtained by applying an estimator to the observed sample data
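
The estimator/estimate distinction can be illustrated with a minimal Python sketch. The readings are made-up values for illustration only, and `sample_mean` is a hypothetical helper name, not something from the cards.

```python
def sample_mean(data):
    """Estimator: a rule (function of the data) for estimating the population mean."""
    return sum(data) / len(data)


# Hypothetical sample of blood pressure readings (mmHg), for illustration only.
readings = [118, 122, 130, 125, 120]

# Estimate: the number obtained by applying the estimator to this particular sample.
estimate = sample_mean(readings)
print(estimate)
```

A different sample would give a different estimate, while the estimator (the rule) stays the same.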

Quantitative

concept of amount, numerical


e.g. continuous (no gaps; ex. BMI, BWT), discrete (gaps; ex. integer counts such as ER visits)

Qualitative

concept of attribute, category


e.g. binary, multinomial, ordinal

correlation

measures Direction and Strength of relationship



regression methods

measures direction and form of relationship



correlation coefficient



measure of linear association between two random variables




Y <---> X




ρ (rho) = -1: perfect negative linear association


ρ = 0: no linear association


ρ = 1: perfect positive linear association

covariance

measure of how much two random variables change together

sample correlation coefficient




1. Name of test?


2. What does this estimate?


3. Ranges for strength of association

1. Pearson correlation coefficient


2. point estimator of population correlation coefficient


3. 0-0.25 poor to none


0.25-0.5 fair


0.5-0.75 fair to good


0.75-1 strong to excellent

what is 'r'? What does it measure?

'r' represents the sample (Pearson) correlation coefficient.


It measures the strength and direction of the linear relationship between two random variables

What factor(s) influence 'r' value

outliers

Does 'r' have units?

No, it is independent of measurement units; any two units can be compared
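
The covariance and 'r' cards above can be checked numerically. This is a minimal sketch in plain Python; the weights and blood pressures are made-up values, and the function names are hypothetical. It also shows that rescaling one variable (pounds to kilograms) leaves r unchanged.

```python
import math


def covariance(x, y):
    """Sample covariance: how much x and y vary together (n - 1 divisor)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


def pearson_r(x, y):
    """Sample Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


# Hypothetical weights (lbs) and blood pressures (mmHg), for illustration only.
weight_lbs = [150, 160, 170, 180, 190]
bp = [118, 121, 127, 126, 133]

r_lbs = pearson_r(weight_lbs, bp)

# Changing units rescales x but does not change r, because r has no units.
weight_kg = [w * 0.45359237 for w in weight_lbs]
r_kg = pearson_r(weight_kg, bp)

print(round(r_lbs, 3), round(r_kg, 3))
```

Covariance, by contrast, does change with the units, which is one reason r is easier to compare across studies.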

Do the variables need to be hierarchically related?

No, they do not need to be a 'predictor' and a corresponding 'response' variable

Is 'r' adjusted for other variables?

No, not for the simple Pearson correlation coefficient

Inference on ρ (rho)



need to ask questions on page 11-12

Spearman Rank Correlation coefficient




When is this test used?

- a non-parametric test, used when the data are not normally distributed




Use for:


1. ordinal variables


2. non-normally distributed variables


3. data with outliers
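
A minimal sketch of why rank correlation helps with outliers: Spearman's coefficient is the Pearson correlation computed on ranks. The data are made up, the function names are hypothetical, and the ranking helper assumes there are no tied values (ties would need average ranks, which this sketch omits).

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank from 1 (smallest) to n (largest). Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical data: y always increases with x, but the last y value is an outlier.
x = [1, 2, 3, 4, 5, 6]
y = [1.0, 2.1, 2.9, 4.2, 5.1, 60.0]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(round(r, 2), round(rho, 2))
```

Because the outlier only changes the size of the largest value and not its rank, the rank-based coefficient is unaffected, while Pearson's r is pulled well below 1.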



Regression

statistical technique for modeling the relationship between variables




Identifies the relationship between the MEAN value of a random variable (Y) and the corresponding values of one or more independent variables (X)




Mean/avg Y <--> X

Hill's Criteria for causality

1. strength of association


2. Dose response relationship


3. Temporality


4. Specificity


5. Consistency


6. Repeatability/Coherence


7. Biological plausibility



1. How do you choose which regression method to use?




2. When should each of these regression methods be used?




a. linear


b. logistic


c. Poisson / log-linear


d. Cox regression

1. Type of outcome variable (Y), e.g. continuous, binary, ordinal, etc.




2.


a. continuous variables


b. binary variables


c. counts (discrete variables)


d. time to event (time -> clocks -> cox)
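
The pairing on this card can be written as a small lookup, shown here as a sketch. The dictionary keys and the function name are hypothetical labels chosen for this example.

```python
# Mapping taken from the card: type of outcome variable (Y) -> regression method.
REGRESSION_BY_OUTCOME = {
    "continuous": "linear regression",
    "binary": "logistic regression",
    "count": "Poisson / log-linear regression",
    "time-to-event": "Cox regression",
}


def choose_regression(outcome_type):
    """Return the regression method the card pairs with the given outcome type."""
    try:
        return REGRESSION_BY_OUTCOME[outcome_type]
    except KeyError:
        raise ValueError(f"no method listed for outcome type: {outcome_type!r}") from None


print(choose_regression("binary"))
```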

What is the primary purpose for regression methods?

1. Inference: to study the association between Y and X


2. Prediction: to predict Y for a given X

What other purposes does regression serve?

1. To adjust for confounders


2. To evaluate interactions between different X variables with respect to the change in mean response

Inference




1. What kinds of inferences can be made?


2. Describe the functions of each type.

1. estimation and hypothesis testing


2.


Estimation: point/interval;


*interpretation of regression coefficients is important




Hypothesis testing (HT): answer questions about the population;


*model parameters must be interpretable for inference to be meaningful

Prediction

- predict response for unobserved samples


- accuracy and precision take precedence over interpretability

What is simple linear regression (SLR)?

one response variable (Y) vs. one explanatory variable (Covariate, X)




to answer:


Is Y linearly associated with X?



What is the model for SLR?

Yi = Bo + B1xi + ei




where:


Yi is the response for the i-th subject


Bo is intercept


B1 is slope


xi is the covariate value for the i-th subject


ei is random error

What are the conditions for e or 'error'?

normal distribution with mean zero and constant variance
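
The model and error condition on the two cards above can be simulated and then fitted. This is a minimal sketch: the "true" intercept, slope, and error standard deviation are hypothetical values chosen only for the simulation, and the fit uses ordinary least squares, a standard method that the cards themselves do not name.

```python
import random


def fit_slr(x, y):
    """Least-squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b1 = sxy / sxx          # estimated slope
    b0 = my - b1 * mx       # estimated intercept
    return b0, b1


# Hypothetical "true" parameters, used only to generate example data.
TRUE_B0, TRUE_B1, SIGMA = 90.0, 0.2, 5.0

rng = random.Random(0)  # fixed seed so the simulation is repeatable
weights = [rng.uniform(100, 250) for _ in range(200)]

# Each response is the line plus a normal error with mean 0 and constant variance.
bp = [TRUE_B0 + TRUE_B1 * w + rng.gauss(0, SIGMA) for w in weights]

b0_hat, b1_hat = fit_slr(weights, bp)
print(round(b0_hat, 2), round(b1_hat, 3))
```

The fitted values should land near the hypothetical true values, but not exactly on them, because of the random error term.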

What is Multiple Linear Regression?

regression analysis for multiple covariates

What is the model for multiple linear regression?

Yi = Bo + B1xi1 + ... + Bkxik + ei




where: one response (Y) and 'k' covariates (x1, ..., xk)
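
A minimal sketch of fitting a model with several covariates, assuming ordinary least squares solved through the normal equations (the cards do not specify a fitting method). The data are made up and noise-free so the coefficients can be checked by eye; `fit_mlr` is a hypothetical helper name.

```python
def fit_mlr(rows, y):
    """Least-squares coefficients [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk."""
    X = [[1.0] + [float(v) for v in row] for row in rows]  # add intercept column
    p = len(X[0])

    # Augmented normal equations: [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(p)
    ]

    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("covariates are collinear; coefficients are not unique")
        for k in range(p):
            if k != col:
                factor = A[k][col] / A[col][col]
                A[k] = [a - factor * b for a, b in zip(A[k], A[col])]

    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2 (no error term).
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1, 3, 4, 6, 8]

coefs = fit_mlr(rows, y)
print([round(c, 6) for c in coefs])
```

In practice a numerical library would be used instead of hand-written elimination; the sketch only shows that one response is modeled from k covariates at once.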

What are the three assumptions made when comparing two populations?

1. Independence between populations


2. Normality: the variable being compared is normally distributed in each population


3. Homoscedasticity: equal variance between populations

Ex. Simple linear regression



Comparison of mean blood pressure between two populations defined by weight: x1 = 150 lbs and x2 = 151 lbs




1. What is the response variable and the covariate?


2. How would you represent the normal distribution of blood pressure in each population?

1. response: blood pressure; covariate: weight


2. Y1 ~ N(Uy|x=150, o^2), Y2 ~ N(Uy|x=151, o^2)

How would you represent the population regression line if we were able to collect data from all Rutgers students, obtaining the true mean blood pressure at targeted weights?

Uy|x = Bo + B1x




where:


Uy|x is mean y at a given x-value (ex. mean blood pressure at 153 lbs)


Bo is intercept


B1 is slope; x is the covariate (weight)
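
A short worked example of the population regression line, as a sketch. The intercept and slope are hypothetical numbers chosen for illustration, not values from any study.

```python
def mean_response(x, b0, b1):
    """Population regression line: mean of Y at covariate value x."""
    return b0 + b1 * x


# Hypothetical coefficients, for illustration only.
B0, B1 = 90.0, 0.2

mean_at_150 = mean_response(150, B0, B1)
mean_at_151 = mean_response(151, B0, B1)

# A one-unit increase in x (one extra pound) changes the mean response by the slope.
difference = mean_at_151 - mean_at_150
print(mean_at_150, mean_at_151, round(difference, 6))
```

This is the usual interpretation of the slope: the change in mean response per one-unit change in the covariate.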

Since we are unable to obtain data from the entire Rutgers community, what is the resultant model for estimating mean blood pressure?

Yi = Uy|x + ei




where:


Yi is the observed blood pressure of the i-th sampled individual


Uy|x is the population mean blood pressure at that individual's weight


ei is unknown random error

How would you word results indicating a good relationship between the data and the linear model?

The mean of the response variable has a linear relationship with the explanatory variable x.

What is the primary assumption about your data when you use linear regression?

That the relationship between your response and explanatory variable is linear.

What are the assumptions for errors?

1. Normal distribution with mean 0 and variance o^2: ei ~ N(0, o^2)


2. Errors are independent


3. Same variance for every observation (homoscedasticity)

What are the assumptions for a response variable?

1. Normal distribution with mean Uy|x and variance o^2


2. Same variance (homoscedasticity)


3. Independence

Are covariates (Xk) fixed or random?

Both!




We are under the assumption that they are fixed for the purposes of simplifying the model.




BUT... they may, in fact, be random, due to causes such as... measurement error and natural fluctuations