Lecture 1: Regression Analysis for Public Health Studies

by mwren91, Sep. 2016

Subjects: Biostatistics, Public Health, Regression Analysis

44 Cards in this Set

Front
Back

	Population	a group of items or events that are of interest to a study or experiment
	Sample	A sub-set of the population that will represent the population
	Parameter	an unknown but fixed quantity that indexes a distribution (e.g. a population mean); it is estimated from a sample
	Variable	random quantity, has a distribution
	probability	long-run relative frequency of an event over many repetitions of an experiment
	Distribution	relative occurrence or frequency of a random variable
	statistic	a function of data, categorized as test statistic or estimator
	statistical inference	process of drawing conclusions from data subject to random variation
	Estimate vs. Estimator	estimate: the numerical value an estimator takes for one particular sample. estimator: a function of the sample data, either point or interval, that is used to make inferences about the population, ex. sample mean, sample median, confidence intervals
	Quantitative	concept of amount; numerical, e.g. continuous (no gaps, ex. BMI, BWT) or discrete (gaps, ex. integer counts such as ER visits)
	Qualitative	concept of attribute or category, e.g. binary, multinomial, ordinal
	correlation	measures Direction and Strength of relationship
	regression methods	measures direction and form of relationship
	correlation coefficient	measure of linear association between two random variables Y <---> X; ρ = -1: perfect negative linear association; ρ = 0: no linear association; ρ = 1: perfect positive linear association
	covariance	measure of how much two random variables change together
	sample correlation coefficient 1. Name of test? 2. What does this estimate? 3. Ranges for strength of association	1. Pearson correlation coefficient 2. point estimator of the population correlation coefficient 3. by absolute value of r: 0-0.25 poor to none; 0.25-0.5 fair; 0.5-0.75 fair to good; 0.75-1 strong to excellent
	what is 'r'? What does it measure?	'r' represents the correlation coefficient. It measures the strength and direction of a relationship between two random variables
	What factor(s) influence 'r' value	outliers
	Does 'r' have units?	No, it is independent of measurement units; any two units can be compared
	Do the variables need to be hierarchically related?	No, they do not need to be a 'predictor' and a corresponding 'response' variable
	Is 'r' adjusted for other variables?	No, not for the simple Pearson correlation coefficient
	Inference on ρ	need to ask questions on page 11-12
	Spearman Rank Correlation coefficient When is this test used?	- non-normal, non-parametric test Use for: 1. ordinal variables 2. non-normally distributed variables 3. data with outliers
	Regression	statistical technique for modeling the relationship between variables; identifies the relationship between the MEAN value of a random variable and the corresponding values of one or more independent variables: mean/avg Y <--> X
	Hill's criteria for causality	1. Strength of association 2. Dose response relationship 3. Temporality 4. Specificity 5. Consistency 6. Repeatability/Coherence 7. Biological plausibility
	1. How do you choose what regression method to use? 2. When should each of these regression methods be used? a. linear b. logistic c. Poisson / log-linear d. Cox regression	1. Type of outcome variable (Y), e.g. continuous, binary, ordinal, etc. 2. a. continuous variables b. binary variables c. counts (discrete variables) d. time to event (time -> clocks -> cox)
	What is the primary purpose for regression methods?	1. Inference: to study the association between Y and X 2. Prediction: to predict Y for a given X
	What other purposes does regression serve?	1. To adjust for confounders 2. To evaluate interactions between different X variables regarding change in mean response
	Inference 1. What kinds of inferences can be made? 2. Describe the functions of each type.	1. estimation and hypothesis testing 2. Estimation: point/interval; interpretation of regression coefficients is important HT: answer questions about population; model parameters must be interpretable for inference to be meaningful
	Prediction	- predict response for unobserved samples - accuracy and precision take precedence over interpretability
	What is simple linear regression (SLR)?	one response variable (Y) vs. one explanatory variable (Covariate, X) to answer: Is Y linearly associated with X?
	What is the model for SLR?	Yi = β0 + β1xi + ei where: Yi is the response, β0 is the intercept, β1 is the slope, xi is the covariate, ei is the random error
	What are the conditions for e or 'error'?	normal distribution with mean zero and constant variance
	What is Multiple Linear Regression?	regression analysis for multiple covariates
	What is the model for multiple linear regression?	Yi = β0 + β1xi1 + ... + βkxik + ei where: one response and k covariates
	What are the three assumptions made when comparing two populations?	1. Independence between populations 2. Normality of the response in each population 3. Homoscedasticity: equal variance between populations
	Ex. Simple linear regression: comparison of mean blood pressure between two populations defined by weight, x1 = 150 lbs and x2 = 151 lbs. 1. What is the response variable and the covariate? 2. How would you represent the normal distribution of blood pressure in each population?	1. response: blood pressure; covariate: weight 2. Y1 ~ N(μy|x=150, σ^2), Y2 ~ N(μy|x=151, σ^2)
	How would you represent the population regression line if we were able to collect data from all Rutgers students, obtaining the true mean blood pressure at targeted weights?	μy|x = β0 + β1x where: μy|x is the mean of y at a given x-value (ex. mean blood pressure at 153 lbs), β0 is the intercept, β1 is the slope, x is the covariate
	Since we are unable to obtain data from the entire Rutgers community, what is the resultant model for estimating mean blood pressure?	Yi = μy|x + ei where: Yi is the observed blood pressure of individual i in the sample, μy|x is the population mean blood pressure at that individual's weight, ei is the unknown random error
	How would you word results indicating a good relationship between the data and the linear model?	The mean of the response variable has a linear relationship with the explanatory variable x.
	What is the primary assumption about your data when you use linear regression?	That the relationship between your response and explanatory variable is linear.
	What are the assumptions for errors?	1. Normal distribution with mean 0 and variance σ^2: ei ~ N(0, σ^2) 2. Errors are independent 3. Same variance for every observation
	What are the assumptions for a response variable?	1. Normal distribution with mean μy|x and variance σ^2 2. Same variance (homoscedasticity) 3. Independence
	Are covariates (Xk) fixed or random?	Both! We are under the assumption that they are fixed for the purposes of simplifying the model. BUT... they may, in fact, be random, due to causes such as... measurement error and natural fluctuations
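
The correlation cards above can be checked numerically. The sketch below is only an illustration, not part of the lecture: it computes the sample covariance, Pearson's r, and the Spearman rank coefficient (taken here as Pearson's r applied to ranks) using plain Python. The weight and blood pressure values are made up, and every function name is hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sample_covariance(x, y):
    """Sample covariance with the n - 1 denominator."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def pearson_r(x, y):
    """Covariance scaled by both standard deviations, so the units cancel."""
    return sample_covariance(x, y) / math.sqrt(
        sample_covariance(x, x) * sample_covariance(y, y)
    )

def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_r(x, y):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Made-up data: weight in lbs and blood pressure in mmHg.
weight = [150, 155, 160, 165, 170, 175]
pressure = [118, 121, 123, 126, 130, 131]

# r has no units: converting lbs to kg leaves it unchanged.
r_pounds = pearson_r(weight, pressure)
r_kilograms = pearson_r([w * 0.4536 for w in weight], pressure)

# One extreme observation is enough to move r a long way.
weight_out = weight + [180]
pressure_out = pressure + [90]
r_outlier = pearson_r(weight_out, pressure_out)
rho_outlier = spearman_r(weight_out, pressure_out)

print(round(r_pounds, 3), round(r_kilograms, 3))
print(round(r_outlier, 3), round(rho_outlier, 3))
```

In this toy data set a single outlying point turns a strong positive r into a negative one, while the rank-based coefficient moves less, which is in line with the card that lists outliers as a reason to use Spearman.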

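The simple linear regression cards can be illustrated the same way. The sketch below assumes ordinary least squares as the fitting method, which the cards do not name, and uses made-up weight and blood pressure values. The names fit_slr and fitted_mean are hypothetical.

```python
def fit_slr(x, y):
    """Least-squares estimates (b0, b1) for the model Yi = b0 + b1*xi + ei."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    b1 = sxy / sxx             # estimated slope
    b0 = y_bar - b1 * x_bar    # estimated intercept
    return b0, b1

def fitted_mean(b0, b1, x_value):
    """Estimated mean response at a given covariate value."""
    return b0 + b1 * x_value

# Made-up data: weight in lbs (covariate) and blood pressure in mmHg (response).
weight = [150, 155, 160, 165, 170, 175]
pressure = [118, 121, 123, 126, 130, 131]

b0, b1 = fit_slr(weight, pressure)

# Residuals are the observed responses minus the fitted means.
residuals = [y - fitted_mean(b0, b1, x) for x, y in zip(weight, pressure)]

# The slope is the difference in estimated mean response between
# two populations whose weights differ by one pound (150 vs. 151 lbs).
difference = fitted_mean(b0, b1, 151) - fitted_mean(b0, b1, 150)

print("intercept:", round(b0, 3), "slope:", round(b1, 3))
print("estimated mean blood pressure at 153 lbs:", round(fitted_mean(b0, b1, 153), 2))
```

The last comparison mirrors the 150 lbs vs. 151 lbs card: under the linear model, the two population means differ by exactly the slope.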