26 Cards in this Set

The General Linear Model equation
Yi = β0 + β1X1i + εi
The Generalized Linear Model equation
Logit(Y) = β0 + β1X1 + ε
The Generalized Linear Model vs The General Linear Model
The Generalized Linear Model:
-Flexible generalization of ordinary linear regression that allows for response variables with error distributions other than the normal distribution.
-Allows the linear model to be related to the response variable via a link function (e.g., the logit).
-Allows the magnitude of the variance of each measurement to be a function of its predicted value.
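The link-function idea above can be sketched in a few lines of Python; the coefficients here are made up purely for illustration:

```python
import math

def inverse_logit(eta):
    """Map an unbounded linear predictor onto the (0, 1) probability scale."""
    return 1 / (1 + math.exp(-eta))

# Hypothetical coefficients, not from any real model.
b0, b1 = -1.0, 0.5

for x in (0, 2, 4):
    eta = b0 + b1 * x       # linear predictor: Logit(Y) = b0 + b1*X
    p = inverse_logit(eta)  # predicted probability that Y = 1
    print(x, round(p, 3))
```

However large or small the linear predictor gets, the predicted probability stays strictly between 0 and 1, which is exactly what the link function buys us.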
What are the types of logistic regression? (based on types of outcomes)
-Binary logistic (2 categories)
-Multinomial (3 or more categories)
Binary logistic regression
Predictors are categorical or continuous; the outcome has 2 categories
When do we use a logistic regression?
-Predictors can be categorical, continuous, or nominal (binary logistic or multinomial outcomes).
-The DIFFERENCE from linear regression is that the outcome is categorical/nominal.
-Can look at interactions the same way you would in a linear regression, by computing product terms.
Why do we use logistic regression?
1. Prediction is going to be based on likelihood, not levels.
2. Assumptions of ordinary least squares regression are violated by having a dichotomous variable.
3. Thus, we need to expand the General Linear Model (which can only be used for continuous outcomes) and make it even more general and call it the Generalized Linear Model (which can be used for categorical, ordinal, and very skewed or weird distributions).
What is the non-parametric equivalent of a t-test?
*See notes
What are the major violations of OLS assumptions that call for use of logistic regression?
-Measurement assumption
-Linearity assumption
-Heteroscedasticity and non-normal residuals
-The scale of the predicted score of the model needs to match the scale of the observed outcome
Major violation of OLS measurement assumption
Outcome is not on interval or ratio scale
Major violation of OLS linear assumption
OLS assumes the relationship between the predictor and the predicted values is linear; with a dichotomous outcome the relationship is non-linear (S-shaped), so the assumption is violated.
Major violation of OLS heteroscedasticity and non-normal residuals assumption
With a dichotomous outcome, the variance of the residuals is not constant across levels of the predictor, and the residuals cannot be normally distributed.
Major violation of OLS assumption that scale of predicted score of the model matches the scale of the observed model
Predicted values that we get from logistic regression are likelihood/probability. When actual observed scores are either 0 or 1, assumption is violated.
Logistic Regression Curve
-Variance in the residuals is not consistent across the predictor. At extreme levels, the violation is bigger; that represents heteroscedasticity (violation of homoscedasticity).
-The distance between predicted and observed values is not constant.
Logit(Y) = β0 + β1X1 + ε <--what is being predicted?
Predicting the likelihood that the outcome is either 0 or 1. Very similar to the GLM and linear regression equation, BUT the outcome being predicted is Logit(Y).
Range that probability can take?
Expressed on a scale from 0 to 1:
-0 = definitely not going to happen
-1 = definitely will happen
What does P(Y=1) refer to?
Probability that the event will happen (aka the probability that Y will be classified as 1 on the DV)
Express P(Y=0) in terms of P(Y=1)
P(Y=0) = 1 - P(Y=1)
Why don't we use P(Y=1) = β0 + β1X1 + ε equation for logistic regression?
Problems with this equation stem from the mathematical characteristics of probability: observed and predicted values are restricted to between 0 and 1, but the right-hand side is unbounded, so predicted values may fall below 0 (i.e., a negative probability).
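A quick numeric sketch (with made-up coefficients) of how a straight-line equation for P(Y=1) escapes the 0-to-1 range:

```python
# Hypothetical linear probability model: P(Y=1) = b0 + b1*X.
# The coefficients are invented for illustration only.
def p_hat(x, b0=0.1, b1=0.2):
    return b0 + b1 * x

for x in (-2, 2, 6):
    print(x, p_hat(x))
# p_hat(-2) is about -0.3 and p_hat(6) is about 1.3:
# both are impossible as probabilities, which is the problem.
```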
What are the Odds?
Probability that Y = 1 relative to probability of Y =/= 1
P(Y=1)/(1- P(Y =1))
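The definition above is easy to compute directly; a minimal Python sketch:

```python
def odds(p):
    """Odds that Y = 1: P(Y=1) / (1 - P(Y=1))."""
    return p / (1 - p)

print(odds(0.5))            # 1.0  -> equally likely to be a 1 or a 0
print(round(odds(0.8), 2))  # 4.0  -> 4-to-1 in favor of Y = 1
print(odds(0.2))            # 0.25 -> less likely to be a 1 than a 0
```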
What does it mean if Odds are less than 1?
You're less likely to be categorized as a 1 than 0 on the DV
What is range of Odds?
Ranges from 0 --> 1 --> infinity.
What happens as the difference between P(Y=1) and P(Y≠1) gets bigger?

The bigger the difference between the two, the closer the odds get to infinity or 0!
What does it mean if Odds = 1, is > 1, or is <1, respectively?
Odds = 1: equally likely to be in either category. Odds > 1: more likely to be in the category coded as 1. Odds < 1: more likely to be in the category coded as 0 than 1.
Why are odds better than Probability? Why are they still nonoptimal?
Better than probability because odds have no upper limit! Nonoptimal because of the lower limit of 0: a linear model could still produce predicted values below 0, which are impossible for odds, and the predicted values are still not on the same scale as the observed 0/1 outcomes.
Logit
Logit = natural logarithm of the odds.
Logit(Y) = β0 + β1X1 + ε <--logistic regression equation
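Putting the last card into code: the logit is just the natural log of the odds, and unlike probability or odds it is unbounded in both directions, so it can sit on the left side of a linear equation. A small sketch:

```python
import math

def logit(p):
    """Natural logarithm of the odds of p."""
    return math.log(p / (1 - p))

print(logit(0.5))  # 0.0 -> odds of 1 give a logit of 0
print(logit(0.9))  # positive: p > .5
print(logit(0.1))  # negative: p < .5
```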