39 Cards in this Set

  • Front
  • Back
  • 3rd side (hint)

What are the two modeling choices for a GLM?

The distribution and the link function

Any distribution belonging to what family can be chosen?

The linear exponential family

The mean u for any distribution belonging to the linear exponential family is embedded in its what?

Probability function

What function establishes the relationship between u and the predictors (the function g where g(u) = x transpose times beta)?

Link function

What combination of distribution and link gives us multiple linear regression (MLR)?

Normal distribution + identity link
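
A short derivation may help show why this pairing reproduces MLR. It assumes independent observations with a constant variance; the symbols x_i, beta, and sigma^2 are notation introduced here for illustration, with the identity link giving a mean of x_i transpose times beta:

```latex
\ell(\beta) = -\frac{n}{2}\ln\left(2\pi\sigma^{2}\right)
              - \frac{1}{2\sigma^{2}}\sum_{i=1}^{n}\left(y_i - x_i^{\top}\beta\right)^{2}
```

Only the second term depends on beta, so maximizing the log-likelihood over beta is the same as minimizing the residual sum of squares, which is the least-squares criterion of MLR.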

MLR assumes that the target follows what distribution?

The Normal distribution

How does a GLM relax some of MLR’s assumptions?

The target may follow any member of the linear exponential family, not only the normal distribution

5 example distributions that belong to the linear exponential family

Binomial (fixed trials), Gaussian, Gamma, Inverse Gaussian, Poisson

The distribution we choose relates to what?

The random component

The link function we choose relates to what?

The systematic component

If the target is continuous, it should be paired with one of what 3 distributions?

Normal, gamma, or inverse Gaussian

A binary target should be paired with what distribution?

Bernoulli

A Bernoulli outcome is either a success or a failure, the same two options a binary target has

A count target should be paired with what distribution?

Poisson

What is a link function?

It links the mean target with predictors

Identity link g(u)

u

Log link g(u)

ln(u)

Logit link g(u)

ln(u / (1 - u))

The identity link is best for what values of u?

Real-valued u

The log link is best for what values of u?

Positive valued u

The logit link is best for what values of u?

Between 0 and 1
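
The three links on the cards above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the domain checks simply mirror the ranges of u that the cards give for each link.

```python
import math

def identity_link(u):
    # g(u) = u, defined for any real-valued mean
    return u

def log_link(u):
    # g(u) = ln(u), defined only for a positive mean
    if u <= 0:
        raise ValueError("log link needs u > 0")
    return math.log(u)

def logit_link(u):
    # g(u) = ln(u / (1 - u)), defined only for a mean strictly between 0 and 1
    if not 0 < u < 1:
        raise ValueError("logit link needs 0 < u < 1")
    return math.log(u / (1 - u))

def inverse_log(eta):
    # Maps the linear predictor back to a positive mean
    return math.exp(eta)

def inverse_logit(eta):
    # Maps the linear predictor back to a mean between 0 and 1
    return 1 / (1 + math.exp(-eta))
```

Each inverse takes a real-valued linear predictor and returns a mean in the range its link expects, which is why the log link suits positive means and the logit link suits probabilities.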

How are the betas found for GLM?

Maximum likelihood estimation (MLE) finds the betas that maximize the log-likelihood

The first important GLM metric

Maximized log-likelihoods

What is a maximized log-likelihood?

The log-likelihood evaluated at the maximum likelihood estimates

What do we call the maximized log-likelihood of the model with the fewest betas?

l-null

What do we call the maximized log-likelihood of the model with the most betas?

l-sat

l-sat has the most what?

Flexibility
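
As a rough illustration of l-null and l-sat, the sketch below evaluates both for a Poisson target. It assumes the null model is intercept-only (so every fitted mean is the sample mean) and the saturated model fits each observation exactly; the counts and the function name are made up for this example.

```python
import math

def poisson_loglik(y, mu):
    # Sum of y*ln(mu) - mu - ln(y!) over all observations
    total = 0.0
    for yi, mi in zip(y, mu):
        # y*ln(mu) is taken as 0 when y == 0, which also covers mu == 0
        # in the saturated model
        log_term = yi * math.log(mi) if yi > 0 else 0.0
        total += log_term - mi - math.lgamma(yi + 1)
    return total

y = [0, 1, 3, 2, 4]  # made-up count data

# Null model: one beta (the intercept), so every fitted mean is the sample mean
mean_y = sum(y) / len(y)
l_null = poisson_loglik(y, [mean_y] * len(y))

# Saturated model: one beta per observation, so each fitted mean equals y itself
l_sat = poisson_loglik(y, [float(v) for v in y])

print(l_null, l_sat)
```

Any model fitted to the same data has a maximized log-likelihood between these two values, with l-sat as the upper bound.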

What does the maximized log-likelihood measure?

The likelihood that the model and its estimated parameters match the data

What is another interpretation of what the maximized log-likelihood measures?

How close the predicted target is to the target

So do we want a higher or lower maximized log-likelihood?

Higher

But we don’t want the maximized log-likelihood to be too high, in order to avoid what?

Overfitting

What is a second metric for GLM?

Deviance

Generally, does a good fit have a large or small deviance?

Small

What statistic sums the squared residuals, each scaled by the estimated variance of the target for observation i?

Pearson chi-square statistic

Do we prefer higher or lower values for deviance?

Lower
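
The cards do not give formulas for these two metrics, so the sketch below uses the standard Poisson forms as an assumption: deviance as twice the gap between l-sat and the model's log-likelihood, and the Pearson chi-square statistic as squared residuals divided by the estimated variance (which equals the mean for a Poisson target). The data and the "better" fitted means are hypothetical values chosen only to sit closer to y; they are not maximum likelihood estimates.

```python
import math

def poisson_deviance(y, mu):
    # 2 * (l_sat - l_model) written out for the Poisson distribution
    total = 0.0
    for yi, mi in zip(y, mu):
        # y*ln(y/mu) is taken as 0 when y == 0
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2 * total

def pearson_chi_square(y, mu):
    # Squared residuals scaled by the Poisson variance, which equals mu
    return sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))

y = [0, 1, 3, 2, 4]                        # made-up count data
mu_null = [sum(y) / len(y)] * len(y)       # intercept-only fitted means
mu_better = [0.5, 1.2, 2.6, 2.2, 3.5]      # hypothetical fitted means closer to y

print(poisson_deviance(y, mu_null), poisson_deviance(y, mu_better))
print(pearson_chi_square(y, mu_null), pearson_chi_square(y, mu_better))
```

Both statistics shrink as the fitted means move closer to the observed targets, which is why lower values indicate a better fit to the data.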

As p (the number of predictors) increases, what happens to the training RMSE?

Decreases

As p increases, what happens to the training deviance?

Decreases

What happens to the Pearson chi-square statistic as p increases?

Decreases

What happens to the maximized log likelihood as p increases?

Increases
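
The pattern in the last four cards can be checked on a toy example. The sketch below assumes a normal target with the identity link, compares an intercept-only fit (p = 0) with a one-predictor least-squares fit (p = 1), and computes the training RMSE with a divisor of n. The data are made up for illustration.

```python
import math

def rmse(y, fitted):
    # Root mean squared error on the training data, dividing by n
    return math.sqrt(sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / len(y))

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up predictor
y = [1.2, 1.9, 3.2, 3.8, 5.1]   # made-up target

x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# p = 0: intercept only, so every fitted value is the sample mean
fitted_p0 = [y_bar] * len(y)

# p = 1: least-squares slope and intercept (closed form for one predictor)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = y_bar - slope * x_bar
fitted_p1 = [intercept + slope * xi for xi in x]

rmse_p0 = rmse(y, fitted_p0)
rmse_p1 = rmse(y, fitted_p1)
print(rmse_p0, rmse_p1)
```

Because the p = 0 model is a special case of the p = 1 model (slope fixed at zero), the larger model can never fit the training data worse. This says nothing about test performance, which is where the overfitting concern comes in.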

To test the significance of a beta parameter in a GLM, we perform one of what two tests?

A z or t test
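
A minimal sketch of the z test, assuming the usual null hypothesis that the beta equals zero and a test statistic equal to the estimate divided by its standard error. The function name and the input numbers are hypothetical; in practice both values come from the fitted model's output.

```python
import math

def z_test(beta_hat, std_error):
    # Test statistic for H0: beta = 0
    z = beta_hat / std_error
    # Two-sided p-value under the standard normal: P(|Z| > |z|)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical estimate and standard error
z, p = z_test(0.8, 0.25)
print(z, p)
```

A small p-value is evidence against the beta being zero. The t test uses the same ratio but compares it with a t distribution instead of the standard normal.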