13 Cards in this Set

Problems with one-way analysis
1. Potentially distorted by correlations among rating variables.
2. Does not consider interdependencies between rating variables in the way they impact what is being modeled.
Problems with classical linear models
1. It is difficult to assert normality and constant variance for response variables
2. The values of the dependent variable (the Y variable) may be restricted to positive values, but the assumption of normality violates the restriction.
3. If Y is always positive, then intuitively as the mean of Y moves toward zero, the variance of Y also moves toward zero, so the variance is related to the mean.
4. The linear model only allows for additive relationships between predictor variables, but those might be inadequate to describe the response variable.
Benefits of GLMs
1. The statistical framework allows for explicit assumptions about the nature of the data and its relationship with predictive variables.
2. The method of solving GLMs is more technically efficient than iterative methods such as minimum bias procedures.
3. GLMs provide statistical diagnostics which aid in selecting only significant variables and validating model assumptions.
4. Adjusts for correlations between variables and allows for interaction effects.
Steps to solving a classical linear model
1. Set up general equation in terms of Y, betas, and X's.
2. Write down an equation for each observation by replacing X's and Y with observed values in data. You will have the same number of equations as observations in the data. For observation i, the equation may contain some beta values and will contain epsilon_i.
3. Solve each equation for the epsilon_i.
4. Calculate the equation for the Sum of Squared Errors (SSE) by plugging in the epsilon_i formulas: SSE = sum(epsilon_i^2)
5. Minimize SSE by taking derivatives with respect to each beta and setting them equal to zero.
6. Solve the system of equations for the beta values.
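The steps above can be sketched in a few lines of Python. Minimizing SSE by setting the derivative with respect to each beta to zero leads to the normal equations (X'X)beta = X'Y; the data values below are illustrative, not from the source.

```python
import numpy as np

# Fit Y = beta_0 + beta_1 * X + epsilon by minimizing SSE (illustrative data).
X_raw = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([2.1, 3.9, 6.2, 8.1])

# Design matrix with an intercept column for the beta_0 term.
X = np.column_stack([np.ones_like(X_raw), X_raw])

# Steps 5-6: the derivative conditions reduce to the normal equations
# (X'X) beta = X'Y; solve that system for the beta values.
beta = np.linalg.solve(X.T @ X, X.T @ Y)

# Step 4: SSE is the sum of the squared residuals epsilon_i^2.
residuals = Y - X @ beta
sse = float(residuals @ residuals)
```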
Components of classical linear models
1. Systematic - The p covariates are combined to give the "linear predictor" eta, where eta = sum(beta_i*X_i)
2. Random - The error term epsilon is normally distributed with mean zero and variance sigma^2. Var(Y_i) = sigma^2.
3. Link function - Equal to the identity function.
Components of a Generalized Linear Model
1. Systematic - the p covariates are combined to give the linear predictor eta, where eta = sum (beta_i * X_i)
2. Random - Each Y_i is independent and from the exponential family of distributions. Var(Y_i) = phi*V(mu_i)/w_i, where phi is the scale parameter.
3. Link Function - Must be differentiable and monotonic.
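The three components above come together when a GLM is fit. As a rough sketch (not the source's method), the iteratively reweighted least squares (IRLS) algorithm below fits a Poisson GLM with a log link: eta = X @ beta is the linear predictor, mu = exp(eta) inverts the link, and V(mu) = mu is the Poisson variance function. The data are illustrative.

```python
import numpy as np

def fit_poisson_log(X, y, n_iter=50):
    """IRLS sketch for a Poisson GLM with log link."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta          # systematic component: linear predictor
        mu = np.exp(eta)        # inverse of the log link
        # Working response and weights for the log link (d eta/d mu = 1/mu);
        # weight = 1 / (V(mu) * (d eta/d mu)^2) = mu for Poisson + log link.
        z = eta + (y - mu) / mu
        W = mu
        XtW = X.T * W
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta

# Toy data generated by y = exp(log(2) * x), so beta should
# converge to intercept 0 and slope log(2).
X = np.column_stack([np.ones(4), np.array([0.0, 1.0, 2.0, 3.0])])
y = np.array([1.0, 2.0, 4.0, 8.0])
beta = fit_poisson_log(X, y)
```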
Common exponential family distribution variance functions
Normal - V(x) = 1
Poisson - V(x) = x
Gamma - V(x) = x^2
Binomial - V(x) = x*(1-x) for # trials = 1
Inverse Gaussian - V(x) = x^3
Tweedie - V(x) = x^p where p < 0 or 1 < p < 2 or p > 2
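The variance functions above can be written out as plain Python callables and compared at a common mean. This is a quick sketch; the names and the evaluation point are illustrative (the mean is kept in (0, 1) so the binomial case is valid).

```python
# Variance functions of the common exponential family distributions.
variance_functions = {
    "normal": lambda x: 1.0,
    "poisson": lambda x: x,
    "gamma": lambda x: x ** 2,
    "binomial": lambda x: x * (1.0 - x),   # number of trials = 1
    "inverse_gaussian": lambda x: x ** 3,
    "tweedie": lambda x: x ** 1.5,         # p = 1.5, between Poisson and Gamma
}

mu = 0.25  # in (0, 1) so the binomial variance function applies
values = {name: fn(mu) for name, fn in variance_functions.items()}
```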
Methods of estimating the scale parameter
1. Maximum likelihood (not feasible in practice)
2. Moment estimator (Pearson chi-square statistic):
phi = 1/(n-p) * sum (w_i * (Y_i - mu_i)^2/V(mu_i))
3. Total deviance estimator
phi = D/(n-p)
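The moment (Pearson chi-square) estimator above translates directly into code. A minimal sketch, with V the variance function of the assumed distribution, p the number of fitted parameters, and toy gamma-model numbers that are illustrative only:

```python
def pearson_scale(y, mu, w, V, p):
    """Moment estimator of phi: Pearson chi-square statistic over (n - p)."""
    n = len(y)
    chi_sq = sum(w_i * (y_i - mu_i) ** 2 / V(mu_i)
                 for y_i, mu_i, w_i in zip(y, mu, w))
    return chi_sq / (n - p)

# Toy gamma example: V(mu) = mu^2, unit weights, one fitted parameter.
y = [1.0, 2.0, 3.0, 4.0]
mu = [1.5, 1.5, 3.5, 3.5]
phi = pearson_scale(y, mu, [1.0] * 4, lambda m: m ** 2, p=1)
```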
Common link functions (link and its inverse)
Identity - x and x
Log - ln(x) and exp(x)
Logit - ln(x/(1-x)) and exp(x)/(1+exp(x))
Reciprocal - 1/x and 1/x
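Each link/inverse pair above should undo the other. The sketch below writes the pairs as Python functions and checks the round trip at a single point chosen inside (0, 1) so the logit is defined:

```python
import math

# (link g, inverse g^-1) pairs for the common link functions.
links = {
    "identity": (lambda x: x, lambda e: e),
    "log": (lambda x: math.log(x), lambda e: math.exp(e)),
    "logit": (lambda x: math.log(x / (1 - x)),
              lambda e: math.exp(e) / (1 + math.exp(e))),
    "reciprocal": (lambda x: 1 / x, lambda e: 1 / e),
}

x = 0.3  # inside (0, 1) so ln(x/(1-x)) is defined
round_trips = {name: inv(g(x)) for name, (g, inv) in links.items()}
```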
Common model forms for insurance data
1. Claim frequencies / counts - multiplicative Poisson (log link function, Poisson error term)
2. Claim severity - multiplicative Gamma (log link function, Gamma error term)
3. Pure premium - Tweedie (compound of Poisson and Gamma)
4. Probability (e.g., policyholder retention) - logistic (logit link function, binomial error term)
Aliasing and near-aliasing
Aliasing is when there is a linear dependency among the covariates in the model. Types of aliasing:
1. Intrinsic aliasing - when the linear dependency occurs by definition of the covariates.
2. Extrinsic aliasing - when the linear dependency occurs due to the nature of the data.
3. Near-aliasing - when covariates are nearly linearly dependent, but not perfectly linearly dependent.
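Intrinsic aliasing can be shown with a small rank check. In the sketch below (illustrative data), an intercept plus a full set of dummy columns for one factor is linearly dependent, because the dummies sum to the intercept column; dropping a base level restores full column rank:

```python
import numpy as np

levels = ["A", "B", "C"]
obs = ["A", "B", "C", "A", "B"]

intercept = np.ones((len(obs), 1))
dummies = np.array([[1.0 if o == lv else 0.0 for lv in levels] for o in obs])

# Intercept + all three dummies: the dummy columns sum to the intercept,
# so one column is intrinsically aliased and the rank falls short.
X_full = np.hstack([intercept, dummies])         # 4 columns
X_base = np.hstack([intercept, dummies[:, 1:]])  # base level dropped: 3 columns

rank_full = int(np.linalg.matrix_rank(X_full))
rank_base = int(np.linalg.matrix_rank(X_base))
```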
Ways to decide whether to include a factor in the model
1. Size of confidence intervals (usually viewed graphically in practice).
2. Type III testing
3. See if the parameter estimate is consistent over time.
4. It is intuitive that the factor should impact the result.
Type III test statistics
1. Chi-square test statistic = D*_1 - D*_2, the difference in scaled deviances between the two nested models.
2. F test statistic = ((D_1 - D_2)/(df_1 - df_2)) / (D_2/df_2) ~ F with (df_1 - df_2, df_2) degrees of freedom.
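The F statistic for comparing a reduced model (1) against a full model (2) is simple to compute from the two unscaled deviances and their degrees of freedom. A minimal sketch with illustrative numbers (not from a real fit):

```python
def type_iii_f(d1, df1, d2, df2):
    """F statistic: ((D1 - D2)/(df1 - df2)) / (D2/df2)."""
    return ((d1 - d2) / (df1 - df2)) / (d2 / df2)

# Reduced model: deviance 120.0 on 50 df; full model: 100.0 on 48 df.
f_stat = type_iii_f(d1=120.0, df1=50, d2=100.0, df2=48)
```

The statistic is compared against an F distribution with (df1 - df2, df2) degrees of freedom; a large value favors keeping the extra factor in the model.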