16 Cards in this Set
- Front
- Back
What is the general formula for a regression model? |
Y = f(X) + ε, where ε is a random error term |
|
How do we predict y for a given X = x in regression? |
f(x) = E(Y | X = x): the expected value (average) of Y over all observations where X = x |
|
What do you do when you want to predict y for a given x but the dataset has no observations at exactly X = x? |
Use nearest-neighbor averaging (KNN): f̂(x) = Ave(Y | X ∈ N(x)), where N(x) is a neighborhood of x |
|
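The nearest-neighbor averaging idea can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `knn_regress`, the one-dimensional inputs, and the toy data are all assumptions made for the example.

```python
import numpy as np

def knn_regress(x_train, y_train, x0, k=3):
    """Estimate f(x0) by averaging y over the k training points nearest to x0."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    # Distance from x0 to every training point (1-D inputs assumed here).
    dist = np.abs(x_train - x0)
    # Indices of the k closest points: this plays the role of N(x0).
    nearest = np.argsort(dist)[:k]
    return float(y_train[nearest].mean())

# Hypothetical toy data: no observation sits exactly at x = 2.5.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1]
estimate = knn_regress(x, y, x0=2.5, k=2)
print(estimate)  # average of the y values at x = 2 and x = 3
```

With k = 2 the neighborhood of 2.5 contains the points at x = 2 and x = 3, so the estimate is the mean of their y values.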
What is the limitation of KNN? |
Works well only for small dimension (p ≤ 4) and large N |
|
What is the curse of dimensionality? |
Nearest neighbors tend to be far away in high dimensions, so it is hard to find enough neighbors while staying local |
|
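One way to see why neighborhoods stop being local: assume the points are spread uniformly over a unit hypercube in p dimensions (an assumption made only for this sketch). A sub-cube that captures a fraction r of the points then needs side length r^(1/p), which approaches 1 as p grows. The helper name `edge_length` is hypothetical.

```python
def edge_length(fraction, p):
    """Side of a sub-cube holding `fraction` of points that are
    uniformly spread over the unit hypercube in p dimensions."""
    return fraction ** (1.0 / p)

# Side length needed to capture 10% of the data as the dimension grows.
for p in (1, 2, 4, 10, 100):
    print(p, round(edge_length(0.10, p), 3))
```

In 10 dimensions the sub-cube already spans roughly 80% of each axis, so the "nearest" 10% of points are not near at all.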
Why do we not calculate MSE with training data? |
Training MSE is biased downward: an overfitted model can have a very low training MSE yet predict poorly on new data. Use test data instead |
|
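A small simulation can make the gap between training and test MSE visible. Everything here is an assumption for illustration: the sine-shaped true function, the noise level, the sample sizes, and the use of polynomial degree as the measure of flexibility.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical truth: y = sin(3x) plus Gaussian noise.
    x = rng.uniform(-1, 1, n)
    y = np.sin(3 * x) + rng.normal(0, 0.3, n)
    return x, y

x_tr, y_tr = make_data(30)    # small training set
x_te, y_te = make_data(1000)  # large held-out test set

def mse(coefs, x, y):
    return float(np.mean((y - np.polyval(coefs, x)) ** 2))

results = {}
for degree in (1, 3, 10):
    # Least-squares polynomial fit on the training data only.
    coefs = np.polyfit(x_tr, y_tr, degree)
    results[degree] = (mse(coefs, x_tr, y_tr), mse(coefs, x_te, y_te))
    print(degree, results[degree])  # (training MSE, test MSE)
```

Training MSE can only go down as the degree increases, because each polynomial family contains the previous one. Test MSE carries no such guarantee, which is why it is the number to compare models on.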
What is the bias-variance trade-off? |
When choosing a model, as flexibility increases, variance increases and bias decreases, so there is a trade-off between the two |
|
What is the expectation average of variability and bias |
Back (Definition) |
|
What is the Bayes optimal classifier? |
Back (Definition) |
|
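How do we measure performance in classification? |
Err_Te = Ave_{i ∈ Te} I[y_i ≠ C(x_i)]
where Err_Te is the misclassification error rate on the test set Te, I[·] is 1 when its argument is true and 0 otherwise, and C(x_i) is the predicted class for x_i |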
|
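The test error rate is the fraction of test observations whose predicted label differs from the true label. A minimal sketch, with a hypothetical function name and made-up labels:

```python
import numpy as np

def misclassification_rate(y_true, y_pred):
    """Average of the indicator I[y_i != C(x_i)] over the test set."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true != y_pred))

# Hypothetical test labels and classifier predictions.
y_test = ["a", "b", "a", "a"]
y_hat = ["a", "a", "a", "b"]
print(misclassification_rate(y_test, y_hat))  # 2 of the 4 labels differ
```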
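Which classification technique has the lowest error? |
The Bayes classifier: it has the lowest possible expected test error, but it requires the true conditional class probabilities, which are unknown in practice |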
|
What does linear regression assume? |
That the dependence of Y on X1, X2, ..., Xp is linear. This is almost never exactly true in practice |
|
Formula for linear regression |
Back (Definition) |
|
How do you train a linear regression model? |
The model needs to estimate two parameters (coefficients): the intercept β0 and the slope β1. Train using least squares: choose the estimates β̂0 and β̂1 that minimize the residual sum of squares RSS = e_1² + e_2² + ... + e_n², where e_i = y_i − ŷ_i = y_i − β̂0 − β̂1·x_i
|
|
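For a single predictor, the least squares estimates have a closed form: the slope is the sum of (x_i − x̄)(y_i − ȳ) divided by the sum of (x_i − x̄)², and the intercept is ȳ minus the slope times x̄. The sketch below applies those formulas directly; the function name `fit_simple_ols` and the toy data are assumptions for the example.

```python
import numpy as np

def fit_simple_ols(x, y):
    """Return (intercept, slope, RSS) for the least squares line."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    # Closed-form minimizers of the residual sum of squares.
    beta1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    beta0 = y_bar - beta1 * x_bar
    # Residuals e_i = y_i - beta0 - beta1 * x_i, then RSS = sum of e_i^2.
    residuals = y - (beta0 + beta1 * x)
    rss = float(np.sum(residuals ** 2))
    return float(beta0), float(beta1), rss

# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
beta0_hat, beta1_hat, rss = fit_simple_ols(x, y)
print(beta0_hat, beta1_hat, rss)
```

Because the toy points lie exactly on a line, the fit recovers that line and the RSS is zero up to floating-point rounding.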
How do you estimate the accuracy of coefficient estimates in linear regression? |
Back (Definition) |
|
How is a confidence interval interpreted? |
A 95% confidence interval means that if we repeated the sampling and fitting procedure many times (say 100) and computed an interval for β each time, about 95 of those intervals would contain the true value of β |
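The repeated-sampling interpretation can be checked by simulation. This sketch rests on several assumptions that are not in the cards: a made-up true line, Gaussian noise, the standard error formula SE(β̂1)² = σ̂² / Σ(x_i − x̄)² with σ̂² = RSS / (n − 2), and the approximate 95% interval β̂1 ± 2·SE(β̂1).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "truth", known only because this is a simulation.
true_beta0, true_beta1, sigma = 2.0, 3.0, 1.0
n, n_repeats = 50, 1000
x = np.linspace(0, 1, n)
sxx = np.sum((x - x.mean()) ** 2)

covered = 0
for _ in range(n_repeats):
    # Draw a fresh dataset from the same true line and refit.
    y = true_beta0 + true_beta1 * x + rng.normal(0, sigma, n)
    beta1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    beta0 = y.mean() - beta1 * x.mean()
    rss = np.sum((y - beta0 - beta1 * x) ** 2)
    # Standard error of the slope, using the estimated noise variance.
    se_beta1 = np.sqrt(rss / (n - 2) / sxx)
    low, high = beta1 - 2 * se_beta1, beta1 + 2 * se_beta1
    covered += int(low <= true_beta1 <= high)

coverage = covered / n_repeats
print(coverage)  # fraction of intervals that contain the true slope
```

The printed fraction should land close to 0.95, matching the "about 95 out of 100 intervals" reading of the card.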