
28 Cards in this Set

  • Front
  • Back

RANSAC

RANdom SAmple Consensus:

1. Sample two points to estimate a line (in general, the minimum number of points needed for the model)

2. Calculate the number of inliers (or the posterior likelihood) for that relation

3. Choose the relation that maximizes the number of inliers

In LMedS (Least Median of Squares), steps 2-3 are replaced by:

4. Calculate the error of all data points

5. Choose the relation that minimizes the median of the errors
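A minimal NumPy sketch of the line-fitting loop above; the iteration count, the inlier threshold `thresh`, and the toy data are illustrative assumptions, not values from the card:

```python
import numpy as np

def ransac_line(x, y, n_iters=100, thresh=0.1, rng=np.random.default_rng(0)):
    """Fit y = a*x + b with RANSAC: sample 2 points, count inliers, keep the best."""
    best_inliers, best_model = -1, None
    for _ in range(n_iters):
        i, j = rng.choice(len(x), size=2, replace=False)  # 1. sample two points
        if x[i] == x[j]:
            continue
        a = (y[j] - y[i]) / (x[j] - x[i])                 # slope of the candidate line
        b = y[i] - a * x[i]                               # intercept
        residuals = np.abs(y - (a * x + b))               # 2. error of all data points
        n_inliers = np.sum(residuals < thresh)            #    count inliers
        if n_inliers > best_inliers:                      # 3. keep the model with most inliers
            best_inliers, best_model = n_inliers, (a, b)
    return best_model

# toy data: a line plus a few gross outliers
x = np.linspace(0, 1, 30)
y = 2 * x + 1
y[:5] += 5.0                                              # outliers
print(ransac_line(x, y))                                  # ~ (2.0, 1.0)
```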

Robust regression

Gives a good model estimate even when there are many outliers

LMS

Sample the minimum number of matches to estimate a model, calculate the error (residual) for all data, and choose the sample that minimizes the median of the residuals
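A matching LMedS sketch under the same toy-line assumptions as the RANSAC example; note that no inlier threshold is needed, since the sample with the smallest median residual wins:

```python
import numpy as np

def lmeds_line(x, y, n_iters=100, rng=np.random.default_rng(0)):
    """Least Median of Squares: keep the 2-point model with the smallest median residual."""
    best_med, best_model = np.inf, None
    for _ in range(n_iters):
        i, j = rng.choice(len(x), size=2, replace=False)  # sample the minimum number of points
        if x[i] == x[j]:
            continue
        a = (y[j] - y[i]) / (x[j] - x[i])
        b = y[i] - a * x[i]
        med = np.median((y - (a * x + b)) ** 2)           # median of squared residuals
        if med < best_med:
            best_med, best_model = med, (a, b)
    return best_model
```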

Problems with regression methods

RANSAC: need to choose a good inlier threshold

LMS: no good solution if outliers exceed 50%

k-NN: performs worse in higher dimensions

k-NN regression

Similar to the k-NN classifier: choose the k closest points and take the average of their responses.

Larger k = lower variance
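A minimal 1-D k-NN regression sketch of the idea above (plain NumPy; k and the toy data are illustrative):

```python
import numpy as np

def knn_regress(x_train, y_train, x_query, k=3):
    """Predict the average response of the k nearest training points."""
    dists = np.abs(x_train - x_query)        # distances to all training points
    nearest = np.argsort(dists)[:k]          # indices of the k closest
    return y_train[nearest].mean()           # average their responses

x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = np.array([0.1, 0.9, 2.1, 2.9, 4.2])
print(knn_regress(x_train, y_train, x_query=2.4, k=3))   # ~1.97 (mean of y at x = 1, 2, 3)
```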

Ridge regression

Sacrifices bias to decrease variance in least-squares regression; shrinks the coefficients toward zero
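A closed-form ridge sketch, assuming the usual penalized least-squares objective ||y − Xw||² + λ||w||²; the toy data and λ are illustrative:

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X^T X + lam*I)^(-1) X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.array([1.0, 0.0, 2.0, 0.0, -1.0]) + 0.1 * rng.normal(size=50)
print(ridge_fit(X, y, lam=1.0))   # coefficients shrunk toward zero vs. plain least squares
```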

Lasso

Least Absolute Shrinkage and Selection Operator. Like ridge regression, but can set coefficients exactly to zero (i.e. it performs feature selection)
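A short scikit-learn comparison of the two penalties, illustrating that the lasso can drive coefficients exactly to zero while ridge only shrinks them (the alpha values and toy data are illustrative):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, 0.0, 2.0, 0.0, -1.0]) + 0.1 * rng.normal(size=100)

print(Ridge(alpha=1.0).fit(X, y).coef_)   # all coefficients shrunk, none exactly zero
print(Lasso(alpha=0.1).fit(X, y).coef_)   # irrelevant coefficients driven exactly to zero
```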

Probability Based Methods: Pros/Cons

Pros: all aspects of learning, modelling and inference can be cast under the same theory

Cons: hard to derive closed-form solutions (approximations are needed); inefficient for large datasets

Inference

Given training data D, estimate the posterior probability of the answer y:

P(y | x, D)

Inference parametric/non-parametric

Parametric: estimate an optimal parameter θ^ from the data and use it to compute the posterior

Non-parametric: estimate the posterior by marginalizing out the parameter θ (use the data directly instead)
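As formulas, a sketch of the two cases described on the card (using a MAP point estimate for the parametric case as an assumption):

```latex
% Parametric: plug in a point estimate of the parameters
P(y \mid x, D) \approx P(y \mid x, \hat{\theta}), \qquad
\hat{\theta} = \arg\max_{\theta} P(D \mid \theta)\, P(\theta)

% Non-parametric: marginalize out the parameters
P(y \mid x, D) = \int P(y \mid x, \theta)\, P(\theta \mid D)\, d\theta
```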

Occam's razor

Choose the simplest explanation for the observed data

Naive Bayes: what if none of the instances with target value y have attribute value xi?

Add pseudocounts (a form of regularization, or smoothing)
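A minimal add-α (Laplace) smoothing sketch for the conditional estimates; the pseudocount α = 1 and the toy counts are illustrative:

```python
import numpy as np

def smoothed_conditional(counts, alpha=1.0):
    """P(x_i = v | y): add a pseudocount alpha so unseen values never get probability 0."""
    counts = np.asarray(counts, dtype=float)
    return (counts + alpha) / (counts.sum() + alpha * len(counts))

# attribute value counts for one class; the last value was never observed with this class
print(smoothed_conditional([3, 5, 0]))   # -> [0.364, 0.545, 0.091], no zero probability
```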

Perceptron learning

Incremental learning; the weights change only when the output is wrong.

wi ← wi + η(t − o)xi

Always converges if the problem is solvable (linearly separable)
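A minimal sketch of the perceptron rule above, with η as the learning rate and targets/outputs in {−1, +1}; the AND-style toy data is an illustrative assumption:

```python
import numpy as np

def perceptron_train(X, t, eta=0.1, epochs=20):
    """Perceptron rule: w_i <- w_i + eta * (t - o) * x_i; weights change only on mistakes."""
    X = np.hstack([X, np.ones((len(X), 1))])   # append a constant 1 for the bias weight
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_n, t_n in zip(X, t):
            o = 1.0 if w @ x_n > 0 else -1.0   # thresholded output
            w += eta * (t_n - o) * x_n         # no change when o == t_n
    return w

# linearly separable toy problem (logical AND with +/-1 targets)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([-1, -1, -1, 1], dtype=float)
print(perceptron_train(X, t))
```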

Delta rule

Incremental learning; the weights always change.

wi ← wi + η(t − wᵀx)xi

Converges in the mean

Finds the optimal (minimum-error) solution even if the problem cannot be solved exactly
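The corresponding delta-rule sketch under the same toy-data assumption as the perceptron example; the update uses the raw linear output wᵀx, so the weights move on every presentation:

```python
import numpy as np

def delta_rule_train(X, t, eta=0.05, epochs=200):
    """Delta rule: w_i <- w_i + eta * (t - w^T x) * x_i; reduces squared error in the mean."""
    X = np.hstack([X, np.ones((len(X), 1))])   # bias term
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_n, t_n in zip(X, t):
            w += eta * (t_n - w @ x_n) * x_n   # weights change even when the sign is right
    return w

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([-1, -1, -1, 1], dtype=float)
print(delta_rule_train(X, t))                  # approaches the least-squares solution
```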

Bagging

Bootstrap AGgregatING

(Only useful for high-variance, low-bias classifiers)

Use bootstrap replicates of the training data: sample with replacement

Train one model on each replicate, then aggregate their predictions
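A minimal bagging sketch using decision trees as the high-variance base model (scikit-learn; the number of replicates and the toy dataset are illustrative):

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=200, noise=0.3, random_state=0)
rng = np.random.default_rng(0)

trees = []
for _ in range(25):                                 # 25 bootstrap replicates
    idx = rng.integers(0, len(X), size=len(X))      # sample with replacement
    trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

# aggregate the individual trees by majority vote
votes = np.mean([t.predict(X) for t in trees], axis=0)
y_pred = (votes > 0.5).astype(int)
print("training accuracy of the bagged ensemble:", np.mean(y_pred == y))
```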

Decision trees - Variance/Bias?

High variance (strongly dependent on the training data)

Low bias (on average, the decision boundaries are good approximations to the true decision boundary)

Boosting

Loop:


1. Apply learner to weighted samples


2. Increase weights of misclassified samples

Adaboost

1. Train a weak classifier (choose the one with the lowest error)

2. Compute the reliability coefficient (the error must be lower than 0.5; otherwise stop)

3. Update the weights

4. Normalize the weights
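A compact AdaBoost sketch following the four steps above, using scikit-learn decision stumps as the weak classifiers; the number of rounds, the toy data, and the {−1, +1} label convention are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.tree import DecisionTreeClassifier

X, y01 = make_moons(n_samples=200, noise=0.2, random_state=0)
y = 2 * y01 - 1                                   # labels in {-1, +1}
w = np.full(len(X), 1.0 / len(X))                 # uniform initial sample weights

stumps, alphas = [], []
for _ in range(20):
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)  # 1. weak classifier
    pred = stump.predict(X)
    err = max(np.sum(w * (pred != y)), 1e-12)     # guard against a perfect stump
    if err >= 0.5:                                # 2. reliability requires error < 0.5
        break
    alpha = 0.5 * np.log((1 - err) / err)         #    reliability coefficient
    w *= np.exp(-alpha * y * pred)                # 3. up-weight misclassified samples
    w /= w.sum()                                  # 4. normalize
    stumps.append(stump)
    alphas.append(alpha)

ensemble = np.sign(sum(a * s.predict(X) for a, s in zip(alphas, stumps)))
print("training accuracy:", np.mean(ensemble == y))
```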

Boosting properties

Test error keeps decreasing asymptotically, even after the training error reaches 0

Why? The algorithm is not satisfied with getting 0 training error; it keeps increasing the margins of the training examples

Weak classifier

ht(x) = +1 if f_jt(x) > θt, −1 otherwise

Corresponds to a filter type and a threshold

Dropout

Randomly drop (zero out) units during training to reduce overfitting in neural networks (e.g. ConvNets)

Random forests

1. Sample the training data (same as bagging)

2. Random feature selection at each node (split on the best of a random subset of features)
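The same two ingredients expressed with scikit-learn's RandomForestClassifier, where bootstrap=True resamples the training data and max_features limits the features considered at each split (parameter values and data are illustrative):

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier

X, y = make_moons(n_samples=200, noise=0.3, random_state=0)

# bootstrap=True resamples the training data per tree (as in bagging);
# max_features="sqrt" restricts the features considered at each node split
rf = RandomForestClassifier(n_estimators=100, bootstrap=True, max_features="sqrt",
                            random_state=0).fit(X, y)
print(rf.score(X, y))
```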

PCA

Two criteria can be used:

1. Maximize the variance of the projection

2. Minimize the average squared error between x and its approximation
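A minimal PCA-by-SVD sketch consistent with both criteria: center the data, take the top right-singular vectors; the toy data and the number of components are illustrative:

```python
import numpy as np

def pca(X, n_components=2):
    """Return the top principal directions and the projected (compressed) data."""
    Xc = X - X.mean(axis=0)                    # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]             # directions of maximum variance
    return components, Xc @ components.T       # projection minimizes squared reconstruction error

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])  # elongated cloud
components, Z = pca(X, n_components=1)
print(components)   # first direction ~ the long axis of the cloud
```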

Information compression

Extract class characteristics, throw away the rest

CLAFIC

CLAss-Featuring Information Compression

Describe subspace methods for classification

For each class, compute a low-dimensional subspace that represents the distribution of that class

Determine the class of an unknown input by comparing which subspace best approximates the input
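A minimal CLAFIC-style sketch: build a low-dimensional subspace per class via SVD and classify a new input by which class subspace reconstructs it best (dimensions and toy data are illustrative; class means are ignored for simplicity):

```python
import numpy as np

def class_subspaces(X, y, dim=2):
    """For each class, an orthonormal basis of a low-dimensional subspace (via SVD)."""
    bases = {}
    for c in np.unique(y):
        _, _, Vt = np.linalg.svd(X[y == c], full_matrices=False)
        bases[c] = Vt[:dim]                     # top `dim` right-singular vectors
    return bases

def classify(x, bases):
    """Pick the class whose subspace approximates x best (smallest reconstruction error)."""
    errors = {c: np.linalg.norm(x - B.T @ (B @ x)) for c, B in bases.items()}
    return min(errors, key=errors.get)

rng = np.random.default_rng(0)
X0 = rng.normal(size=(50, 5)) * np.array([3, 3, 0.1, 0.1, 0.1])   # class 0 lives in dims 0-1
X1 = rng.normal(size=(50, 5)) * np.array([0.1, 0.1, 3, 3, 0.1])   # class 1 lives in dims 2-3
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

bases = class_subspaces(X, y, dim=2)
print(classify(np.array([2.0, -1.0, 0.0, 0.0, 0.0]), bases))       # -> 0
```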

RSS

Residual sum of squares

EM

Expectation maximization