Jh Machine Learning 1

by katwills, Apr. 2017

Subjects: Machine Learning

40 Cards in this Set

	Front	Back

	What are the 6 components of prediction?	Question, input data, features, algorithm, parameters, evaluation
	What are the properties of good features?	Lead to data compression; retain relevant information; are created based on expert application knowledge
	What are common mistakes with features?	Trying to automate feature selection; not paying attention to data-specific quirks; throwing away information unnecessarily
	What criteria are used to evaluate machine learning methods?	Interpretable, simple, accurate, fast, scalable
	What is in-sample error?	The error rate you get on the same data set you used to build your predictor
	What is out-of-sample error?	The error rate you get on a new data set
	What is another term for out-of-sample error?	Generalization error
	What is more important, in-sample error or out-of-sample error?	Out-of-sample error
	What is the reason for a larger out-of-sample error compared to in-sample error?	Overfitting
	Why does a machine learning method overfit?	The algorithm captures both the signal and the noise
	What are the 6 steps in predictive study design?	1) Define your error rate; 2) Split data into training, testing, and validation sets; 3) Pick features; 4) Pick method; 5) Apply method to test data and refine; 6) Apply method to validation data
	How should training, test and validation data sets be separated?	Randomly
	What are the different types of classification outcomes?	True positive, false positive, true negative, false negative (only the two "false" outcomes are errors)
	What is the formula for sensitivity?	True Positive / (True Positive + False Negative)
	What is the formula for specificity?	True Negative / (True Negative + False Positive)
	How do you plot ROC curves?	Y axis: true positive rate (sensitivity); X axis: false positive rate (1 - specificity)
	What does ROC stand for?	Receiver operating characteristic
	How do you evaluate an ROC curve?	The larger the area under the curve (AUC), the better; 0.5 is equivalent to random guessing and 1 is a perfect classifier
	What is the main principle of cross validation?	Train the model repeatedly with only a subset of the training data; use the excluded data to evaluate the models
	What are 4 use cases of cross validation?	Picking variables to include in a model; picking the type of prediction function to use; picking the parameters in the prediction function; comparing different predictors
	What are 2 different cross validation methods?	K-fold; leave-one-out
	What caret function is used to preprocess data?	preProcess
	What are 4 useful caret functions to partition data?	createDataPartition, createResample, createTimeSlices, createFolds
	What caret method is used to train a model?	train
	What caret method is used to make predictions using a model and data?	predict
	What caret function is useful for comparing predicted results with actual results?	confusionMatrix
	What element of a caret model object gives the final fitted model?	finalModel
	What are 2 metrics that can be used to evaluate models with continuous outcomes?	Root mean squared error (RMSE); R-squared (Rsquared)
	What are 2 metrics that can be used to evaluate models with categorical outcomes?	Accuracy (fraction correct); Kappa (a measure of concordance)
	What caret method creates a plot of all features against an outcome?	featurePlot
	What function breaks a group of data into quantiles based on number of bins?	cut2 (Hmisc package)
	What argument can be set on the preProcess function to standardize features on a data set?	method=c("center", "scale")
	What transformation tries to make continuous data look like normal data?	Box-Cox
	What plot shows sample quantiles vs theoretical quantiles?	Normal Q-Q plot
	What algorithm is useful for imputing data?	k-nearest neighbors
	What method must be set on preProcess to impute the data using k-nearest neighbors?	method="knnImpute"
	What are the two levels of covariate creation?	Level 1: from raw data to covariates; Level 2: transforming tidy covariates
	What must be considered when converting raw data into covariates?	Summarization vs. information loss
	What must categorical variables be converted to for many ML algorithms to work?	Dummy variables (0 or 1)
	What caret function converts categorical variables into dummy variables?	dummyVars
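
The sensitivity and specificity cards above can be checked with a small worked example. The sketch below is in Python rather than R and does not use caret; the names `ConfusionCounts`, `count_outcomes`, `sensitivity`, `specificity`, and `accuracy` are hypothetical helpers invented for illustration, and the labels are made-up toy data. It assumes binary outcomes coded as booleans and non-zero denominators.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of the four binary classification outcomes (hypothetical helper)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def count_outcomes(actual, predicted):
    """Tally the four outcomes from two equal-length sequences of booleans."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = list(zip(actual, predicted))
    return ConfusionCounts(
        tp=sum(1 for a, p in pairs if a and p),
        fp=sum(1 for a, p in pairs if not a and p),
        tn=sum(1 for a, p in pairs if not a and not p),
        fn=sum(1 for a, p in pairs if a and not p),
    )


def sensitivity(c):
    """True Positive / (True Positive + False Negative)."""
    return c.tp / (c.tp + c.fn)


def specificity(c):
    """True Negative / (True Negative + False Positive)."""
    return c.tn / (c.tn + c.fp)


def accuracy(c):
    """Fraction of all cases classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


# Toy data (made up for illustration): 4 actual positives, 6 actual negatives.
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]

counts = count_outcomes(actual, predicted)
print(counts)
print("sensitivity:", sensitivity(counts))
print("specificity:", specificity(counts))
print("accuracy:", accuracy(counts))
```

Here one actual positive is missed (a false negative) and one actual negative is flagged (a false positive), so sensitivity is 3/4 and specificity is 5/6. In R, caret's confusionMatrix reports the same kinds of quantities from predicted and actual labels.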