### 21 Cards in this Set

• Attribute types: Nominal, ordinal, numeric.
• Central tendency: Mean, median, mode.
• Linear correlation: Two attributes are linearly correlated if there exists a strong linear relation between them. Pearson's coefficient measures the level of linear correlation.
• Lazy vs. eager learning: Lazy = stores the training data and waits until given an input. Eager (e.g. decision trees) = constructs a model before classifying.
• Euclidean distance: The distance in a space between two points.
• Hamming distance: The number of differing bits between two codewords.
• Decision trees: Split on the attribute with the highest information gain.
• Confusion matrix:

|           | Predicted C1   | Predicted C2   |
|-----------|----------------|----------------|
| Actual C1 | true positive  | false negative |
| Actual C2 | false positive | true negative  |

Sensitivity = true positive recognition rate. Specificity = true negative recognition rate. Precision = measure of exactness. Accuracy = recognition rate.
• Support: The percentage of transactions in D that contain both A and B.
• Confidence: The percentage of transactions in D containing A that also contain B (for the rule A => B).
• Closed pattern: X is closed if no proper superset of X has the same support as X.
• Max pattern: X is maximal if X is frequent and no superset Y of X is frequent.
• Agglomerative single linkage: Start with one cluster per sample; repeatedly merge the closest clusters (Euclidean distance); stop when k clusters remain.
• K-means: Select the number of clusters k; assign each sample to a random cluster; calculate each centroid as the average value of its cluster; reassign each sample to the closest centroid (Euclidean distance); repeat.
• K-medoids: k clusters, with k samples as medoids; assign each sample to the closest medoid; replace a medoid with the sample that minimises the cost (Euclidean distance).
• GSP: Apriori for sequences. Min-gap = minimum time between the last item of one element and the first item of the next; max-gap = maximum time between them; window-size = maximum distance between the first and last item in an element.
• FSD: Apriori for graphs. Isomorphic graphs = two graphs that may be equal due to symmetries. Canonical label = a unique code representing a graph and all graphs isomorphic to it.
• Artificial neural network: Input layer, hidden layer, output layer; feed-forward. A training algorithm needs a way to evaluate the quality of the weights (error function) and a strategy to search for possible solutions. For each data sample, each neuron: accumulates the error from the next layer, calculates its error contribution, and backpropagates the error to each neuron in the previous layer. Weight updates: online = after each sample; batch = after all samples; mini-batch = after several samples.
• Self-organizing maps: Two layers, input and output (feed-forward). Each neuron contains a weight vector. BMU = best matching unit, the neuron closest to the input. Ordering phase = rough organisation; convergence phase = fine-tuning. Quantization error = distance between input and prototype. Topographic error = when the two best matching units for an input are not adjacent.
• Gaussian neighborhood function: Calculates the degree of neighborhood to the BMU; should decrease over time; affects only neurons in the neighborhood.
• Kohonen's update rule: Weight update based on: distance in the map (neuron to BMU); distance in input space (data point to weight vector); the epoch.
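The Euclidean and Hamming distance cards can be sketched in a few lines of Python (function names are my own):

```python
import math

def euclidean(p, q):
    # Straight-line distance between two points in n-dimensional space
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def hamming(a, b):
    # Number of positions at which two equal-length codewords differ
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))

print(euclidean((0, 0), (3, 4)))   # 5.0
print(hamming("10110", "10011"))   # 2
```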
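"Split on the attribute with the highest information gain" can be made concrete with entropy. A minimal sketch (function names and the toy labels are my own):

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a class distribution, in bits
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, attribute_values):
    # Entropy before the split minus the weighted entropy of each partition
    n = len(labels)
    partitions = {}
    for v, y in zip(attribute_values, labels):
        partitions.setdefault(v, []).append(y)
    remainder = sum(len(p) / n * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder

# An attribute that separates the classes perfectly gains the full entropy:
print(information_gain(["yes", "yes", "no", "no"], ["a", "a", "b", "b"]))  # 1.0
```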
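The four confusion-matrix measures follow directly from the cell counts; a sketch with my own argument names and example counts:

```python
def confusion_metrics(tp, fn, fp, tn):
    sensitivity = tp / (tp + fn)            # true positive recognition rate
    specificity = tn / (tn + fp)            # true negative recognition rate
    precision = tp / (tp + fp)              # measure of exactness
    accuracy = (tp + tn) / (tp + fn + fp + tn)  # overall recognition rate
    return sensitivity, specificity, precision, accuracy

print(confusion_metrics(tp=40, fn=10, fp=5, tn=45))
```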
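Support and confidence for a rule A => B can be sketched as follows (the helper names and the toy transaction set D are my own):

```python
def support(transactions, itemset):
    # Fraction of transactions that contain every item in `itemset`
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, a, b):
    # Confidence of A => B: support(A union B) / support(A)
    return support(transactions, set(a) | set(b)) / support(transactions, a)

D = [{"milk", "bread"}, {"milk", "bread", "butter"}, {"bread"}, {"milk", "butter"}]
print(support(D, {"milk", "bread"}))       # 0.5
print(confidence(D, {"milk"}, {"bread"}))  # (2/4) / (3/4) ≈ 0.667
```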
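The K-means steps above (choose k, initial assignment, centroid = cluster average, reassign by Euclidean distance) can be sketched like this; it is a minimal illustration, not a library implementation, and the initialisation from random samples is my own simplification:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    # Initialise: pick k random samples as the starting centroids
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assign each point to the closest centroid (squared Euclidean distance)
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            clusters[i].append(p)
        # Recompute each centroid as the average value of its cluster
        for i, c in enumerate(clusters):
            if c:
                centroids[i] = tuple(sum(xs) / len(c) for xs in zip(*c))
    return centroids, clusters

centroids, _ = kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)
print(sorted(centroids))  # [(0.0, 0.5), (10.0, 10.5)]
```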
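The Gaussian neighborhood function and Kohonen's update rule combine as in the sketch below: the weight moves toward the input, scaled by a learning rate and by map distance to the BMU (function and parameter names are my own; shrinking sigma and the learning rate over epochs is left to the caller):

```python
import math

def gaussian_neighborhood(dist_in_map, sigma):
    # Degree of neighborhood between a neuron and the BMU; as sigma
    # decreases over time, only nearby neurons remain affected
    return math.exp(-dist_in_map ** 2 / (2 * sigma ** 2))

def kohonen_update(weight, x, bmu_pos, neuron_pos, lr, sigma):
    d = math.dist(bmu_pos, neuron_pos)      # distance in the map (neuron to BMU)
    h = gaussian_neighborhood(d, sigma)
    # Move the weight vector toward the input (distance in input space)
    return [w + lr * h * (xi - w) for w, xi in zip(weight, x)]

# The BMU itself (map distance 0, h = 1) moves furthest toward the input:
print(kohonen_update([0.0, 0.0], [1.0, 1.0], (0, 0), (0, 0), lr=0.5, sigma=1.0))
```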