14 Cards in this Set
- Front
- Back
Hierarchical Clustering |
Data is not partitioned (i.e. clustered) in one step |
|
Hierarchical Clustering |
Two methods: Agglomerative and Divisive |
|
Agglomerative |
Starts with each record/item in its own group, and then combines groups; this is the more popular of the two methods |
|
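The bottom-up idea on the Agglomerative card can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the function name `agglomerate`, the stopping parameter `target_groups`, and the choice to measure the distance between two groups by their closest pair of points are all assumptions made for the example.

```python
import math

def agglomerate(points, target_groups):
    """Naive agglomerative clustering sketch (hypothetical helper).

    Starts with every point in its own group and repeatedly fuses the
    two closest groups until only `target_groups` groups remain.
    Assumption: group distance = distance of the closest pair of points.
    """
    groups = [[p] for p in points]  # each record starts in its own group
    while len(groups) > target_groups:
        best = None  # (distance, index_i, index_j) of the closest pair of groups
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                d = min(math.dist(p, q) for p in groups[i] for q in groups[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        groups[i] = groups[i] + groups[j]  # fuse the two closest groups
        del groups[j]
    return groups

points = [(0, 0), (0, 1), (5, 5), (5, 6), (9, 9)]
print(agglomerate(points, 2))
# [[(0, 0), (0, 1)], [(5, 5), (5, 6), (9, 9)]]
```

The nested loops re-scan every pair of groups on each pass, which is fine for a handful of points but slow for large data sets.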
Divisive |
Starts with all records in one large group and then splits the groups |
|
Dendrogram |
-Diagram illustrating the fusions or divisions at successive stages -A large vertical jump in the dendrogram at a fusion (in agglomerative) or division (in divisive) implies the clusters involved are ‘further apart’ |
|
Dendrogram |
How to use: -If you want k clusters, draw a perfectly horizontal line that intersects k of the vertical lines on the dendrogram -Everything that is grouped below each intersected vertical line is in the same cluster |
|
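Drawing a horizontal line that crosses k vertical lines amounts to keeping only the fusions that happen below the line. The sketch below assumes the dendrogram is stored as a list of `(height, item_a, item_b)` fusions, where `item_a` and `item_b` are any members of the two groups being fused; that representation and the name `cut_into_k` are hypothetical, chosen only for illustration.

```python
def cut_into_k(n_items, merges, k):
    """Return the k clusters left after cutting a dendrogram (sketch).

    merges: list of (height, item_a, item_b) fusions (assumed format).
    With n_items leaves, a line that leaves k clusters sits just above
    the n_items - k lowest fusions, so only those are applied.
    """
    parent = list(range(n_items))  # union-find forest, one root per cluster

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for _height, a, b in sorted(merges)[: n_items - k]:
        parent[find(a)] = find(b)

    clusters = {}
    for item in range(n_items):
        clusters.setdefault(find(item), []).append(item)
    return sorted(clusters.values())

# Made-up fusion heights for five items, numbered 0 to 4.
merges = [(1.0, 0, 1), (1.5, 3, 4), (4.0, 1, 2), (9.0, 2, 3)]
print(cut_into_k(5, merges, 2))  # [[0, 1, 2], [3, 4]]
print(cut_into_k(5, merges, 3))  # [[0, 1], [2], [3, 4]]
```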
Measuring Distance |
-Numerous ways to measure the distance between two clusters and/or data points -Most common is the Euclidean distance formula, the same way you measure the distance between two points on a graph |
|
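As a quick illustration of the Euclidean distance formula, the sketch below takes the square root of the summed squared coordinate differences. The function name `euclidean` is just a label for this example.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean((0, 0), (3, 4)))  # 5.0
```

Python 3.8 and later also provide `math.dist(p, q)`, which computes the same quantity.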
How do you measure distance between two clusters (which may have many data points)? |
Single Linkage, Complete Linkage, Average Linkage, and others (average group linkage, Ward’s hierarchical method) |
|
Single Linkage |
Measure the Euclidean distance between the closest pair of data points, one taken from each cluster |
|
Complete Linkage |
Measure the Euclidean distance between the furthest pair of data points, one taken from each cluster |
|
Average Linkage |
Calculate the average Euclidean distance over all possible pairs of data points, one taken from each cluster |
|
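The three linkage cards differ only in how they summarize the pairwise distances between two clusters: minimum, maximum, or mean. A minimal sketch follows, with illustrative function names and made-up example clusters.

```python
import math
from itertools import product

def pair_distances(c1, c2):
    """Euclidean distances for every pair with one point from each cluster."""
    return [math.dist(p, q) for p, q in product(c1, c2)]

def single_linkage(c1, c2):
    return min(pair_distances(c1, c2))  # closest pair

def complete_linkage(c1, c2):
    return max(pair_distances(c1, c2))  # furthest pair

def average_linkage(c1, c2):
    d = pair_distances(c1, c2)
    return sum(d) / len(d)  # mean over all pairs

a = [(0, 0), (1, 0)]
b = [(4, 0), (6, 0)]
print(single_linkage(a, b))    # 3.0
print(complete_linkage(a, b))  # 6.0
print(average_linkage(a, b))   # 4.5
```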
Association Rule Mining |
-Seeks to find interesting association and/or correlation relationships within large data sets -Typically found in market basket analysis: -->Attempts to determine which groups of items are commonly purchased together by customers -Attempts to find rules relating item sets which are disjoint -->i.e. the items in one item set of a rule do not belong to the other item set |
|
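A first step in market basket analysis is simply counting how often items appear together in the same purchase. The sketch below does this for pairs of items; the baskets are invented data and `pair_counts` is a hypothetical helper, not part of any particular library.

```python
from collections import Counter
from itertools import combinations

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted() gives each pair one canonical order, e.g. ('bread', 'milk')
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

# Invented example purchases.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "chips"},
    {"bread", "butter"},
]

counts = pair_counts(baskets)
print(counts[("bread", "milk")])    # 2
print(counts[("bread", "butter")])  # 2
print(counts[("beer", "chips")])    # 1
```

Real association rule mining goes further than raw pair counts, for example by scoring candidate rules, but the counting step above is the part described on the card.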
Lagging Measures |
Tell what has happened (often external business results such as customer satisfaction) |
|
Leading Measures |
Can be used to predict what will happen (often internal metrics such as employee satisfaction, billing accuracy) |