19 Cards in this Set

  • Front
  • Back
Data Mining and Its Seven Steps
The process of discovering interesting patterns from large amounts of data. It involves data cleaning, data integration, data selection, data transformation, pattern discovery, pattern evaluation, and knowledge presentation.
When is a pattern considered interesting?
If it is valid on test data with some degree of certainty, novel, potentially useful, and easily understood by humans. Interesting patterns represent knowledge, which is why the process is also called knowledge discovery.
The 4 major dimensions of data mining
Data, knowledge, technology, applications
Data Warehouse
A repository for long-term storage of data from multiple sources, organized to support decision making. It has a unified schema and allows for Online Analytical Processing (OLAP).
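As a rough illustration of the kind of analysis OLAP supports, the sketch below rolls up a sales measure along one dimension at a time. The fact rows, the `SALES` name, and the `roll_up` helper are made up for this example and are not part of any warehouse product.

```python
from collections import defaultdict

# Hypothetical fact rows: (city, quarter, sales amount).
SALES = [
    ("Vancouver", "Q1", 100),
    ("Vancouver", "Q2", 150),
    ("Toronto", "Q1", 200),
    ("Toronto", "Q2", 50),
]

def roll_up(rows, dimension_index):
    """Sum the sales measure, keeping only the chosen dimension."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension_index]] += row[-1]
    return dict(totals)

by_city = roll_up(SALES, 0)     # aggregates away the quarter dimension
by_quarter = roll_up(SALES, 1)  # aggregates away the city dimension
```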
Multidimensional Data Mining
Sometimes called exploratory multidimensional data mining, it integrates core data mining techniques with OLAP-based multidimensional analysis.
Data Mining Applications
Business intelligence, web search and page ranking, bioinformatics, intrusion detection (cyber security), fraud detection
Data Mining Functionalities
Specific types of patterns or knowledge that can be found in a data mining task.
List of functionalities of data mining
Associations and correlations, classification and regression, cluster analysis, outlier detection
Data mining is interdisciplinary. What are some of the different domains it draws on?
Statistics, Machine learning, database and data warehouse systems, information retrieval.
4 different challenges in data mining
Efficiency, scalability, diverse data types, mining methodology.
Define 6 aspects of data quality.
Data Quality is defined in terms of accuracy, completeness, consistency, timeliness, believability, and interpretability.
Data Cleaning
Attempts to fill in missing values, smooth out noise, identify outliers, and correct inconsistencies in the data. It consists of error detection and data transformation.
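Two of these cleaning steps can be sketched in a few lines. This is a minimal example, assuming missing values are marked as `None`, that the median is an acceptable fill value, and that an outlier is anything more than `k` median absolute deviations from the median. The function names and the threshold are illustrative choices, not a standard.

```python
from statistics import median

def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def find_outliers(values, k=5):
    """Return values more than k median absolute deviations from the median."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    return [v for v in values if abs(v - center) > k * mad]

ages = [23, 25, None, 24, 26, 250]   # made-up data with a gap and a typo-like value
cleaned = fill_missing(ages)
suspects = find_outliers(cleaned)
```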

Data Integration
Combines data from many sources to be used in a data warehouse.
Data Reduction
Techniques that obtain a reduced representation of the data while minimizing the loss of information content. Includes dimensionality reduction, numerosity reduction, and data compression.
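As a small sketch of two of these ideas, the code below reduces the number of rows by simple random sampling without replacement (one form of numerosity reduction) and reduces the number of attributes by keeping only selected columns (a crude form of dimensionality reduction). The row layout and helper names are assumptions made for the example.

```python
import random

def sample_rows(rows, n, seed=0):
    """Numerosity reduction: keep n rows chosen at random without replacement."""
    rng = random.Random(seed)
    return rng.sample(rows, n)

def keep_attributes(rows, keep):
    """Dimensionality reduction: drop every attribute not listed in keep."""
    return [{name: row[name] for name in keep} for row in rows]

# Made-up data set with one attribute assumed to be irrelevant.
rows = [{"id": i, "age": 20 + i, "noise": i * 7} for i in range(10)]
reduced = keep_attributes(sample_rows(rows, 3), ["id", "age"])
```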

Data Transformation
Converts data into the appropriate format for mining. Includes data discretization, normalization, and concept hierarchy generation.
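Normalization is the easiest of these to show concretely. The sketch below implements two common variants, min-max scaling to [0, 1] and z-score scaling; the function names are illustrative, and the z-score version assumes the population standard deviation is the one wanted.

```python
from statistics import mean, pstdev

def min_max(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant attribute: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

scaled = min_max([10, 20, 30])
standardized = z_score([10, 20, 30])
```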
Data Discretization
Transforms numeric data by mapping values to interval or concept labels. Includes binning, histogram analysis, cluster analysis, decision tree analysis, and correlation analysis.
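Binning is the simplest of these methods. The sketch below uses equal-width binning and then maps each bin to a concept label; the data values, the number of bins, and the labels are assumptions chosen for the example.

```python
def equal_width_bins(values, k):
    """Assign each value a bin index 0..k-1 using k equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0 for _ in values]  # all values identical: a single bin
    # The maximum value would land in bin k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]

prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]   # made-up numeric attribute
bins = equal_width_bins(prices, 3)
labels = [["low", "mid", "high"][b] for b in bins]
```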
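What is concept hierarchy generation?
Organizing attributes or attribute values into a hierarchy of levels, so that low-level values can be generalized to higher-level concepts (e.g., street < city < country).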
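A concept hierarchy can be sketched as a chain of mappings from low-level values to higher-level concepts. The city < province < country hierarchy and every name below are hypothetical and serve only to show how values are generalized.

```python
from collections import Counter

# Hypothetical hierarchy: city < province < country.
CITY_TO_PROVINCE = {
    "Vancouver": "British Columbia",
    "Victoria": "British Columbia",
    "Toronto": "Ontario",
}
PROVINCE_TO_COUNTRY = {"British Columbia": "Canada", "Ontario": "Canada"}

def generalize(city, level):
    """Map a city to the requested level of the hierarchy."""
    if level == "city":
        return city
    province = CITY_TO_PROVINCE[city]
    if level == "province":
        return province
    if level == "country":
        return PROVINCE_TO_COUNTRY[province]
    raise ValueError(f"unknown level: {level}")

cities = ["Vancouver", "Victoria", "Toronto", "Vancouver"]
by_province = Counter(generalize(c, "province") for c in cities)
```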
What is Nominal Data?
Data that labels variables without any quantitative value. Ex: Brown = 2, where the number is only a code for the category and has no numeric meaning.
What is Discrete Data?
Data based on counts. Values are typically integers and come from a finite or countably infinite set.