43 Cards in this Set

  • Front
  • Back

Data Warehouse

- A logical collection of information, gathered from many different operational databases, that supports business analysis activities and decision-making tasks
- The primary purpose of a data warehouse is to aggregate information throughout an organization into a single repository for decision-making purposes

Data Warehouse Fundamentals

- Extraction, transformation, and loading (ETL): a process that
  - Extracts information from internal and external databases
  - Transforms the information using a common set of enterprise definitions
  - Loads the information into a data warehouse
- Data mart: a subset of a data warehouse particular to the needs of a given business unit (finance, marketing, accounting, etc.)
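The ETL steps and the data mart on this card can be sketched in a few lines. This is a minimal illustration rather than a real pipeline: the two source systems, their field names, and the `DEPT_NAMES` mapping are hypothetical stand-ins for "a common set of enterprise definitions".

```python
# Hypothetical rows from two operational systems with different field names.
crm_rows = [{"cust": "A1", "amount_usd": "120.50", "dept": "mktg"}]
erp_rows = [{"customer_id": "B2", "total": 80, "unit": "Finance"}]

# Assumed enterprise definitions: one department vocabulary for every source.
DEPT_NAMES = {"mktg": "marketing", "finance": "finance"}


def extract():
    """Extract: pull raw rows from each source, tagged with their origin."""
    for row in crm_rows:
        yield "crm", row
    for row in erp_rows:
        yield "erp", row


def transform(source, row):
    """Transform: map source-specific fields onto one shared schema."""
    if source == "crm":
        cid, amount, dept = row["cust"], float(row["amount_usd"]), row["dept"]
    else:
        cid, amount, dept = row["customer_id"], float(row["total"]), row["unit"]
    return {"customer_id": cid, "amount": amount,
            "department": DEPT_NAMES[dept.lower()]}


def load(rows, warehouse):
    """Load: append the transformed rows to the warehouse (a list here)."""
    warehouse.extend(rows)


warehouse = []
load((transform(source, row) for source, row in extract()), warehouse)

# Data mart: the subset of the warehouse that one business unit needs.
finance_mart = [row for row in warehouse if row["department"] == "finance"]
```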

Information and Cleansing

Information cleansing or scrubbing: a process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information


• Information cleansing activities
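As one possible illustration of "fixes or discards", the sketch below normalizes inconsistent formatting and drops records that are incomplete, incorrect, or duplicated. The record layout and the rules (a name is required, an email must contain `@`) are assumptions chosen for the example.

```python
def cleanse(records):
    """Return (clean, discarded) after fixing or dropping bad records."""
    clean, discarded, seen = [], [], set()
    for rec in records:
        # Fix inconsistent formatting: stray whitespace and letter case.
        name = (rec.get("name") or "").strip().title()
        email = (rec.get("email") or "").strip().lower()
        # Discard incomplete (no name) or incorrect (malformed email) records.
        if not name or "@" not in email:
            discarded.append(rec)
            continue
        # Discard duplicates so each customer appears only once.
        if email in seen:
            discarded.append(rec)
            continue
        seen.add(email)
        clean.append({"name": name, "email": email})
    return clean, discarded


raw = [
    {"name": "  ada lovelace ", "email": "ADA@example.com"},  # inconsistent
    {"name": "Ada Lovelace", "email": "ada@example.com"},     # duplicate
    {"name": "", "email": "nobody@example.com"},              # incomplete
    {"name": "Bob", "email": "not-an-email"},                 # incorrect
]
clean, discarded = cleanse(raw)
```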

Characteristics of High-Quality Information

Accuracy, Completeness, Consistency, Uniqueness, Timeliness

Multidimensional Analysis

Databases contain information in a series of two-dimensional tables
• In a data warehouse and data mart, information is multidimensional: it contains layers of columns and rows
  - Dimension: a particular attribute of information

Cube

common term for the representation of multidimensional information

Business Intelligence

applications and technologies used to gather, provide access to, and analyze data and information to support decision-making efforts

Business Intelligence 3 categories

Operational, Tactical, Strategic

Data mining

the process of analyzing data to extract information not offered by the raw data alone

Data-mining tools

use a variety of techniques to find patterns and relationships in large volumes of information
- Classification
- Estimation
- Affinity grouping
- Clustering

Data Mining Capabilities

Cluster analysis, Association detection, Statistical analysis

Cluster analysis

A technique used to divide an information set into mutually exclusive groups

Association detection

reveals the degree to which variables are related and the nature and frequency of these relationships in the information

Statistical analysis

Performs such functions as information correlations, distributions, calculations, and variance analysis

Clustering Analysis 2

Figuring out groups in such a way that
- The members of one group are as close to each other as possible
- Those in different groups are as far apart as possible

• Segment customer information and identify behavioral traits
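One common way to get groups whose members are close together and far from other groups is k-means. The card does not name an algorithm, so the choice of k-means, the one-dimensional spend figures, and the starting centers below are all assumptions made for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Assign each value to its nearest center, then move centers to group means."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            # Each value joins exactly one group, so groups are mutually exclusive.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Recompute each center as the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical monthly spend per customer.
spend = [10, 12, 11, 95, 100, 105]
centers, groups = kmeans_1d(spend, centers=[0.0, 50.0])
```

Here the customers fall into a low-spend segment and a high-spend segment, which is the kind of behavioral split the card describes.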

Association Detection (Market-Basket Analysis)

Determines groups of products that customers tend to purchase together
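A minimal sketch of market-basket analysis counts how often pairs of products appear in the same basket. The baskets are made-up data, and `support` and `confidence` are the usual association-rule measures rather than anything this card defines.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each basket is the set of products in one purchase.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so that each pair is always stored in the same order.
    pair_counts.update(combinations(sorted(basket), 2))


def support(a, b):
    """Share of all baskets that contain both a and b."""
    return pair_counts[tuple(sorted((a, b)))] / len(baskets)


def confidence(a, b):
    """Share of baskets containing a that also contain b."""
    return pair_counts[tuple(sorted((a, b)))] / item_counts[a]


top_pair, top_count = pair_counts.most_common(1)[0]
```

A pair with high support and confidence, such as bread and butter here, is a natural candidate for a cross-selling offer.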

Association Detection (Market-Basket Analysis)

Cross-selling, Up-selling

Cross-selling:

the action of persuading customers to purchase an additional product or service that is related to the product they are already purchasing

Up-selling:

the action of persuading customers to purchase more expensive items, upgrades, or other add-ons in an attempt to make a more profitable sale.

Statistical Analysis 2

performs such functions as information correlations, distributions, calculations, and variance analysis

Forecast:

predictions made on the basis of time-series information

Time-series information:

time-stamped information collected at a particular frequency
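To make the link between time-series information and a forecast concrete, here is a deliberately simple sketch: a moving average over the most recent observations. The monthly figures are invented, and a moving average is only one of many forecasting methods.

```python
# Time-series information: time-stamped values collected at a fixed (monthly) frequency.
monthly_sales = [
    ("2024-01", 100),
    ("2024-02", 110),
    ("2024-03", 120),
    ("2024-04", 130),
]


def moving_average_forecast(series, window=3):
    """Forecast the next period as the mean of the last `window` observations."""
    recent = [value for _, value in series[-window:]]
    return sum(recent) / len(recent)


next_month = moving_average_forecast(monthly_sales)
```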

3 V’s of Big Data (IBM)

Volume: Data quantity


Velocity: Data speed


Variety: Data Types

How to mine precious information?

Examining large amounts of data
- Aggregation and statistics
  - Data warehouse and OLAP
- Indexing, searching, and querying
  - Keyword-based search
  - Pattern matching
- Knowledge discovery
  - Data mining
  - Statistical modeling
- Identification of hidden patterns and unknown correlations
- Better business decisions: effective marketing, customer satisfaction, increased revenue

Acquiring

- Obtain data
- Cleanse data
- Organize and relate data
- Catalog data

How Do BI Information Systems Support BI Activities?

- Analyzing: Reporting & Data mining

RFM analysis

- Recency: how recently a customer purchased an item
- Frequency: how frequently the customer purchases an item
- Monetary value: how much the customer spends each time
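The three RFM measures can be computed directly from a purchase log. The sketch below assumes a hypothetical log of `(customer, date, amount)` tuples and treats monetary value as the average spend per purchase, following "how much she spends each time".

```python
from datetime import date

# Hypothetical purchase log: (customer, purchase date, amount spent).
purchases = [
    ("alice", date(2024, 3, 1), 50.0),
    ("alice", date(2024, 3, 20), 70.0),
    ("bob", date(2024, 1, 5), 200.0),
]


def rfm(purchases, today):
    """Return recency (days), frequency (count), and monetary (mean spend) per customer."""
    summary = {}
    for customer, day, amount in purchases:
        s = summary.setdefault(customer, {"last": day, "count": 0, "total": 0.0})
        s["last"] = max(s["last"], day)
        s["count"] += 1
        s["total"] += amount
    return {
        customer: {
            "recency_days": (today - s["last"]).days,
            "frequency": s["count"],
            "monetary": s["total"] / s["count"],
        }
        for customer, s in summary.items()
    }


scores = rfm(purchases, today=date(2024, 4, 1))
```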

Interactive

Users are allowed to change both the analysis and the structure of the report
- OLAP: online analytical processing
  - OLAP slice and dice
  - Drill down
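Slice, dice, and drill down can be illustrated on a tiny in-memory fact table. Real OLAP tools work on cubes in a warehouse, so the tuple layout, the dimension names in `DIMS`, and the helper functions below are assumptions for the sake of the example.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, product, sales).
facts = [
    (2023, "Q1", "East", "widgets", 100),
    (2023, "Q2", "East", "widgets", 150),
    (2023, "Q1", "West", "gadgets", 80),
    (2024, "Q1", "East", "gadgets", 120),
]
DIMS = ("year", "quarter", "region", "product")


def aggregate(rows, by):
    """Sum sales grouped by the named dimensions."""
    idx = [DIMS.index(d) for d in by]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def slice_(rows, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]


def dice(rows, **allowed):
    """Dice: restrict several dimensions to sets of allowed values."""
    return [r for r in rows
            if all(r[DIMS.index(d)] in values for d, values in allowed.items())]


by_year = aggregate(facts, ["year"])
# Drill down: move from yearly totals to quarterly totals within 2023.
quarters_2023 = aggregate(slice_(facts, "year", 2023), ["year", "quarter"])
east_widgets = dice(facts, region={"East"}, product={"widgets"})
```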

How Do BI Information Systems Support BI Activities?

- Publishing: Visualization

BI Success Factors

• Right Focus: have a clear business problem
• Right People: employees must be able to understand data and information
• Right Technology: not all software tools are right for you
• Right Culture: data-driven decision making should be accepted throughout the organization

Collaborative Filtering

Predict what movies, books, etc. a person may be interested in, on the basis of
- Past preferences of the person
- Other people with similar past preferences
- The preferences of such people for a new movie or book
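The three ingredients on this card map onto a small user-based sketch: compare a person's past ratings with other people's, then weight those people's ratings of a new item by how similar they are. The ratings are invented, and the similarity measure (based on mean absolute rating difference) is one simple choice among many.

```python
# Hypothetical past preferences: ratings from 1 (disliked) to 5 (loved).
ratings = {
    "ann": {"Alien": 5, "Brazil": 4, "Casablanca": 1},
    "ben": {"Alien": 5, "Brazil": 5, "Dune": 4},
    "cat": {"Alien": 1, "Casablanca": 5, "Dune": 2},
}


def similarity(a, b):
    """1 / (1 + mean absolute rating difference) over items both people rated."""
    shared = ratings[a].keys() & ratings[b].keys()
    if not shared:
        return 0.0
    diff = sum(abs(ratings[a][i] - ratings[b][i]) for i in shared) / len(shared)
    return 1 / (1 + diff)


def predict(user, item):
    """Similarity-weighted average of other people's ratings for the item."""
    pairs = [(similarity(user, other), r[item])
             for other, r in ratings.items()
             if other != user and item in r]
    weight = sum(s for s, _ in pairs)
    return sum(s * r for s, r in pairs) / weight if weight else None


dune_for_ann = predict("ann", "Dune")
```

Because ann's past ratings resemble ben's far more than cat's, the prediction for "Dune" lands closer to ben's rating than to cat's.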

Artificial Intelligence (AI)

Simulates human thinking and behavior, such as the ability to reason and learn. Its ultimate goal is to build a system that can mimic human intelligence (e.g., AlphaGo).

Categories of AI

- Expert systems
- Neural networks
- Genetic algorithms
- Intelligent agents

Types of Data

Structured, unstructured

Structured Data

Information stored in databases (mainframe, SQL Server, Oracle, Access, Excel)
- All records have the same format as defined in the relational schema
  - Rows and columns
  - Data is consistent and uniform
- Data-mining friendly

Unstructured Data

Refers to information that doesn’t reside in a traditional row-column database (not structured into “cells”)
- All those things that can’t be so readily classified
- Text and multimedia content, e-mail messages, graphic images, videos, streaming, web pages, PDF files, social media data (text, blogs, tweets, comments, tags), user-generated content, reviews

Text Mining

application of data mining to textual documents
- The process of deriving high-quality information from text
- Leveraging text to improve decisions and predictions

Seven Types of Text Mining

Search and Information Retrieval (IR), Document Clustering, Document Classification, Web Mining, Information Extraction (IE), Natural Language Processing (NLP)

Web Mining

Discover useful information or knowledge from the Web hyperlink structure, page content, and usage data

Three types of web mining tasks

- Web structure mining
- Web content mining
- Web usage mining

Two main types of textual information

Facts and opinions

Social network

is a social structure of people, related (directly or indirectly) to each other through a common relation or interest

Social network analysis (SNA)

is the study of social networks to understand their structure and behavior
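Two basic SNA questions, who is best connected and how many links separate two people, can be answered on a small graph. The friendship list is made up, and degree and shortest-path distance are just two of the many measures SNA uses.

```python
from collections import deque

# Hypothetical "knows" relation; each pair is a direct link between two people.
friendships = [("ann", "ben"), ("ben", "cat"), ("ben", "dan"), ("eve", "fay")]

# Build an undirected network: person -> set of directly related people.
network = {}
for a, b in friendships:
    network.setdefault(a, set()).add(b)
    network.setdefault(b, set()).add(a)


def degree(person):
    """Structure: number of direct relations a person has."""
    return len(network[person])


def distance(start, goal):
    """Fewest links between two people (breadth-first search); None if unconnected."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        if person == goal:
            return hops
        for friend in network[person] - seen:
            seen.add(friend)
            queue.append((friend, hops + 1))
    return None


most_connected = max(network, key=degree)
```

A distance of 1 is a direct relation, a larger distance is an indirect one, and `None` means the two people are in separate parts of the network.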