33 Cards in this Set
- Front
- Back
- 3rd side (hint)
What are the 5 standard components of a BI system?
|
hardware, software, data, procedures, people
|
|
|
How many terabytes are in a petabyte?
|
1,000
|
|
|
Software is a component of a BI system. What is that component called?
|
BI Application
|
|
|
3 primary activities of the BI process
|
Acquire data, perform analysis, publish results
|
|
|
Acquiring data has four characteristics
|
obtain, cleanse, organize/relate, catalog
|
|
|
Performing analysis has 3 characteristics
|
reporting, data mining, big data
|
|
|
Publishing results has 4 characteristics
|
print, web servers, report servers, automation
|
|
|
What are some different Business Intelligence systems?
|
Reporting systems, data mining systems, knowledge management systems, expert systems
|
|
|
What do reporting systems do?
|
integrate data from multiple sources, and then process data by sorting, grouping, summing, averaging, and comparing
|
|
|
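The reporting operations named on this card (integrate, sort, group, sum, average, compare) can be sketched in a few lines of Python. This is only an illustration: the two "sources" and all the numbers are made up, and a real reporting system would read from databases rather than from lists.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sales rows (region, amount) from two made-up sources.
store_sales = [("East", 120.0), ("West", 80.0)]
web_sales = [("East", 60.0), ("West", 100.0), ("West", 30.0)]

# Integrate: combine the rows from both sources.
rows = store_sales + web_sales

# Sort: largest sale first.
rows_sorted = sorted(rows, key=lambda r: r[1], reverse=True)

# Group: collect the amounts for each region.
by_region = defaultdict(list)
for region, amount in rows:
    by_region[region].append(amount)

# Sum and average each group.
totals = {region: sum(amounts) for region, amounts in by_region.items()}
averages = {region: mean(amounts) for region, amounts in by_region.items()}

# Compare: which region has the highest total?
top_region = max(totals, key=totals.get)
```

Here `totals` holds one sum per region and `top_region` names the region with the largest sum.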
What are data mining systems?
|
Business intelligence systems that process data using sophisticated statistical techniques, such as regression analysis and decision tree analysis
|
|
|
What are some problems when using raw data?
|
dirty data, missing values, inconsistent data, too much data
|
|
|
An exabyte is how much?
|
1,000 petabytes
|
|
|
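The storage-size cards can be checked with a little arithmetic. The sketch below assumes decimal (SI) units, where each unit is 1,000 times the one before it; the `convert` helper is a made-up name for illustration.

```python
# Decimal (SI) storage units, each 1,000 times the previous one.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]

def convert(amount, from_unit, to_unit):
    """Convert between units by counting the steps of 1,000 between them."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return amount * 1000 ** steps

tb_per_pb = convert(1, "PB", "TB")   # terabytes in one petabyte
pb_per_eb = convert(1, "EB", "PB")   # petabytes in one exabyte
```

Under the binary convention some systems use, each step would be 1,024 instead of 1,000.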
What do Knowledge Management systems do?
|
Create value from intellectual capital by collecting and sharing human knowledge of products, product uses, best practices, and other critical knowledge with employees, managers, customers, suppliers, and others who need it.
|
|
|
Granularity: refers to the
|
level of detail represented by the data (can be too fine or too coarse)
|
|
|
What is a data mart?
|
A data collection that is created to address the needs of a particular business function, problem, or opportunity. (The data warehouse is like the distributor in a supply chain, and the data mart is like a retail store in the supply chain.)
|
A data mart is: (expensive)
|
|
Data mining is specifically what
|
The application of statistical techniques to find patterns and relationships among data and to classify and predict
|
Data mining specifically is: (2 categories)
|
|
Unsupervised data mining is when:
|
Analysts don't create a model or hypothesis before running the analysis, just observe results
|
|
|
What is a cluster analysis?
|
Where statistical techniques identify groups of entities that have similar characteristics
|
|
|
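One way to picture cluster analysis is a tiny one-dimensional k-means loop. K-means is just one common clustering technique, chosen here as an assumption; the card does not name a specific method. The ages and starting centers are made up.

```python
def kmeans_1d(values, centers, rounds=10):
    """Assign each value to its nearest center, then move each center
    to the average of its group; repeat a fixed number of times."""
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups, centers

# Hypothetical customer ages (made-up data).
ages = [21, 23, 25, 60, 62, 65]
groups, centers = kmeans_1d(ages, centers=[20.0, 30.0])
```

The analyst supplies no hypothesis about which customers belong together; the groups fall out of the data, and the analyst interprets them afterward.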
Supervised data mining is where?
|
Data miners develop a model prior to the analysis and apply statistical techniques to data to estimate parameters of the model.
|
|
|
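As a sketch of "develop a model first, then estimate its parameters": assume the model is a straight line, sales = slope * ad_spend + intercept, and estimate the two parameters by least squares. The model choice, the variable names, and the data are all assumptions for illustration.

```python
def fit_line(xs, ys):
    """Estimate slope and intercept of y = slope * x + intercept
    by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: ad spend vs. sales (made up).
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)

# Use the fitted model to predict sales for a new ad spend of 5.0.
predicted = slope * 5.0 + intercept
```

The model (a line) is chosen before looking at the results; the data only supplies the parameter values.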
What are neural networks?
|
A popular supervised data mining technique used to predict values and make classifications.
|
|
|
What happens in MapReduce
|
Harnesses the power of thousands of computers working in parallel: big data is broken into pieces, and processors search the pieces for something of interest
|
|
|
big data is broken into pieces and processors search pieces for something
|
Map Phase
|
|
|
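The Map phase described on these cards, plus the step that combines the partial results (the Reduce phase), can be sketched on one machine. This is a toy simulation, not Hadoop code: the function names and the log data are made up, and the pieces are processed one after another rather than on thousands of computers.

```python
from collections import Counter
from functools import reduce

def map_phase(piece, keyword):
    """One 'processor' searches its piece and counts keyword hits."""
    return Counter(word for word in piece if word == keyword)

def reduce_phase(partials):
    """Combine the per-piece counts into one overall result."""
    return reduce(lambda a, b: a + b, partials, Counter())

# Hypothetical search log (made-up data), broken into pieces of two items.
log = ["bi", "hadoop", "bi", "mart", "bi", "hadoop"]
pieces = [log[i:i + 2] for i in range(0, len(log), 2)]

# In a real cluster each piece would be searched in parallel.
partials = [map_phase(piece, "bi") for piece in pieces]
total = reduce_phase(partials)
```

Each piece is searched independently, which is what lets the work spread across many machines.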
What is Hadoop
|
An open-source program, supported by the Apache Software Foundation, that manages thousands of computers and implements MapReduce
|
|
|
What is Big Data
|
Describes data collections that are characterized by huge volume, velocity, and variety
|
|
|
Structured data is what kind of data
|
data in the form of rows and columns
|
spreadsheet
|
|
Push publishing is what
|
delivers business intelligence to users without any request from the user
|
|
|
Pull publishing is what
|
requires the user to request BI results
|
|
|
Veracity describes what
|
how inaccurate or accurate the data is
|
|
|
Supervised data mining is what
|
Create a model, then run it on another set of data: divide the data into a dev set and a test set, train the model on the dev set, refine it on the test set, then finalize it on the real data. Target was right 87% of the time about when a woman was pregnant.
|
|
|
data mart is what
|
A subset of a data warehouse, used to pick out commonalities (25 items). Target did this to figure out when a woman was pregnant based on the lotion she was buying.
|
|
|
Unsupervised data mining
|
how you cluster data together, take data from bottom up, data is static, data is normally in a data warehouse
|
|
|
4 examples of servers
|
Email server, Web server, SharePoint, BI server
|
|
|
Key aspects of Data Warehouses are: (expensive)
|
Stores data, provides data for BI, has metadata, can include purchased data, static
|
|