20 Cards in this Set
- Front
- Back
List 4 reasons why BI is important.
|
1) Single version of the truth (single source of truth)
2) Improves decision making
3) Provides data infrastructure for BI
4) Supports corporate initiatives (performance management, business-to-customer and business-to-business e-commerce, and CRM) |
|
T/F: Data warehouse is simply a repository of data for decision support purposes
|
False. Data warehousing has moved from being simply a repository of data for decision support purposes to being a critical component of many strategic initiatives.
|
|
T/F: BI targets require different BI environments.
|
True. For example, developing a single or a few BI applications may require only a data mart rather than a data warehouse.
|
|
Define a Data warehouse
|
A data warehouse is a collection of data created to support decision-making applications.
At their core, a data warehouse and data warehousing are simple concepts. |
|
What are some characteristics of a data warehouse? List 4.
|
1) Subject oriented - data is organized around a subject such as sales or products
2) Integrated - data is integrated to provide a comprehensive view
3) Time variant - historical data is maintained
4) Nonvolatile - data is not updated by users |
|
What is a data mart?
|
A scaled down version of a data warehouse.
"Scaled down" -- includes fewer subject areas (less data and fewer users). Data marts store data for a limited number of subject areas, such as marketing and sales data. |
|
Define the two types of data marts
|
1) Independent data mart - created directly from source systems
2) Dependent data mart - populated with data from a data warehouse |
|
Fill in the blank: _______ are used to support specific applications
|
Data marts
|
|
Name 4 data warehouse architectures
|
1) Independent data marts
2) Enterprise data warehouse (Inmon vs. hub and spoke vs. centralized)
3) Data mart bus (Kimball)
4) Federated |
|
3 cons of an independent data mart
|
1) Does not provide a single version of the truth for the entire organization
2) Inconsistent data definitions, dimensions, and measures make it difficult to run distributed queries across the marts
3) Costly and time-consuming to maintain |
|
Name 2 enterprise data warehouse architectures
|
"Hub and Spoke" created by Inmon and the "Data Mart Bus" architecture created by Ralph Kimball
|
|
How do enterprise data warehouses work?
|
1) Analyze enterprise-level data requirements
2) Focus on building a scalable infrastructure
3) Develop subject area by subject area: build the architecture in iterations
4) Store data in the warehouse in third normal form
5) Create dependent data marts that source their data from the warehouse |
|
T/F: Independent data marts obtain their data from the warehouse, thereby establishing a single version of the truth.
|
False: Dependent data marts obtain their data from the warehouse; independent data marts get theirs from source systems.
|
|
In what format do dependent data marts store data?
|
Multidimensional or Star schema.
Star schemas are recommended because they correspond to how people think about data and give fast response times to queries. |
|
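A minimal sketch of the star schema idea from the card above, using Python's built-in sqlite3 module. The table names (fact_sales, dim_product, dim_date), the columns, and the sample rows are all hypothetical, chosen only to illustrate one central fact table joined to its surrounding dimension tables.

```python
import sqlite3

# In-memory database standing in for a dependent data mart (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the "who/what/when" of each measurement.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);

-- The fact table holds the numeric measures plus a key into each dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    amount REAL
);

-- Hypothetical sample data.
INSERT INTO dim_product VALUES (1, 'Bikes'), (2, 'Helmets');
INSERT INTO dim_date VALUES (1, 2023, 4), (2, 2024, 1);
INSERT INTO fact_sales VALUES (1, 1, 500.0), (1, 2, 700.0), (2, 2, 40.0);
""")


def sales_by_category_and_year(connection):
    """Join the fact table to its dimensions and total sales per category and year."""
    return connection.execute("""
        SELECT p.category, d.year, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_date AS d ON d.date_id = f.date_id
        GROUP BY p.category, d.year
        ORDER BY p.category, d.year
    """).fetchall()


star_rows = sales_by_category_and_year(conn)
print(star_rows)
```

The query reads the way the business question is asked ("sales by category by year"), which is one reason the layout is considered easy to reason about.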
List some data sources:
|
Primarily legacy and operational systems.
Also external data, often purchased from third-party sources |
|
List some data types:
|
Real-time numerical data, newer data types (text, RFID, social media), and big data
|
|
True or false: the average number of source systems companies extract data from is close to 50
|
False: It is not unusual to extract data from over 100 source systems. While the technology is available to store structured and unstructured data together, the reality is that warehouse data is almost exclusively structured -- numerical with simple textual identifiers.
|
|
What is ETL, and what is it known as?
|
Extraction, Transformation, and Loading processes.
Known as the "plumbing" or "pick and shovel" work of data warehousing. |
|
What does ETL do? What does it require?
|
ETL moves data from source to target databases.
It is very costly and time-consuming. |
|
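A small sketch of the three ETL steps described above, assuming a hypothetical list of order records as the source and an in-memory SQLite table as the target. The field names, cleaning rules, and the orders table are illustrative assumptions, not part of the original cards.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from an operational system.
source_rows = [
    {"order_id": "1001", "customer": " alice ", "amount": "19.90"},
    {"order_id": "1002", "customer": "BOB", "amount": "5.00"},
    {"order_id": "1003", "customer": "carol", "amount": "n/a"},  # bad amount
]


def extract(rows):
    """Extract: read the records from the source (here, just copy the list)."""
    return list(rows)


def transform(rows):
    """Transform: fix types, tidy customer names, and skip rows with unparseable amounts."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # assumption: rows with invalid amounts are dropped
        clean.append((int(row["order_id"]), row["customer"].strip().title(), amount))
    return clean


def load(connection, rows):
    """Load: write the cleaned rows into the target table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    connection.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    connection.commit()


target = sqlite3.connect(":memory:")
load(target, transform(extract(source_rows)))
loaded = target.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```

Even in this toy version most of the code sits in the transform step, which hints at why ETL is described as costly and time-consuming once it spans many real source systems.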
What is metadata?
|
Data about data.
Must have both business metadata and technical metadata to support both the business and technical users. |