Set the Language

We weren't able to detect the audio language on your flashcards. Please select the correct language below.

Front

Back

Flashcards
»
SQL, Storage, Indexing, Design, Tuning

Sql, Storage, Indexing, Design, Tuning

by acesaro, Nov. 2005

Subjects: database design indexing sql storage tuning

Favorite

Add to folder

Flag

Shuffle
Toggle On

Toggle Off
Alphabetize
Toggle On

Toggle Off
Front First
Toggle On

Toggle Off
Both Sides
Toggle On

Toggle Off
Read
Toggle On

Toggle Off

Reading...

Front

Card Range To Study

through

Play button

Progress

1/22

Click to flip

Use LEFT and RIGHT arrow keys to navigate between flashcards;

Use UP and DOWN arrow keys to flip the card;

H to show hint;

A reads text to speech;

22 Cards in this Set

Front
Back

	Block	The amount of data read or written in one I/O operation.
	Page	Same as a block
	Blocking factor	The number of physical records per block.
	Inputs of physical design phase	Logical (implementation) model Documentation/Definitions DBMS Characteristics Response time requirements Security, Backup, Recovery, Retention, Integrity requirements
	Outputs of physical design phase	Produces a Description of the Implementation of the Database on Secondary Storage. Describes the storage structures and access methods used to achieve efficient access to the data.
	Primary & Secondary index differences	Primary: An ordered, fixed-length file that stores the primary key and a pointer. Each record in the index file corresponds to each block in the ordered data file. Sparse / nondense index Secondary: A secondary index is an ordered file that stores the nonordering field of a data file and a pointer. If the indexing field is a secondary or candidate key, then the index is dense. If the indexing field is a nonkey field, then the index is sparse. Introduces the idea of adding a level of indirection
	Clustered vs. Unclustered indexes	If order of data records is the same as, or `close to’, order of data entries, then called clustered index. A file can be clustered on at most one search key. Cost of retrieving data records through index varies greatly based on whether index is clustered or not!
	Describe hash based indexes	Hash function calculates the address of the page on which the record is stored. The field that’s used in the hash function is called the hash field. Called a hash key if the hash field is also a key field. Good for equality searches Not good for range searches
	Describe B-tree Indexes	If the root is not a leaf node, it must have at least two children. For a tree of order n, each node (other than root/leaf nodes) must have between n/2 and n pointers and children. … (other rules) Tree must always be balanced Every path from the root to a leaf must have same length.  This is what matters. This means that it always takes about the same time to access any record. Leaf pages contain data entries, and are chained (prev & next) Non-leaf pages contain index entries and direct searches:
	Tradeoffs in deciding which indexes to create	Indexes can make queries go faster, updates slower but require disk space, too.
	Index selection guidelines	Attributes in WHERE clause are candidates for index keys. Exact match condition suggests hash index. Range query suggests tree index. Clustering is especially useful for range queries; can also help on equality queries if there are many duplicates. Multi-attribute search keys should be considered when a WHERE clause contains several conditions. Order of attributes is important for range queries. Such indexes can sometimes enable index-only strategies for important queries. For index-only strategies, clustering is not important! Try to choose indexes that benefit as many queries as possible. Since only one index can be clustered per relation, choose it based on important queries that would benefit the most from clustering.
	Rules of thumb for indexes	Index primary key fields DBMS may do this automatically Composite keys require composite indexes Index foreign key fields Index other fields frequently used in: WHERE clauses GROUP BY clauses ORDER BY clauses Consider composite indexes when fields are used together in conditions
	Advantages/Disadvantages of Clustered Indexes	Advantages: Efficient storage; faster searches than heap Faster inserts/deletes than sorted file Disadvantages: Only one clustering factor per file
	Advantages/Disadvantages of Unclustered tree indexes	Advantages: Fast searches, fast inserts/deletes Disadvantages: Slow scans and range searches with many matches
	Advantages/Disadvantages of Hashed indexes	Advantages: Same as unclustered, but faster on equality searches Disadvantages: Does not support range searches
	Four steps of transaction analysis	1. Create xact/table cross-reference and/or usage map. 2. Determine which tables are most frequently accessed by xacts 3. Analyze usage data for selected xacts that involve the most frequently-used tables. 4. Design to improve performance of key transactions (indexes)
	Which transactions should be analyzed?	May not be able to analyze all xacts Focus on “important” transactions 80/20 rule 80% of data access comes from 20% of xacts Also consider high-priority xacts
	What information is contained in a cross reference table	Transactions and relations and the operations performed for each transaction
	What information should you consider when performing a detailed analysis of a transaction	Tables and attributes used Type of access (select, update) Avoid indexes for update xacts Attributes used in conditional statements Range vs. equality (cluster vs. hash index) Attributes used in joins Attributes used in order/group by Expected frequency of query Performance goals: High freq/priority queries are priority candidates for indexes.
	What is denormalization, why would it be considered	Introduce additional redundancy Improves performance May lead to anomolies
	What is a view? Why are views used?	View Virtual table based on base tables in the DB Only the definition really exists in the DB Provide a limited “view” of the DB “Pre-do” joins Limit tables Limit columns Way to implement logical independence Used for security
	Name and briefly describe four design goals for distributed DBMS'	Local transparency users don’t need to know where data is stored Replication transparency user can treat data as if it is stored at only one site, even if it is replicated at several sites Failure transparency a transaction is committed or not committed commit protocol Concurrency transparency it appears to the user as their transaction is the only activity on the system

Share This Flashcard Set

Set the Language

Sql, Storage, Indexing, Design, Tuning

Add to Folders

Upgrade to Cram Premium

Card Range To Study

22 Cards in this Set