Use LEFT and RIGHT arrow keys to navigate between flashcards;
Use UP and DOWN arrow keys to flip the card;
H to show hint;
A reads text to speech;
25 Cards in this Set
- Front
- Back
Hash Table |
- A data structure used to implement an associative array, a structure that can map keys to values - Uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found - O(1) |
|
Hash Function |
- Function which, when applied to the key, produces an integer which can be used as an address in a hash table (index into an array of buckets) |
|
Hash Index |
- Organizes the search keys with their associated pointers into a hash file - Consists of a collection of buckets organized in an array - Through linked lists, multiple items can be associated with one index because of this Hash indices are often called buckets |
|
Operations for Creating a Hash Table |
1. Creating an array: The actual table where the data is stored 2. Creating the Hash Function |
|
Uniform Hash |
When the indices produced by the hash function (into an array) are equally likely to be generated |
|
Problems with Hash Function |
- Ideally, the hash function will assign each key to a unique bucket (index), but this ideal situation is rarely achievable in practice - Most hash table designs assume that hash collisions—different keys that are assigned by the hash function to the same bucket—will occur |
|
Complexity of Hashing |
- In many situations, hash tables turn out to be more efficientthan search trees or any other table lookup structure - For this reason, they are widely used in many kinds ofcomputer software, particularly for associative arrays,database indexing, caches, and sets |
|
Procedure for Hash Index |
- Puts a pointer to a record in the bucket based on the hash of that particular record’s key - A hash function “h” maps a search-key value to an address of a bucket - Commonly used hash function: hash value mod n_B where n_B is the number of buckets |
|
Procedure for Hash Function |
1. Creating the original hash code for the hash key 2. Transferring this number to a smaller number to be used as index for the key |
|
Other Applications of Hash Functions |
- Used to build caches for large data sets stored in slow media such as a slow Hard Drive. Frequently accessed data is saved in cache and can be accessed in a very short time - Computer operating systems use hash functions for fast access of key words and commands |
|
Good Hash Function |
- Create hash codes that permits the widest distribution of keys to various index values of the hash table. - It should uniformly distribute the keys |
|
Perfect Hash Function |
- Create such a unique code for each key in the data that the probability of two keys having the same code is zero. - Such a hash function is ideal but rarely does it exist |
|
Collision |
- When 2 keys hash to the same location in the hash table - Note: a good hash function distributes the elements uniformly over the hash table, this keeps the number of collisions within manageable limits |
|
Chaining |
- Collision Resolution Method - When there is a collision (key hashed to an occupied slot), the new key is stored in an overflow list. - Each incoming duplicate entry is linked at the end of this list chained to the index - Solution: make the size of the hash table (m) larger than the number of key value pairs (33%) |
|
Linear Probing |
- Collision Resolution Method - Will look onto the subsequent hash elements until the first free space is found - The size of the hash table (m) must be bigger than the number of elements so that some empty slots areavailable |
|
Hashing with Databases |
- Oracle does not have hash indices built in - MySQL does |
|
Indexing |
- Something to facilitate search - Used in Databases - Purpose: improve the speed of data retrieval |
|
Two Main types of Indexing |
- Ordered Index - Hash Index |
|
Ordered Index: Primary Index |
- If the file containing the records is sequentially ordered, the index whose search key specifies the sequential order of the file |
|
Ordered Index: Secondary Index |
- Indices whose search key specifies an order different from the sequential order of the file - Improve the performance of queries on nonprimary keys - Impose serious overhead on database modification: whenever a file is updated, every index must be updated |
|
Dense Index |
- An index record appears for every search key value in file - This record contains search key value and a pointer to the actual record |
|
Sparse Index |
- Created only for some of the records - To locate a record, we find the index record with the largest search key value less than or equal to the search key value we are looking for - We start at that record pointed to by the index record, and proceed along the pointers in the file (that is, sequentially) until11 we find the desired record |
|
Dense Indicies |
- Don’t give us much of a bonus on an ordered table –a primary index that is dense on an ordered table is redundant! - A primary index on an ordered table is always sparse - Otherwise (unordered) use dense or secondary indices - Dense indices are faster in general |
|
How much indexing to do? |
- Read-heavy DBs – can index a lot (if got the memory to do it) - Write-heavy DBs – index sparingly! (take a balanced approach) - Write-ONLY DBs – one or no index (e.g. log table) |
|
Two-Level Sparse Index |
- If index is too large to be kept in main memory, one search results in several disk reads - If there are no overflow blocks in the index, we can use binary search - If index has overflow blocks, then sequential search typically used - Solution: Construct a sparse index on the index |