Use LEFT and RIGHT arrow keys to navigate between flashcards;
Use UP and DOWN arrow keys to flip the card;
H to show hint;
A reads text to speech;
65 Cards in this Set
- Front
- Back
a. Determinant |
The attribute on the left-hand side of the arrow in a functional dependency. |
|
b. Functional dependency
|
A constraint between two attributes or two sets of attributes.
|
|
c. Transitive dependency
|
A functional dependency between two (or more) nonkey attributes.
|
|
d. Recursive foreign key |
A foreign key in a relation that references the primary key values of that same relation.
|
|
e. Normalization |
The process of decomposing relations with anomalies to produce smaller, well-structured relations.
|
|
f. Composite key |
A primary key that consists of more than one attribute. |
|
g. Relation
|
A named, two-dimensional table of data.
|
|
h. Normal form
|
A state of a relation that results from applying simple rules regarding functional dependencies (or relationships between attributes) to that relation. |
|
i. Partial functional dependency |
A functional dependency in which one or more nonkey attributes (such as Name) are functionally dependent on part (but not all) of the primary key. |
|
j. Enterprise key |
A primary key whose value is unique across all relations. |
|
k. Surrogate primary key: |
A serial number or other system assigned primary key for a relation. |
|
well-structured relation
|
contains little redundancy
|
|
Six important properties of relations are:
|
a. Each relation in a database has a unique name. |
|
5. Describe two properties that must be satisfied by candidate keys: |
a. Unique identification: For every row, the value of the key must uniquely identify that row. |
|
6. Three types of anomalies in tables:
|
a. Insertion anomaly: A new row cannot be inserted unless all primary key values are supplied. |
|
9. Describe how the following components of an E-R diagram are transformed into relations: |
a. Regular entity type: Each entity type is transformed into a simple relation. Each simple attribute of the entity type becomes an attribute of the relation. |
|
10. Four typical problems in merging relations:
|
a. Synonyms: Two (or more) attributes have different names but the same meaning. Solution: Convince users to standardize on a single name. |
|
1NF |
Multivalued attributes removed |
|
2NF |
contains no partial functional dependencies |
|
3NF |
transitive dependencies eliminated |
|
Contrast
Normal form; normalization |
Normal form is a state of a particular relation regarding functional dependencies, while normalization is the process of decomposing relations with anomalies to produce smaller, well-structured relations. |
|
Contrast
Candidate key; primary key |
A primary key is an attribute (or combination of attributes) that uniquely identifies a row in a relation. When a relation has more than one such attribute (or combination of attributes), each is called a candidate key. The primary key is then the one chosen by users to uniquely identify the rows in the relation. |
|
Contrast
Partial dependency; transitive dependency |
A partial functional dependency exists when a nonkey attribute is functionally dependent on part (but not all) of a composite primary key; a transitive dependency is a functional dependency between two or more nonkey attributes. |
|
Contrast
Composite key; recursive foreign key |
A composite key is a primary key that consists of more than one attribute, while a recursive foreign key is a foreign key in a relation that references the primary key values of that same relation. |
|
Contrast
Determinant; candidate key |
A determinant is the attribute on the left-hand side of the arrow in a functional dependency, while a candidate key uniquely identifies a row in a relation. |
|
Contrast
Foreign key; primary key |
A primary key uniquely identifies each row in a relation, while a foreign key is a primary key in another table. |
|
Contrast
Enterprise key; surrogate primary key
|
An enterprise key is a primary key whose value is unique across all relations in the whole database and is likely to hold no business meaning. A surrogate primary key is a primary key whose value is a serial number or other system assigned value and is unique to the relation.
|
|
Three conditions that imply a relation is in second normal form: |
|
|
12. Integrity constraints enforced in SQL CREATE TABLE commands: |
|
|
File organization |
A technique for physically arranging the records of a file on secondary storage devices. |
|
Sequential file organization |
Records in the file are stored in sequence according to a primary key value. |
|
Indexed file organization |
Records are either stored sequentially or non-sequentially, and an index is created that allows software to locate individual records. |
|
Hashing file organization |
The address for each record is determined using a hashing algorithm. |
|
Denormalization |
The process of transforming normalized relations into unnormalized physical record specifications. |
|
Composite key |
A key made up of more than one column. |
|
Secondary key |
One or a combination of fields for which more than one record may have the same combination of values. |
|
Data type |
Each unit of a detailed coding scheme recognized by system software, such as a DBMS, for representing organizational data. |
|
Bitmap index |
A table of bits in which each row represents the distinct values of a key and each column is a bit, which when on indicates that the record for that bit column position has the associated field value. |
|
Redundant Arrays of Inexpensive Disks (RAID) |
A set, or array, of physical disk drives that appear to the database user (and programs) as if they form one large, logical storage unit. |
|
Join index |
An index on columns from two or more tables that come from the same domain of values. |
|
Stripe |
The set of pages on all disks in a RAID that are the same relative distance from the beginning of the disk drive. |
|
Explain plan
|
A command available in most database management systems which displays how the query optimizer will access indexes, use parallel servers and join tables to prepare query results. |
|
Horizontal partitioning; vertical partitioning |
Horizontal partitioning is very similar to creating a supertype/subtype relationship because different types of the entity (where the subtype discriminator is the field used for segregating rows) are involved in different relationships, hence different processing. Neither horizontal nor vertical partitioning prohibits the ability to treat the original relation as a whole. |
|
Physical file; tablespace |
A physical file is a named portion of secondary memory (magnetic tape, hard disk) allocated for the purpose of storing records; a tablespace is a named set of disk storage elements in which physical files for database tables may be stored. |
|
Physical record; physical file |
A physical file is a named portion of secondary memory (a magnetic tape or hard disk) allocated for the purpose of storing physical records. Physical records of several types can be clustered together into one physical file in order to place records frequently used together close to one another in secondary memory. |
|
Page; physical record |
Often a page will store multiple physical records. Depending on the computer system, a lengthy physical record may or may not be allowed to span two pages. |
|
Secondary key; primary key
|
A secondary key is one or a combination of fields for which more than one record may have the same combination of values; the primary key is one or a combination of fields for which every record has a unique value. Hence, the primary key is a unique identifier for a row. |
|
Three major inputs to physical design:
|
|
|
5. Key decisions in physical database design:
|
a. Choosing the storage format (called data type) for each attribute from the logical data model: the format is chosen to minimize storage space and to maximize data integrity. |
|
7. Developing field specifications:
|
a. Define the data type used to represent values of the field. |
|
8. Selecting a field data type:
|
a. Minimize storage space |
|
9. Coding or compressing field values:
|
Where attributes have a sparse set of values or a volume so large that considerable storage space will be consumed, possibilities for coding or compressing field values should be considered. Large data fields mean that data are further apart, which yields slower data processing. Where the set of valid values is small, translation into a code that requires less space is a possibility. Data compression techniques also use coding to reduce the storage space required for commonly recurring patterns of data. |
|
10. Controlling field integrity: |
a. Specify default values. |
|
11. Three ways to handle missing field values:
|
a. Substitute an estimate of the missing value: for example, for a missing sales value when computing monthly product sales, use a formula involving the mean of the existing monthly sales values for that product indexed by total sales for that month across all products. Such estimates must be marked so users know that these are not actual values. |
|
12. Effect of normalizing relations on physical record storage:
|
One goal of physical record design is efficient use of storage space. In most cases, the second goal of physical record design—efficient data processing—dominates the design process. Efficient processing of data, just like efficient accessing of books in a library, depends on how close together related data (or books) are. Often, all the attributes that appear within a relation are not used together, and data from different relations are needed together to answer a query or produce a report. Thus, although normalized relations solve data maintenance anomalies, normalized relations, if implemented one for one as physical records, may not yield efficient data processing. |
|
13. Situations that suggest the possibility of denormalization:
|
a. Two entities with a one-to-one relationship: Even if one of the entities is an optional participant, if the matching entity exists most of the time, then it may be wise to combine these two relations into one record definition (especially if the access frequency between these two entity types is high). Figure 6-3 shows student data with optional data from a standard scholarship application a student may complete. In this case, one record could be formed with four fields from the STUDENT and SCHOLARSHIP APPLICATION normalized relations. (Note: In this case, fields from the optional entity must have null values allowed.) |
|
14. Advantages and disadvantages of horizontal and vertical partitioning:
|
a. Efficiency: Data used together are stored close to one another and separate from data not used together.
|
|
15. Seven criteria for selecting a file organization: |
a. Data retrieval speed |
|
16. The desirability of a bitmap index:
|
A bitmap is ideal for attributes that have few possible values, which is not true for conventional tree indexes. A bitmap also often requires less storage space (possibly as little as 25 percent) than a conventional tree index (Schumacher, 1997), but for an attribute with many distinct values, a bitmap index can exceed the storage space of a conventional tree index. One bitmap can be used for multiple keys in order to perform searches on elements that would satisfy more than one condition at a time. |
|
17. The benefits of a hash index table:
|
Using a hashing algorithm allows for rows stored independently of the address, so that whatever file organization makes sense can be used for storage. Also, because index tables are much smaller than a data table, the index can be more easily designed to reduce the likelihood of key collisions or overflows. |
|
18. The purpose of clustering data in a file:
|
Some database systems allow physical files to contain records with different structures, e.g., rows from different tables may be stored in the same disk area. This clustering reduces the time to access related records compared to the normal allocation of different files to different areas of a disk. Time is reduced since related records will be closer to each other than if the records are stored in separate files in separate areas of the disk. |
|
19. Seven rules of thumb for choosing indexes for relational databases.
|
a. Indexes are more useful on larger tables. |
|
20. Two views of multidimensional databases:
|
a. Multidimensional table: Each cell contains one or more simple attributes, and the dimensions are ways to categorize the raw data. These categories, or dimensions, are the factors on which users want to summarize or segment the data, such as time periods, geography, lines of business, or people. A cell contains data relevant to the intersection of all of its dimension values. For example, a cell might hold the number of units sold attribute for a given time period, location, line of business, and salesperson. |
|
21. How can the use of the EXPLAIN command help in writing a more efficient query:
|
EXPLAIN plan will show exactly how a query will be submitted to the DBMS for processing. We can see indexes used, servers used, and how tables will be joined. Different execution plans for the query written in several different ways will help identify the least-cost execution for a desired query. The query with the best performance can then be chosen. |
|
22. Four options for optimizing query performance:
|
a. The most common approach is to replicate the query so that each copy works against a portion of the database, usually a horizontal partition (sets of rows). The partitions need to be defined in advance by the database designer. The same query is run against each portion in parallel on separate processors, and the intermediate results from each processor are combined to create the final query result as if the query were run against the whole database. |