61 Cards in this Set
 Front
 Back
Data 
Recorded values whether numbers or labels, together with their context 

Data Table 
An arrangement of data in which each row represents a case and each column represents a variable 

Context 
The context ideally tells who was measured, what was measured, how the data were collected, where the data were collected, and when and why the study was performed 

Case 
An individual about whom or which we have data 

Respondent 
Someone who answers, or responds to, a survey 

Subject 
A human experimental unit. Also called a participant 

Participant 
A human experimental unit. Also called a subject 

Experimental Unit 
An individual in a study for which or for whom data values are recorded. Human experimental units are usually called subjects or participants 

Record 
Information about an individual in a database 

Sample 
A subset of a population, examined in hope of learning about the population 

Population 
The entire group of individuals or instances about whom we hope to learn 

Variable 
A variable holds information about the same characteristic for many cases 

Categorical Variable 
A variable that names categories with words or numerals 

Nominal Variable 
The term "nominal" can be applied to a variable whose values are used only to name categories 

Quantitative Variable 
A variable in which the numbers are values of measured quantities with units 

Units 
A quantity or amount adopted as a standard of measurement, such as dollars, hours, or grams 

Identifier Variable 
A categorical variable that records a unique value for each case, used to name or identify it 

Ordinal Variable 
The term "ordinal" can be applied to a variable whose categorical values possess some kind of order 

Frequency Table 
A frequency table lists the categories in a categorical variable and gives the count (or percentage) of observations for each category 
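As a minimal sketch of the idea, the counts and percentages can be tallied with Python's `collections.Counter`. The `ticket_class` values are made-up illustration data, not taken from the card set.

```python
from collections import Counter

# Hypothetical categorical data: ticket class for eight passengers.
ticket_class = ["First", "Third", "Crew", "Third", "Crew", "Crew", "Second", "Third"]

counts = Counter(ticket_class)  # count of observations per category
n = len(ticket_class)
percents = {category: 100 * count / n for category, count in counts.items()}

# Print one row per category: name, count, percentage.
for category in sorted(counts):
    print(f"{category:<8}{counts[category]:>3}{percents[category]:>8.1f}%")
```

The percentage column is the relative-frequency version of the same table.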

Distribution 
The distribution of a variable gives the possible values of the variable and the relative frequency of each value 

Area Principle 
In a statistical display, each value should be represented by the same amount of area 

Bar Chart 
Bar charts show a bar whose area represents the count of observations for each category of a categorical variable 

Pie Chart 
Pie charts show how a "whole" divides into categories by showing a wedge of a circle whose area corresponds to the proportion in each category 

Categorical Data Condition 
The methods in this chapter are appropriate for displaying and describing categorical data. Be careful not to use them with quantitative data 

Contingency Table 
A contingency table displays counts and, sometimes, percentages of individuals falling into named categories on two or more variables. The table categorizes the individuals on all variables at once to reveal possible patterns in one variable that may be contingent on the category of the other 

Marginal Distribution 
In a contingency table, the distribution of either variable alone is called the marginal distribution. The counts or percentages are the totals found in the margins of the table 

Conditional Distribution 
The distribution of a variable restricting the who to consider only a smaller group of individuals is called the conditional distribution 

Independence 
Variables are said to be independent if the conditional distribution of one variable is the same for each category of the other. We'll show how to check for independence in a later chapter 
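The contingency table, marginal distribution, and conditional distribution cards can be illustrated together. This is only a sketch on made-up counts: the variables `sex` and `survived`, and the helper names `marginal` and `conditional`, are assumptions for the example. Comparing conditional distributions by eye is descriptive, not the formal independence check the card defers to a later chapter.

```python
from collections import Counter

# Hypothetical cases: (sex, survived) pairs.
cases = (
    [("F", "yes")] * 30 + [("F", "no")] * 10
    + [("M", "yes")] * 20 + [("M", "no")] * 40
)

table = Counter(cases)  # contingency table: one count per (sex, survived) cell
total = sum(table.values())

def marginal(index):
    """Distribution of one variable alone (index 0 = sex, 1 = survived)."""
    totals = Counter()
    for cell, count in table.items():
        totals[cell[index]] += count
    return {category: count / total for category, count in totals.items()}

def conditional(given_sex):
    """Distribution of 'survived' restricted to one category of sex."""
    row = {s: c for (sex, s), c in table.items() if sex == given_sex}
    row_total = sum(row.values())
    return {category: count / row_total for category, count in row.items()}

print(marginal(1))
print(conditional("F"))
print(conditional("M"))
```

Here the conditional distributions of `survived` differ between the two rows, so these made-up variables do not look independent.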

Segmented Bar Chart 
A segmented bar chart displays the conditional distribution of a categorical variable within each category of another variable 

Simpson's Paradox 
When averages are taken across different groups, they can appear to contradict the overall averages. This is known as "Simpson's Paradox" 
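A small numeric sketch can make the paradox concrete. The success counts below are invented for illustration: person A has the higher success rate on each kind of task, yet the lower rate overall, because A attempted mostly hard tasks.

```python
# Hypothetical (successes, attempts) for two people on two kinds of task.
results = {
    "A": {"easy": (9, 10), "hard": (30, 90)},
    "B": {"easy": (80, 90), "hard": (3, 10)},
}

def rate(successes, attempts):
    """Success rate within a single group."""
    return successes / attempts

def overall(person):
    """Success rate pooled across both kinds of task."""
    successes = sum(s for s, _ in results[person].values())
    attempts = sum(a for _, a in results[person].values())
    return successes / attempts

for task in ("easy", "hard"):
    print(task, rate(*results["A"][task]), rate(*results["B"][task]))
print("overall", overall("A"), overall("B"))
```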

Distribution 
The distribution of a quantitative variable slices up all the possible values of the variable into equal-width bins and gives the number of values falling into each bin 

Histogram 
A histogram uses adjacent bars to show the distribution of a quantitative variable. Each bar represents the frequency of values falling in each bin 
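The binning step behind a histogram can be sketched in a few lines. The function name `bin_counts`, the bin boundaries, and the data are assumptions chosen for the example. Each bin here includes its left edge and excludes its right edge, which is one common convention among several.

```python
def bin_counts(values, start, width, n_bins):
    """Count values in equal-width bins [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        k = int((v - start) // width)  # index of the bin that contains v
        if 0 <= k < n_bins:
            counts[k] += 1
    return counts

# Hypothetical quantitative data.
data = [1, 2, 2, 3, 5, 7, 8, 8, 9, 14]
counts = bin_counts(data, start=0, width=5, n_bins=3)

# Text histogram: one row of stars per bin.
for k, c in enumerate(counts):
    print(f"[{k * 5:>2}, {(k + 1) * 5:>2})  " + "*" * c)
```

Changing `width` changes the apparent shape, which is why the location of modes can shift with the scale of a histogram.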

Gap 
A region of the distribution where there are no values 

Stem-and-Leaf Display 
A display that shows quantitative data values in a way that sketches the distribution of the data. It's best described in detail by example 

Dotplot 
A dotplot graphs a dot for each case against a single axis 

Shape 
To describe the shape of a distribution, look for single vs. multiple modes, symmetry vs. skewness, outliers and gaps 

Mode 
A hump or local high point in the shape of the distribution of a variable. The apparent location of modes can change as the scale of a histogram is changed 

Unimodal 
Having one mode. This is a useful term for describing the shape of a histogram when it's generally mound-shaped 

Bimodal 
Distributions with two modes 

Multimodal 
Distributions with more than two modes 

Uniform 
A distribution that doesn't appear to have any mode and in which all the bars of its histogram are approximately the same height 

Symmetric 
A distribution is symmetric if the two halves on either side of the center look approximately like mirror images of each other 

Tails 
The parts of a distribution that typically trail off on either side. Distributions can be characterized as having long tails or short tails 

Skewed 
A distribution is skewed if it's not symmetric and one tail stretches out farther than the other. Distributions are said to be skewed left when the longer tail stretches to the left, and skewed right when it goes to the right 

Outliers 
Outliers are extreme values that don't appear to belong with the rest of the data. They may be unusual values that deserve further investigation, or they may be just mistakes; there's no obvious way to tell. Don't delete outliers automatically; you have to think about them. Outliers can affect many statistical analyses, so you should always be alert for them. Boxplots display points more than 1.5 IQR from either end of the box individually, but this is just a rule-of-thumb and not a definition of what is an outlier 

Center 
The place in the distribution of a variable that you'd point to if you wanted to attempt the impossible by summarizing the entire distribution with a single number. Measures of center include the mean and median 

Median 
The median is the middle value, with half of the data above and half below it. If n is even, it is the average of the two middle values. It is usually paired with the IQR 
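The odd and even cases in this definition can be written out directly. This is a minimal sketch; the function name `median` and the sample values are illustrative.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 1, 5]))     # odd n: the single middle value
print(median([7, 1, 5, 3]))  # even n: average of the two middle values
```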

Spread 
A numerical summary of how tightly the values are clustered around the center. Measures of spread include the IQR and standard deviation 

Range 
The difference between the lowest and highest values in a data set 

Quartile 
The lower quartile is the value with a quarter of the data below it. The upper quartile has three quarters of data below it. The median and quartiles divide data into four parts with equal numbers of data values 

Percentile 
The ith percentile is the number that falls above i% of the data 

Interquartile Range 
The IQR is the difference between the first and third quartiles. It is usually reported along with the median 

5-Number Summary 
The 5-number summary of a distribution reports the minimum value, Q1, the median, Q3, and the maximum value 

Boxplot 
A boxplot displays the 5-number summary as a central box with whiskers that extend to the non-outlying data values. Boxplots are particularly effective for comparing groups and for displaying possible outliers 
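The quantities a boxplot draws (quartiles, IQR, 5-number summary, and the 1.5 IQR rule-of-thumb for flagging points) can be sketched with Python's `statistics` module. The data are made up. Quartile conventions differ between textbooks and software, so the `"inclusive"` method used here is an assumption and may not match the quartiles computed by hand in a given course.

```python
import statistics

# Hypothetical data set with one extreme value.
data = [2, 4, 4, 5, 6, 7, 8, 9, 30]

# One of several quartile conventions; chosen here only for illustration.
q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

five_number = (min(data), q1, med, q3, max(data))

# Rule-of-thumb fences: 1.5 IQRs beyond either end of the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
flagged = [v for v in data if v < lower_fence or v > upper_fence]

# Whiskers reach the most extreme values that are not flagged.
inside = [v for v in data if lower_fence <= v <= upper_fence]
whiskers = (min(inside), max(inside))

print(five_number, iqr, flagged, whiskers)
```

A flagged point is only a candidate for closer inspection; the rule does not decide whether it is a mistake.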

Mean 
The mean is found by summing all the data and dividing by the count: ȳ = Σy / n. It is usually paired with the standard deviation 

Resistant 
A calculated summary is said to be resistant if outliers have only a small effect on it 
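Resistance can be seen by changing one value into an extreme one and recomputing the summaries. The numbers below are invented for illustration; `statistics.mean` and `statistics.median` are from Python's standard library.

```python
import statistics

# Hypothetical data, then the same data with the largest value made extreme.
original = [10, 12, 13, 15, 20]
with_outlier = [10, 12, 13, 15, 200]

mean_shift = statistics.mean(with_outlier) - statistics.mean(original)
median_shift = statistics.median(with_outlier) - statistics.median(original)

print("mean moved by", mean_shift)      # the mean is pulled toward the outlier
print("median moved by", median_shift)  # the median stays where it was
```

In this sketch the median behaves as a resistant summary and the mean does not.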

Variance 
The variance is the sum of squared deviations from the mean, divided by the count minus 1. It is useful in calculations later in the book 

Standard deviation 
The standard deviation is the square root of the variance 
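The mean, variance, and standard deviation cards chain together, which a short sketch can show. The function names and the data are illustrative; the variance divides by the count minus 1, as the card states.

```python
import math

def mean(values):
    """Sum of the data divided by the count."""
    return sum(values) / len(values)

def variance(values):
    """Sum of squared deviations from the mean, divided by (count - 1)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))

# Hypothetical data.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(data), variance(data), standard_deviation(data))
```

The standard deviation is in the same units as the data, while the variance is in squared units.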

Comparing Distributions 
When comparing the distributions of several groups using histograms or stem-and-leaf displays, consider their shape, center, and spread 

Comparing Boxplots 
When comparing groups with boxplots compare the shapes, compare the medians, compare the IQRs, and check for possible outliers 

Timeplot 
A timeplot displays data that change over time. Often successive values are connected with lines to show trends more clearly. Sometimes a smooth curve is added to the plot to help show long-term patterns and trends 