71 Cards in this Set

  • Front
  • Back
variables
characteristics recorded about each individual
quantitative: variables in numbers
categorical: names the category
case
individual about whom or which we have data
frequency table
organizing counts
relative frequency table
similar to a normal frequency table, but it gives percentages instead of counts
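A minimal sketch of both kinds of table, assuming the data arrive as a plain list of category labels. The function names and the sample data are hypothetical, chosen only for illustration.

```python
from collections import Counter

def frequency_table(values):
    """Return {category: count}."""
    return dict(Counter(values))

def relative_frequency_table(values):
    """Return {category: percentage of all cases}."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: 100 * count / total for category, count in counts.items()}

# Hypothetical data: ticket class recorded for 8 cases.
classes = ["first", "third", "third", "crew", "crew", "crew", "third", "crew"]
counts = frequency_table(classes)
percents = relative_frequency_table(classes)
```

The percentages in a relative frequency table should add up to 100 (up to rounding).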
distribution
names the different possible categories and how frequently they may occur
area principle
the area occupied by a part of the graph should correspond to the magnitude of the value it represents
bar chart
displays the distribution of a categorical variable, showing the counts for each category next to each other for easy comparison
pie chart
shows the whole group of cases as a circle, sliced into pieces whose sizes are proportional to the fraction of the whole in each category
contingency table
used when looking at two categorical variables together. shows how the individuals are distributed along each variable
marginal distribution
the distribution of one of the variables in a contingency table
conditional distribution
a distribution of one variable for only those individuals satisfying some condition on another variable
independent
in a contingency table when the distribution of one variable is the same for all categories of another
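The marginal and conditional distributions can be sketched as follows, assuming the contingency table is stored as a nested dictionary of counts. The table values and function names are hypothetical.

```python
# Hypothetical contingency table: rows are one categorical variable,
# columns are another; each cell is a count of cases.
table = {
    "first": {"alive": 30, "dead": 10},
    "third": {"alive": 15, "dead": 45},
}

def marginal_row_distribution(table):
    """Proportion of all cases that fall in each row category."""
    grand_total = sum(sum(row.values()) for row in table.values())
    return {name: sum(row.values()) / grand_total for name, row in table.items()}

def conditional_distribution(table, row_name):
    """Distribution of the column variable among the cases in one row."""
    row = table[row_name]
    row_total = sum(row.values())
    return {col: count / row_total for col, count in row.items()}

marginal = marginal_row_distribution(table)
given_first = conditional_distribution(table, "first")
given_third = conditional_distribution(table, "third")

# The variables look independent only if the conditional distributions match.
# Exact equality is used here for simplicity; real data would need a tolerance.
looks_independent = given_first == given_third
```

In this made-up table the conditional distributions differ sharply between rows, so the two variables are not independent.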
segmented bar chart
used rather than a bar chart. each bar represents a "whole" and is divided into the separate parts
Simpson's paradox
when averages are taken across different groups, they can appear contradictory
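A small numeric illustration may help. The counts below are invented so that treatment A has the better success rate inside each group while treatment B has the better rate once the groups are combined.

```python
# Hypothetical (successes, attempts) for two treatments in two groups.
results = {
    "A": {"easy": (9, 10), "hard": (30, 100)},
    "B": {"easy": (80, 100), "hard": (2, 10)},
}

def rate(successes, attempts):
    return successes / attempts

def overall_rate(groups):
    """Success rate after pooling all groups together."""
    total_successes = sum(s for s, _ in groups.values())
    total_attempts = sum(n for _, n in groups.values())
    return total_successes / total_attempts

a_wins_each_group = all(
    rate(*results["A"][g]) > rate(*results["B"][g]) for g in ("easy", "hard")
)
b_wins_overall = overall_rate(results["B"]) > overall_rate(results["A"])
```

The reversal happens because A was mostly used on the hard cases and B mostly on the easy ones, so the pooled rates compare unlike mixes of cases.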
distribution
the bins and the counts in each bin give the distribution for a quantitative variable
histogram
shows the distribution as the heights of bars
relative frequency histogram
displays the percentage of cases in each bin instead of the count
stem-and-leaf displays
contains all the information found in a histogram, satisfies the area principle, and shows the distribution. Preserves the individual data values
dotplot
a simple display that places a dot for each case in the data
Describing distribution of a Histogram
unimodal: one main peak
bimodal: two peaks
multimodal: three or more peaks
uniform: all bars are approximately the same height
symmetric: fold along the middle and the sides will match
tails: thinner ends of a distribution
skewed: if one tail stretches out much farther than the other, then the histogram is skewed to the side of the longer tail
outliers: stragglers that stand off away from the body of the distribution
timeplot
shows data that change over time
center of distribution
median: the middle value that divides the histogram into two equal areas

position of the median in the ordered data: (n+1)/2

n is the number of data values; when n is even, average the two middle values
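The position rule can be sketched in a few lines. The function name is hypothetical; Python's standard library already offers `statistics.median` for the same job.

```python
def median(values):
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: position (n + 1) / 2 counted from 1 is index n // 2 counted from 0.
        return ordered[middle]
    # Even n: (n + 1) / 2 falls between two positions, so average those two values.
    return (ordered[middle - 1] + ordered[middle]) / 2
```

For example, `median([5, 1, 3])` picks the value in position (3+1)/2 = 2 of the sorted list, and `median([4, 1, 3, 2])` averages the values in positions 2 and 3.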
spread
describes numerically how much the data values vary; report it along with the center
range
max-min
quartiles
split the data at the median and find the medians of those two halves
interquartile range/IQR
upper quartile-lower quartile
percentiles
the lower and upper quartiles are aka the 25th and the 75th percentiles of the data
the five-number-summary
Median
Quartiles : Q1 and Q3
Maximum
Minimum
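A sketch of the five-number summary, with the range and IQR derived from it. Quartile conventions vary between textbooks and software; this version assumes the median itself is left out of both halves when n is odd, and the data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]           # values before the median position
    upper = ordered[half + n % 2:]   # values after the median position
    return {
        "min": ordered[0],
        "q1": median(lower),
        "median": median(ordered),
        "q3": median(upper),
        "max": ordered[-1],
    }

data = [1, 3, 4, 6, 7, 8, 10, 12]  # hypothetical data
summary = five_number_summary(data)
iqr = summary["q3"] - summary["q1"]            # upper quartile - lower quartile
data_range = summary["max"] - summary["min"]   # max - min
```

The function needs at least two data values, since each half must contain something to take a median of.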
mean
add up the numbers and divide by n

the point at which the histogram would balance

for skewed data, use the median instead
deviation
how far the data value is from the mean
standard deviation
1) you find the mean (ybar)
2) subtract the mean from each value
3) square each of these deviations
4) add these numbers and divide by n-1
5) take the square root of this number
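The five steps translate directly into code. This is a sketch with a hypothetical function name; the standard library's `statistics.stdev` computes the same quantity.

```python
import math

def standard_deviation(values):
    n = len(values)
    mean = sum(values) / n                      # 1) find the mean (ybar)
    deviations = [y - mean for y in values]     # 2) subtract the mean from each value
    squared = [d ** 2 for d in deviations]      # 3) square each deviation
    variance = sum(squared) / (n - 1)           # 4) add them up and divide by n - 1
    return math.sqrt(variance)                  # 5) take the square root

s = standard_deviation([1, 2, 3, 4, 5])
```

For the values 1 through 5 the mean is 3, the squared deviations sum to 10, and 10/4 = 2.5, so s is the square root of 2.5.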
z-scores
how many standard deviations something is from the mean

z= (y-ybar)/s

s is the standard deviation
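A short sketch of standardizing a whole data set with z = (y-ybar)/s. The scores are hypothetical.

```python
from statistics import mean, stdev

def z_scores(values):
    ybar = mean(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    return [(y - ybar) / s for y in values]

scores = [60, 70, 80, 90, 100]  # hypothetical data
z = z_scores(scores)
```

After standardizing, the z-scores have mean 0 and standard deviation 1, whatever the original units were.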
normal model
appropriate for distributions whose shapes are unimodal and roughly symmetric

standard Normal model (used with z-scores):
mean=0
SD=1
standardized value
found by subtracting the mean and dividing by the standard deviation
68-95-99.7 Rule
In a Normal model, 68% of the values fall within one standard deviation of the mean, 95% fall within two standard deviations of the mean, and 99.7% fall within three standard deviations of the mean
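The three percentages can be checked numerically. This sketch relies on the fact that, for a Normal model, the fraction of values within k standard deviations of the mean equals erf(k/√2); the function name is hypothetical.

```python
import math

def share_within(k):
    """Fraction of a Normal model within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

# Percentages for one, two, and three standard deviations.
shares = [100 * share_within(k) for k in (1, 2, 3)]
```

The rule's figures are rounded: the computed values are close to, but not exactly, 68, 95, and 99.7.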
normal percentile
gives the percentage of values in a standard normal distribution found at that z-score or below
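A sketch of looking up a Normal percentile with the standard library instead of a printed table. The mean, standard deviation, and score below are hypothetical.

```python
from statistics import NormalDist

def normal_percentile(y, mean, sd):
    """Percentage of a Normal model at or below the value y."""
    z = (y - mean) / sd               # standardize first
    return 100 * NormalDist().cdf(z)  # area at or below z in the standard Normal model

pct = normal_percentile(680, mean=500, sd=100)  # hypothetical score, z = 1.8
```

A value equal to the mean has z = 0 and sits at the 50th percentile.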
normal probability plot
if the data are roughly Normal, the points on the plot should fall close to a diagonal straight line
scatterplot
shows the relationship between two quantitative variables measured on the same cases
direction of scatterplots
runs from upper left to the lower right- negative
lower left to upper right-positive
response and explanatory variable
explanatory is on the x-axis
response is on the y-axis
correlation
strength of the linear association between two quantitative variables

between -1 and 1
how to find the correlation coefficient
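1) compute (x-xbar)(y-ybar) for each pair, then add up these products
2) divide this sum by the product (n-1) × s_x × s_y, where s_x and s_y are the standard deviations of x and y

r = sum of (x-xbar)(y-ybar) / ((n-1) s_x s_y)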
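The two steps can be sketched as follows, assuming paired lists of equal length. The function name is hypothetical.

```python
import math

def correlation(xs, ys):
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    # Sample standard deviations of x and y (dividing by n - 1).
    s_x = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - ybar) ** 2 for y in ys) / (n - 1))
    # 1) multiply the paired deviations and add up the products
    cross_sum = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    # 2) divide by (n - 1) * s_x * s_y
    return cross_sum / ((n - 1) * s_x * s_y)

r = correlation([1, 2, 3, 4], [1, 3, 2, 4])
```

Points that fall exactly on a line with positive slope give r = 1, and points exactly on a line with negative slope give r = -1.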
predicted value
the value of y-hat found for each x-value in the data. Found by substituting the x-value in the regression equation.
residual
the difference between the observed value and its associated predicted value

y - y-hat= residual
linear model
y-hat = b0 + b1x

y-hat is the predicted value
b0 is the y-intercept
b1 is the slope
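A sketch of predicted values and residuals for a line y-hat = b0 + b1x. The intercept, slope, and data are hypothetical values picked for easy arithmetic.

```python
def predict(b0, b1, x):
    """Predicted value y-hat for a given x."""
    return b0 + b1 * x

def residuals(b0, b1, xs, ys):
    """Observed value minus predicted value, for each case."""
    return [y - predict(b0, b1, x) for x, y in zip(xs, ys)]

# Hypothetical model y-hat = 1 + 2x and three observed cases.
xs = [0, 1, 2]
ys = [1.5, 2.5, 5.5]
res = residuals(1, 2, xs, ys)
```

A positive residual means the model underestimated that case, and a negative residual means it overestimated.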
R-squared
the square of the correlation between y and x
the fraction of the variability of y accounted for by the least squares regression on x
how successful the regression is in linearly relating y to x
least squares
specifies the unique line that minimizes the variance of the residuals or, equivalently, the sum of the squared residuals
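A sketch of fitting the least squares line and computing R-squared from it. It uses the standard closed-form solution for simple regression: the slope is the sum of (x-xbar)(y-ybar) divided by the sum of (x-xbar) squared, and the line passes through (xbar, ybar). Function names and data are hypothetical.

```python
def least_squares_fit(xs, ys):
    """Return (b0, b1) for the line y-hat = b0 + b1 * x."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar  # the line passes through (xbar, ybar)
    return b0, b1

def r_squared(xs, ys):
    """Fraction of the variability of y accounted for by the fitted line."""
    b0, b1 = least_squares_fit(xs, ys)
    ybar = sum(ys) / len(ys)
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # squared residuals
    sst = sum((y - ybar) ** 2 for y in ys)                       # total variability of y
    return 1 - sse / sst

xs = [1, 2, 3, 4]  # hypothetical data
ys = [1, 3, 2, 4]
b0, b1 = least_squares_fit(xs, ys)
r2 = r_squared(xs, ys)
```

For these data the correlation is 0.8, and R-squared comes out as 0.8 squared, or 0.64.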
good linear model?
1) scatterplot is linear
2) calculate r. if r is close to 1 or -1 then it is a good model
3) look at residual plot and see no pattern
leverage
data points whose x-values are far from the mean of x exert high leverage on a linear model. They pull the line close to themselves, so they can have a large effect on the line
Influential point
a point that would give a very different model if it were omitted; points with high leverage are often influential
random
an event is random if we know what outcomes could happen, but not which particular values did or will happen
simulation
models random events by using random numbers to specify event outcomes with relative frequencies that correspond to the real-world relative frequencies we are trying to model
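A sketch of a simulation, using an invented question: how often does at least one six appear in four rolls of a fair die? Random numbers stand in for the rolls, and the seed is fixed only so the run is repeatable.

```python
import random

def simulate_at_least_one_six(trials, seed=0):
    """Estimate the chance of at least one six in four rolls of a fair die."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        rolls = [rng.randint(1, 6) for _ in range(4)]  # one simulated trial
        if 6 in rolls:
            hits += 1
    return hits / trials  # relative frequency across all trials

estimate = simulate_at_least_one_six(20_000)
exact = 1 - (5 / 6) ** 4  # known answer, for comparison only
```

With many trials the simulated relative frequency should land close to the exact probability, which is what justifies using simulation when no exact answer is available.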
Census Sample
survey of the whole population
SRS
simple random sample: a sample of size n in which every set of n elements in the population has an equal chance of selection
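A sketch of drawing an SRS, assuming the population is available as a list (a sampling frame). The function name and population are hypothetical; `random.sample` draws without replacement.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct individuals; every set of n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

population = list(range(1, 101))  # hypothetical frame of 100 ID numbers
srs = simple_random_sample(population, 10, seed=1)
```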
Convenience Sample
biased. Individuals who are conveniently available
Systematic Sampling
select every nth individual from the sampling frame, starting from a randomly chosen individual
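A sketch of systematic sampling from an ordered frame. The random starting point is an assumption of this sketch, and the names are hypothetical.

```python
import random

def systematic_sample(frame, step, seed=None):
    """Take every step-th individual, starting at a random position."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random start among the first `step` individuals
    return frame[start::step]

frame = list(range(100))  # hypothetical ordered frame
sample = systematic_sample(frame, 10, seed=1)
```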
Stratified Sample
divides the population into homogeneous groups called strata, then takes an SRS within each stratum
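A sketch of stratified sampling, assuming the strata are already known and stored as lists. The names and data are hypothetical, and the same number is drawn from every stratum for simplicity.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """strata: {stratum name: list of individuals}. Take an SRS in each one."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, per_stratum)
        for name, members in strata.items()
    }

# Hypothetical strata: students grouped by class year.
strata = {
    "freshman": [f"F{i}" for i in range(50)],
    "senior": [f"S{i}" for i in range(30)],
}
picked = stratified_sample(strata, 5, seed=1)
```

Every stratum is guaranteed to appear in the sample, which an ordinary SRS does not promise.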
Cluster Sampling
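divides the population into heterogeneous groups (clusters), each resembling the whole population, then takes an SRS of clusters and studies the individuals in the chosen clusters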
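A sketch of cluster sampling in which whole clusters are chosen at random and everyone inside a chosen cluster is included. The names and data are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """clusters: {cluster name: list of individuals}. Sample whole clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # SRS of cluster names
    return [person for name in chosen for person in clusters[name]]

# Hypothetical clusters: four dorm floors of five residents each.
clusters = {f"floor{k}": [f"floor{k}-r{i}" for i in range(5)] for k in range(4)}
people = cluster_sample(clusters, 2, seed=1)
```

Unlike stratified sampling, only some of the groups end up in the sample.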
Multistage Sampling
Sampling done in stages
Forms of Bias
1) Response Bias: wording of the question
2) Non-Response Bias: you are present, but choose not to respond
3) Voluntary Response: responding because you have a strong feeling
4) Undercoverage: you are not present and hence not represented in the sample
Observational Study
no manipulation of factors has been employed
Retrospective Study
subjects are selected and then their previous conditions or behaviors are determined
Prospective Study
observational study in which subjects are followed to observe future outcomes
experiment
manipulates factor levels to create treatments and compare the responses of the subject groups across treatment levels
Principles of Experimental Design
1) Control: areas we know may have an effect, but are not factors being studied
2) Randomize
3) Replication
Block: reduces variation (not required)
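The Randomize principle can be sketched as a random assignment of subjects to treatments. This is one simple scheme, shuffle then deal out in turn, and the names are hypothetical.

```python
import random

def randomize(subjects, treatments, seed=None):
    """Randomly assign subjects to treatments in groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # chance, not the experimenter, decides the order
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

subjects = [f"subject{i}" for i in range(12)]  # hypothetical subjects
groups = randomize(subjects, ["treatment", "placebo"], seed=1)
```

Having several subjects in each group is what provides replication.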
Statistically Significant
an observed difference is too large for us to believe that it is likely to have occurred naturally
Control Group
group given the placebo or baseline treatment
single-blind
either those who can influence the results do not know the treatment assignments, or those who evaluate the results don't know
double-blind
when everyone is blinded
block
when groups of experimental units are similar, we gather them into blocks. This isolates the variability between blocks so that we may see the differences more clearly
confounded
the effects of two or more factors are associated with each other