31 Cards in this Set
- Front
- Back
- 3rd side (hint)
categorical data definition |
data with labels as values (non-numerical) |
ex: male/female, color, dog breed, class subject, etc. |
|
quantitative data definition |
data that is of a measurable quantity |
ex: population, grams, seconds, height, etc. |
|
ogive graph |
cumulative frequency distribution graph |
|
|
symmetric graph |
the left and right sides are roughly mirror images about the center |
|
|
skewed right |
a graph with a longer tail on the right; the mean is typically greater than the median |
|
|
skewed left |
a graph with a longer tail on the left; the mean is typically less than the median |
|
|
measures of center |
mean, median, mode |
|
|
measures of spread |
q1, q3, IQR, range, standard deviation |
|
|
measures of shape |
unimodal, bimodal, multimodal, skewed, uniform, bell-shaped |
|
|
mean |
the average: (x₁ + x₂ + ... + xₙ) / n, the sum of all n values divided by n |
|
|
median |
the middle value in an ordered data list |
|
|
interquartile range |
IQR = q3 − q1 |
|
|
outlier |
any value below q1 − 1.5(IQR) or above q3 + 1.5(IQR) |
|
|
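The 1.5(IQR) rule on the card above can be sketched in a few lines. This is a minimal illustration, assuming the median-of-halves method for q1 and q3 (other quartile conventions give slightly different fences); the function names and the sample data are made up for the example.

```python
from statistics import median

def quartiles(values):
    """Return (q1, q3) using the median-of-halves method.

    Assumption: the median itself is excluded from both halves when the
    number of values is odd. Needs at least two values.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]     # values below the median position
    upper = data[-half:]    # values above the median position
    return median(lower), median(upper)

def outlier_fences(values):
    """Return (lower fence, upper fence) = (q1 - 1.5*IQR, q3 + 1.5*IQR)."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values):
    """Return the values that fall outside the fences."""
    low, high = outlier_fences(values)
    return [v for v in values if v < low or v > high]

sample = [2, 4, 5, 6, 7, 8, 9, 30]   # made-up data
print(outlier_fences(sample))         # q1 = 4.5, q3 = 8.5, IQR = 4
print(find_outliers(sample))
```

For this sample the fences are −1.5 and 14.5, so only 30 is flagged.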
resistant |
relatively unaffected by outliers |
|
|
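A quick way to see what "resistant" means is to compare the mean and median before and after one extreme value is introduced. The data below are made up for illustration.

```python
from statistics import mean, median

clean = [1, 2, 3, 4, 5]            # made-up data
with_outlier = [1, 2, 3, 4, 100]   # same data with the largest value replaced

# The mean moves a lot; the median does not move at all.
print(mean(clean), median(clean))
print(mean(with_outlier), median(with_outlier))
```

Here the mean jumps from 3 to 22 while the median stays at 3, which is why the median is called resistant and the mean is not.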
empirical rule |
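68-95-99.7 rule: in a normal distribution, about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean |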
|
|
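The three percentages in the empirical rule can be checked against the normal curve directly. This sketch uses `NormalDist` from Python's standard `statistics` module; the helper name `share_within` is made up for the example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

The printed proportions come out close to 0.68, 0.95, and 0.997, which is where the rule gets its name.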
standard normal distribution z-score |
z-score = (x − µ) / σ |
denotes how many standard deviations a value is from the mean |
|
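As a small worked example of the z-score formula, the sketch below standardizes one value from a made-up data set. It assumes the data are treated as a whole population, so the population standard deviation (`pstdev`) is used.

```python
from statistics import mean, pstdev

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data
mu = mean(data)                    # 5
sigma = pstdev(data)               # 2.0 (population standard deviation)

z = z_score(9, mu, sigma)
print(z)
```

The value 9 sits 4 units above the mean of 5, and with a standard deviation of 2 that is a z-score of 2.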
rules of means and variances |
if y = bx + a, then µ(y) = bµ(x) + a and σ²(y) = b²σ²(x) |
only the mean is affected by addition/subtraction; variance = σ² |
|
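The rules for a linear transformation y = bx + a can be verified numerically: the mean is scaled by b and shifted by a, while the variance is only scaled by b². The data and constants below are made up, and population variance (`pvariance`) is assumed.

```python
from math import isclose
from statistics import mean, pvariance

x = [1, 3, 5, 7]   # made-up data
a, b = 10, 3       # shift and scale, chosen for illustration
y = [b * xi + a for xi in x]

# Mean: scaled by b, then shifted by a.
print(mean(y), b * mean(x) + a)

# Variance: scaled by b squared; the shift a has no effect.
print(pvariance(y), b ** 2 * pvariance(x))

rules_hold = (isclose(mean(y), b * mean(x) + a)
              and isclose(pvariance(y), b ** 2 * pvariance(x)))
print(rules_hold)
```

With these numbers the mean of x is 4 and its variance is 5, so y has mean 22 and variance 45.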
coefficient of determination |
gives the proportion of the variation in y that is explained by the model |
r² value |
|
correlation coefficient |
measures the strength and direction of linear association (from −1 to 1; 0 means no linear association) |
r value |
|
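The two cards above (r² and r) are linked: for a least-squares line with one predictor, squaring the correlation coefficient gives the coefficient of determination. The sketch below computes r from its definition on made-up data; the function name `correlation` is chosen for the example.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]   # made-up data
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
r_squared = r ** 2
print(round(r, 4), round(r_squared, 4))
```

For this data r is about 0.77 and r² is 0.6, so about 60% of the variation in y is explained by the linear model.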
residual |
residual = observed − predicted = y − ŷ |
|
|
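To make the residual definition concrete, the sketch below fits a least-squares line to made-up data and subtracts each predicted value from the observed one. The helper names are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line: predicted y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

def residuals(xs, ys):
    """Residual for each point: observed y minus predicted y."""
    a, b = least_squares_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

xs = [1, 2, 3, 4, 5]   # made-up data
ys = [2, 4, 5, 4, 5]

res = residuals(xs, ys)
print([round(e, 2) for e in res])
```

A positive residual means the point lies above the line and a negative one means it lies below. For a least-squares line the residuals always sum to zero (up to rounding).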
description of scatter plot |
form, direction, strength |
|
|
correlation |
a linear relationship between two variables (only linear relationship) |
association = two variable relationship |
|
ways to determine appropriateness |
look at the scatterplot of y vs. x; look at the residual plot (residuals vs. x) for either a pattern or a random scatter of points (a random scatter supports the model; a pattern does not) |
whether or not the model should be used |
|
ways to determine goodness of fit |
look at the correlation coefficient; look at the coefficient of determination |
how accurately the model fits the data numerically |
|
what relationship needs to be linear for an exponential relationship |
log (y) vs. x |
log (y) = a + bx |
|
what relationship needs to be linear for a power relationship |
log (y) vs. log (x) |
log (y) = a + b*log (x) |
|
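The exponential and power cards above both rely on the same idea: after the right log transformation, the points fall on a straight line. The sketch below checks this on made-up data generated from y = 3·2^x (exponential) and y = 2·x³ (power), using base-10 logs and a hypothetical `fit_line` helper.

```python
from math import log10

def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - b * mean_x, b

xs = [1, 2, 3, 4, 5]

# Exponential data: log(y) vs. x is linear, log(y) = a + b*x.
exp_ys = [3 * 2 ** x for x in xs]
exp_a, exp_b = fit_line(xs, [log10(y) for y in exp_ys])
print(10 ** exp_a, 10 ** exp_b)   # recovers roughly 3 and 2

# Power data: log(y) vs. log(x) is linear, log(y) = a + b*log(x).
pow_ys = [2 * x ** 3 for x in xs]
pow_a, pow_b = fit_line([log10(x) for x in xs], [log10(y) for y in pow_ys])
print(10 ** pow_a, pow_b)         # recovers roughly 2 and 3
```

In the exponential case the slope is the log of the growth factor; in the power case the slope is the exponent itself.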
what relationship needs to be linear for a quadratic relationship |
√(y) vs. x |
|
|
marginal frequencies |
total frequencies for each row or column in a two way table |
|
|
marginal distributions |
marginal frequencies / table total |
the bigger version of conditional relative frequencies |
|
conditional relative frequencies |
individual cell count / row or column total |
the smaller version of marginal distributions |
|
notation for conditional distributions |
P (A|B) = x |
given A and B as categories in a two way table |
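
The last four cards (marginal frequencies, marginal distributions, conditional relative frequencies, and P(A|B)) can be tied together with one small two-way table. The counts and the category labels A1, A2, B1, B2 below are made up for illustration.

```python
# Rows are categories of A, columns are categories of B (made-up counts).
table = {
    "A1": {"B1": 10, "B2": 30},
    "A2": {"B1": 20, "B2": 40},
}
columns = ["B1", "B2"]

# Marginal frequencies: totals for each row and each column.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}
col_totals = {col: sum(table[row][col] for row in table) for col in columns}
grand_total = sum(row_totals.values())

# Marginal distributions: marginal frequencies / table total.
row_marginal = {row: total / grand_total for row, total in row_totals.items()}
col_marginal = {col: total / grand_total for col, total in col_totals.items()}

def conditional(a, b):
    """P(A = a | B = b): cell count divided by the total of column b."""
    return table[a][b] / col_totals[b]

print(row_totals, col_totals, grand_total)
print(row_marginal, col_marginal)
print(conditional("A1", "B1"))   # 10 out of the 30 in column B1
```

Conditioning on a column divides by that column's total rather than the table total, which is the difference between a conditional relative frequency and a marginal distribution.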