42 Cards in this Set
- Front
- Back
What is the definition of statistics?
|
The science of conducting studies to collect, organize, summarize, analyze, and draw conclusions from data.
- no data = no statistics |
|
descriptive statistics
|
describes a situation using the data collected (e.g. a census)
|
|
inferential statistics
|
makes inferences by generalizing data from a sample to a population (uses probability)
|
|
populations
|
all subjects (human and non human) included in study
|
|
sample
|
a group of subjects selected from population
|
|
qualitative variables
|
can be placed into categories (gender)
|
|
quantitative variables
|
numerical, can be ordered/ranked (age, weight, temp)
|
|
discrete variables
|
can be assigned countable values (e.g. 0, 1, 2, 3)
|
|
continuous variables
|
can take any of an infinite number of values between two given values, including fractions and decimals (e.g. temperature)
|
|
nominal/category variables
|
categories that cannot be ranked
|
|
ordinal variables
|
categories that can be ranked (S, M, L)
|
|
interval variables
|
rankable data w/ precise differences, no true zero (IQ, temp, SAT score)
|
|
ratio variables
|
a true zero exists, so ratios are meaningful ("twice as much" rule)
|
|
random sampling
|
subjects are selected by chance (e.g. a lottery)
|
|
systematic sampling
|
select every kth subject
|
|
stratified sampling
|
divide population into groups (called strata), sample from each group randomly.
|
|
cluster sampling
|
divide population into "clusters", use entire clusters as samples
|
|
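A minimal sketch of the four sampling methods above, using only Python's standard library. The function names, the `seed` parameter, and the subject IDs are hypothetical, chosen for illustration.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Random sampling: select n subjects by chance."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, k, start=0):
    """Systematic sampling: select every kth subject, beginning at index `start`."""
    return population[start::k]

def stratified_sample(strata, n_per_stratum, seed=0):
    """Stratified sampling: randomly sample from each group (stratum)."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, seed=0):
    """Cluster sampling: randomly pick whole clusters and use every subject in them."""
    rng = random.Random(seed)
    chosen = rng.sample(list(clusters), n_clusters)
    return [subject for name in chosen for subject in clusters[name]]

population = list(range(1, 21))            # hypothetical subject IDs 1..20
print(systematic_sample(population, k=5))  # [1, 6, 11, 16]
```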
other methods: convenience sampling
|
use subjects only because they are convenient (voluntary participation) - not representative of entire population
|
|
observational study
|
researcher observes & draws conclusions - can never be used to prove anything; can only suggest a relationship, not prove causality
|
|
experimental study
|
researcher manipulates one variable to determine how it affects a second variable (can sometimes prove causality)
|
|
frequency distribution
|
1) find highest/lowest value, find range, divide range by # of classes = width
2) start w/ appropriate low point, add width until # of classes reached
3) class boundaries: subtract 0.5 from each low end, add 0.5 to each high end
4) tally data, write frequency & cumulative frequency |
|
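The four steps for a frequency distribution can be sketched roughly as below. This assumes whole-number data and rounds the class width up to the next whole number so every value lands in a class; that rounding choice, the function name, and the scores are assumptions for illustration.

```python
def frequency_distribution(data, num_classes):
    """Build class limits, class boundaries, frequencies, and cumulative frequencies."""
    low, high = min(data), max(data)
    value_range = high - low
    # Step 1: width = range / number of classes, rounded up to the next whole number
    width = value_range // num_classes + 1
    rows = []
    cumulative = 0
    for i in range(num_classes):
        # Step 2: start at the lowest value and keep adding the width
        lower = low + i * width
        upper = lower + width - 1
        # Step 4: tally the values that fall inside this class
        freq = sum(lower <= x <= upper for x in data)
        cumulative += freq
        rows.append({
            "limits": (lower, upper),
            # Step 3: boundaries are 0.5 below the low end and 0.5 above the high end
            "boundaries": (lower - 0.5, upper + 0.5),
            "frequency": freq,
            "cumulative": cumulative,
        })
    return rows

scores = [12, 15, 17, 21, 22, 22, 25, 28, 30, 31]  # hypothetical data
for row in frequency_distribution(scores, num_classes=4):
    print(row)
```

With these scores the range is 19, so the width is 5 and the classes are 12-16, 17-21, 22-26, and 27-31.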
measures of central tendency
|
mean
median
mode
midrange ((lowest value + highest value)/2) |
|
mean
|
balance point for data - sensitive to extremes/outliers
sample mean: x̄ (x-bar)
population mean: μ |
|
median
|
halfway point for data - resistant to extremes/outliers
|
|
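A small sketch of the measures of central tendency, showing how one outlier pulls the mean but barely moves the median. It uses Python's standard `statistics` module; the `midrange` helper and the data values are hypothetical.

```python
import statistics

def midrange(values):
    """Midrange: halfway between the lowest and highest values."""
    return (min(values) + max(values)) / 2

data = [2, 3, 3, 4, 5]          # hypothetical values
with_outlier = data + [100]     # same data plus one extreme value

print(statistics.mean(data))            # 3.4
print(statistics.median(data))          # 3
print(statistics.mode(data))            # 3
print(midrange(data))                   # 3.5

# The mean jumps, the median barely changes
print(statistics.mean(with_outlier))    # 19.5
print(statistics.median(with_outlier))  # 3.5
```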
statistic
|
uses data values from a sample
|
|
parameter
|
uses data values from a population
|
|
measures of variation
|
variance
standard deviation
coefficient of variation
range |
|
variance
|
average of the squares of the distance each value is from the mean
population variance: σ² (sigma squared)
sample variance: s² |
|
standard deviation
|
square root of variance
population std dev: σ
sample std dev: s |
|
coefficient of variation
|
allows comparison between data sets with different units
(std dev / mean) × 100% (the larger the percent, the more variation there is in the data set) |
|
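A rough sketch of variance, standard deviation, and coefficient of variation. Note that the sample variance divides by n - 1 rather than n, a detail the cards above do not spell out. Function names and data are hypothetical.

```python
import math

def population_variance(values):
    """Average of the squared distances from the population mean (divide by N)."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Same idea for a sample, but divide by n - 1 instead of n."""
    xbar = sum(values) / len(values)
    return sum((x - xbar) ** 2 for x in values) / (len(values) - 1)

def coefficient_of_variation(values):
    """Sample std dev divided by the mean, expressed as a percent."""
    xbar = sum(values) / len(values)
    s = math.sqrt(sample_variance(values))
    return s / xbar * 100

data = [2, 4, 4, 4, 5, 5, 7, 9]                # hypothetical values, mean = 5
print(population_variance(data))               # 4.0
print(math.sqrt(population_variance(data)))    # 2.0
print(sample_variance(data))
print(coefficient_of_variation(data))
```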
Rule of Thumb Estimate
|
rough estimate of std dev, provided distribution is unimodal and roughly symmetric
s = range/4 |
|
Empirical Rule/Normal Rule
|
Approximately:
- 68% of data values fall within 1 standard deviation of the mean (mean +/- s)
- 95% fall within 2 standard deviations
- 99.7% fall within 3 standard deviations |
|
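The rule of thumb estimate and the empirical rule can be sketched as two small helpers. The function names and the example mean and standard deviation are hypothetical; both rules assume a roughly symmetric, unimodal distribution.

```python
def rule_of_thumb_std(values):
    """Rough estimate of the standard deviation: range / 4."""
    return (max(values) - min(values)) / 4

def empirical_rule_intervals(mean, s):
    """Intervals expected to hold about 68%, 95%, and 99.7% of the data."""
    return {
        "68%": (mean - s, mean + s),
        "95%": (mean - 2 * s, mean + 2 * s),
        "99.7%": (mean - 3 * s, mean + 3 * s),
    }

print(rule_of_thumb_std([10, 14, 18, 22, 30]))    # (30 - 10) / 4 = 5.0
print(empirical_rule_intervals(mean=100, s=15))
# {'68%': (85, 115), '95%': (70, 130), '99.7%': (55, 145)}
```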
General Purpose Rule/Chebyshev's Theorem
|
at least (1 - 1/k²) × 100% of the data will fall between x̄ - ks and x̄ + ks (for k > 1)
- find k by solving x̄ - ks = lower given value
- plug the k value into (1 - 1/k²) × 100%
- at least this % of data lies between the two given values |
|
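A short sketch of the Chebyshev steps above: recover k from the lower bound, then compute the minimum percent. The function names and the mean, standard deviation, and bound are hypothetical.

```python
def k_from_lower_bound(mean, s, lower):
    """Solve mean - k*s = lower for k."""
    return (mean - lower) / s

def chebyshev_min_percent(k):
    """Minimum percent of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem is only informative for k > 1")
    return (1 - 1 / k ** 2) * 100

# Hypothetical example: mean 50, std dev 5, interval from 40 to 60
k = k_from_lower_bound(mean=50, s=5, lower=40)
print(k)                          # 2.0
print(chebyshev_min_percent(k))   # 75.0, so at least 75% lies between 40 and 60
```

Unlike the empirical rule, this bound holds for any distribution shape, which is why it is a lower limit rather than an approximation.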
Measures of Position/Location
|
z-score
percentiles
Interquartile Range (IQR) |
|
z-score, standard score
|
same as k in General Purpose Rule, represents # of std deviations that data value is away from mean
z = (x - x̄)/s |
|
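As a quick illustration of the z-score formula, the sketch below compares two scores from tests with different means and standard deviations. The function name and all numbers are hypothetical.

```python
def z_score(x, mean, s):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / s

# Hypothetical scores on two different exams
z_exam_a = z_score(x=85, mean=75, s=5)
z_exam_b = z_score(x=90, mean=80, s=10)

print(z_exam_a)  # 2.0
print(z_exam_b)  # 1.0
# The exam A score is relatively better, even though the raw score is lower.
```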
percentile
|
divides data set into 100 equal groups
percentile of a value x = [(# of values below x + 0.5) / total # of values] × 100%
position of a given percentile: c = (n × p)/100, where n = total # of values and p = given percentile |
|
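Both percentile formulas can be sketched as below. The card only gives c = (n × p)/100; the handling of c here (round up if it is not whole, otherwise average the c-th and next values) is one common textbook convention and is an assumption. Function names and data are hypothetical.

```python
import math

def percentile_rank(values, x):
    """Percentile of x: (# of values below x + 0.5) / total # of values, as a percent."""
    below = sum(v < x for v in values)
    return (below + 0.5) * 100 / len(values)

def value_at_percentile(values, p):
    """Data value at the p-th percentile, using c = n * p / 100."""
    if not 0 < p < 100:
        raise ValueError("p must be between 0 and 100 (exclusive)")
    ordered = sorted(values)
    c = len(ordered) * p / 100
    if c != int(c):
        # Assumed convention: c not whole, so round up and take that value
        return ordered[math.ceil(c) - 1]
    # Assumed convention: c whole, so average the c-th and (c+1)-th values
    c = int(c)
    return (ordered[c - 1] + ordered[c]) / 2

data = [2, 3, 5, 6, 8, 10, 12, 15, 18, 20]   # hypothetical values
print(percentile_rank(data, 12))       # 65.0
print(value_at_percentile(data, 25))   # c = 2.5, round up to the 3rd value: 5
print(value_at_percentile(data, 60))   # c = 6, average of 6th and 7th values: 11.0
```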
Interquartile Range (IQR)
|
range of middle 50% of data
IQR = Q3 - Q1
menu path: Analyze > Descriptive Statistics > Explore |
|
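A minimal sketch of the IQR using `statistics.quantiles` from Python's standard library (Python 3.8+). Quartile conventions differ between textbooks and software, so the values here may not match a hand calculation or the menu output mentioned above exactly; the data are hypothetical.

```python
import statistics

def interquartile_range(values):
    """IQR = Q3 - Q1, with quartiles from statistics.quantiles (default method)."""
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

data = [2, 3, 5, 6, 8, 10, 12, 15, 18, 20]   # hypothetical values
q1, q2, q3 = statistics.quantiles(data, n=4)
print(q1, q2, q3)
print(interquartile_range(data))
```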
empirical probability
|
probabilities for outcomes based on observation
|
|
mutually exclusive events
|
events A and B cannot occur at the same time
P(A or B) = P(A) + P(B) |
|
not mutually exclusive
|
events A and B can occur at the same time
P(A or B) = P(A) + P(B) - P(A and B) |
|
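The two addition rules above can be checked with exact fractions. This sketch assumes a standard 52-card deck as the example; the function name is hypothetical.

```python
from fractions import Fraction

def p_a_or_b(p_a, p_b, p_a_and_b=Fraction(0)):
    """P(A or B) = P(A) + P(B) - P(A and B).

    For mutually exclusive events P(A and B) is 0, so the last term drops out.
    """
    return p_a + p_b - p_a_and_b

# Mutually exclusive: one card cannot be both a king and a queen
p_king_or_queen = p_a_or_b(Fraction(4, 52), Fraction(4, 52))
print(p_king_or_queen)   # 2/13

# Not mutually exclusive: the king of hearts is both a king and a heart
p_king_or_heart = p_a_or_b(Fraction(4, 52), Fraction(13, 52), Fraction(1, 52))
print(p_king_or_heart)   # 4/13
```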
independent events
|
A does not affect B (e.g. flip a nickel and flip a dime)
P(A and B) = P(A) x P(B) |
|
dependent events
|
outcome or occurrence of A affects the outcome or occurrence of B, so that the probability changes
P(A and B) = P(A) x P(B|A), where P(B|A) is the conditional probability of B given A
When P(A) x P(B) does not equal P(A and B), A and B are not independent, so there is a relationship between A and B |