categorical variable

variable expressed as categories; can be nominal or ordinal


nominal

categorical variable with values that cannot be ranked
ex. sex (male/female), blood type (A, B, AB, O) 

ordinal

categorical variable with values that can be ranked
ex. cloudiness (overcast, mostly cloudy, partly cloudy, sunny) 

quantitative variable

variable that can be expressed as a numerical amount; can be discrete or continuous


frequency

number of occurrences of a value in a data set


descriptive statistics

statistics that describe the data


sample

collection of persons or things on which one or more variables are measured


population

larger group about which inferences are being made, based on a sample


observational unit (case)

individual member of the sample; specimen


statistic

numerical measure calculated from data


measures of center

measures meant to define the "center" or "typical value" of observations in a sample; include mean, median


measures of dispersion

measures meant to describe the spread of observations in a sample; include range, standard deviation, coefficient of variation


robust (resistant)

describes a statistic whose value is relatively unaffected by changes in a small portion of the data
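A minimal sketch of the idea, using made-up numbers: change one observation and compare how far the mean and the median move. The data values are illustrative assumptions, not from any real study.

```python
from statistics import mean, median

data = [2, 3, 3, 4, 5]
with_outlier = [2, 3, 3, 4, 50]  # same data with one observation changed

# How far each measure of center moves when a single value changes
mean_shift = mean(with_outlier) - mean(data)
median_shift = median(with_outlier) - median(data)

print("shift in mean:  ", mean_shift)
print("shift in median:", median_shift)
```

Here the median does not move while the mean does, which is why the median is usually called robust and the mean is not.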


range

difference between largest and smallest observations in a sample


coefficient of variation

the standard deviation expressed as a percentage of the mean
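A short sketch of range and coefficient of variation on an invented sample. It assumes the sample standard deviation (divisor n − 1), which is what `statistics.stdev` computes.

```python
from statistics import mean, stdev

y = [10.0, 12.0, 14.0, 16.0, 18.0]  # illustrative sample, not real data

data_range = max(y) - min(y)          # largest minus smallest observation
s = stdev(y)                          # sample standard deviation (divides by n - 1)
cv_percent = 100 * s / mean(y)        # SD as a percentage of the mean

print("range:", data_range)
print("CV (%):", round(cv_percent, 2))
```

Because the CV is a ratio, it has no units, so it can be used to compare variability between samples measured on different scales.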


statistical inference

process of drawing conclusions about a population based on observations in a sample of that population


frequency distribution

display of frequency of each value in a data set
shape, center, and spread of a distribution are significant when discussing a set of data
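One way to tabulate a frequency distribution, sketched with Python's `collections.Counter`. The blood-type observations are invented for illustration.

```python
from collections import Counter

# Illustrative sample of a categorical variable (made-up observations)
blood_types = ["A", "O", "B", "O", "A", "O", "AB", "O"]

freq = Counter(blood_types)  # frequency of each value
n = len(blood_types)
rel_freq = {value: count / n for value, count in freq.items()}  # relative frequency

for value, count in freq.most_common():
    print(f"{value:>2}  frequency = {count}  relative frequency = {rel_freq[value]:.3f}")
```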

density curve

smooth-curve representation of a frequency distribution
area under density curve between A and B = proportion of Y values between A and B 
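A sketch of the area interpretation, assuming for illustration that the density curve is a normal curve with mean 0 and standard deviation 1 (the choice of curve and of A and B is an assumption, not part of the definition). The area between A and B is the difference of the cumulative areas up to B and up to A.

```python
from statistics import NormalDist

# Assumed density curve: normal with mean 0 and standard deviation 1
curve = NormalDist(mu=0.0, sigma=1.0)

a, b = -1.0, 1.0
# Area under the curve between a and b = proportion of Y values in [a, b]
proportion = curve.cdf(b) - curve.cdf(a)

print(round(proportion, 4))
```

For this curve, roughly 68% of values fall within one standard deviation of the mean.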

p

population proportion


p̂ (phat)

sample proportion; an estimate of p (population proportion)


μ

population mean


σ

population standard deviation


degrees of freedom

n − 1, where n is the number of observations in the sample


ȳ (ybar)

sample mean


first quartile (Q₁)

median of data values in lower half of data set


third quartile (Q₃)

median of data values in upper half of data set


interquartile range (IQR)

difference between third and first quartiles
IQR = Q₃ − Q₁

five-number summary

minimum, first quartile (Q₁), median, third quartile (Q₃), maximum
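A sketch of the summary and the IQR, using the "median of each half" definition of the quartiles. It assumes that when n is odd the overall median is left out of both halves; other conventions exist and can give slightly different quartiles. The function name `five_number_summary` and the data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for at least two observations."""
    ys = sorted(values)
    n = len(ys)
    half = n // 2
    lower = ys[:half]                               # lower half of the data
    upper = ys[half + 1:] if n % 2 else ys[half:]   # upper half (skip median if n is odd)
    return min(ys), median(lower), median(ys), median(upper), max(ys)

data = [7, 1, 3, 9, 5, 11, 13]  # illustrative values
summary = five_number_summary(data)
q1, q3 = summary[1], summary[3]
iqr = q3 - q1  # interquartile range

print("five-number summary:", summary)
print("IQR:", iqr)
```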


sampling distribution

probability distribution of a statistic over repeated random samples; describes how close the resemblance between sample and population is likely to be


sampling distribution of p̂

probabilities of all the possible values of p̂ across random samples of a given size
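A sketch that lists this distribution exactly, assuming a random sample of n independent observations from a large population so that the number of successes is binomial. The function name `sampling_distribution_p_hat` and the values n = 4, p = 0.5 are illustrative assumptions.

```python
from math import comb

def sampling_distribution_p_hat(n, p):
    """Map each possible value of p-hat (j/n) to its probability."""
    return {j / n: comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(n + 1)}

dist = sampling_distribution_p_hat(n=4, p=0.5)

for p_hat, prob in dist.items():
    print(f"p-hat = {p_hat:.2f}  probability = {prob:.4f}")
```

The probabilities sum to 1, and the most likely values of p̂ sit near the population proportion p.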


standard error

describes uncertainty in the sample mean as an estimate of the population mean
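A sketch of the usual calculation for the standard error of the mean, SE = s / √n, where s is the sample standard deviation. The data are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev

y = [10.0, 12.0, 14.0, 16.0, 18.0]  # illustrative sample, not real data

n = len(y)
y_bar = mean(y)    # sample mean
s = stdev(y)       # sample standard deviation (divides by n - 1)
se = s / sqrt(n)   # standard error of the mean

print("mean:", y_bar)
print("SD:  ", round(s, 3))
print("SE:  ", round(se, 3))
```

The SD describes spread among individual observations; the SE shrinks as n grows because a larger sample pins down the mean more precisely.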


standard deviation

describes dispersion of the data


Student's t distributions

theoretical continuous distributions used for construction of confidence intervals; shape depends on degrees of freedom (df)


normal distributions

distributions represented by normal curve: standardized symmetric bell-shaped curve


binomial distributions

for binomial random variable Y: probability that n trials result in j successes
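Written out, the standard binomial formula gives this probability, with p the probability of success on a single trial and j running from 0 to n:

```latex
\Pr\{Y = j\} = \binom{n}{j}\, p^{j} (1 - p)^{n - j}, \qquad j = 0, 1, \dots, n
```

The binomial coefficient counts the ways to place j successes among n trials; the remaining factors give the probability of any one such arrangement.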


binomial random variable conditions (BInS)

Binary outcomes: two possible outcomes for each trial
Independent trials: outcomes of trials are independent of each other
n is fixed: number of trials is set in advance
Same value of p: probability of success on a single trial is same for all trials