Statistics

the science of collecting, organizing, and interpreting data


population

the larger group we are attempting to study


Population parameters

the true numbers that describe the population. They are abstract: we don't actually know what they are yet, so we estimate them and try to get as close to them as we can


Sample

the smaller subset of the population that we actually study, measure, survey


sample statistics

the stats that we actually measure or calculate from the sample. Ex. mean, median, mode, etc.
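A minimal Python sketch of the parameter vs. statistic distinction. The population here is invented (made-up heights), which is the only reason its "true" mean can be computed at all; in a real study only the sample statistic would be available.

```python
import random
import statistics

# Hypothetical population: 1,000 made-up heights in cm (illustrative data only).
rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = [rng.gauss(170, 10) for _ in range(1000)]

# Population parameter: normally unknown. We can compute it here only
# because we invented the whole population.
population_mean = statistics.mean(population)

# Sample: the smaller subset we actually measure.
sample = rng.sample(population, 50)

# Sample statistic: calculated from the sample to estimate the parameter.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.1f}")
print(f"statistic (sample mean):     {sample_mean:.1f}")
```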


5 Steps to a Statistical Study

State goal
Decide sample size
Collect
Calculate
Conclude

Simple Random Sampling

choose items so that every possible sample of the same size has an equal chance of being selected (Best)
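One way to sketch a simple random sample in Python, assuming a hypothetical population of 20 ID numbers. The standard library's random.sample draws without replacement, so every group of 5 is equally likely.

```python
import random

# Hypothetical population of 20 ID numbers (illustrative only).
population = list(range(1, 21))

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Draw 5 members without replacement; every possible group of 5
# has the same chance of being chosen.
srs = rng.sample(population, 5)
print(srs)
```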


systematic sampling

a simple system, such as choosing every kth member of the population
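A rough Python sketch of the every-kth idea. The population of 100 IDs and the interval k = 10 are assumptions for illustration; starting at a random position within the first k items is a common convention, not something stated in these notes.

```python
import random

population = list(range(1, 101))  # hypothetical IDs 1..100
k = 10                            # assumed sampling interval

rng = random.Random(1)            # fixed seed so the sketch is repeatable
start = rng.randrange(k)          # random starting index among the first k items

# Take every kth member from the starting position onward.
systematic = population[start::k]
print(systematic)
```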


convenience sampling

samples that are convenient to select


stratified sampling

partition the population into at least two strata then draw a sample from each
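A small Python sketch of stratified sampling, assuming two made-up strata and a 10% draw from each. The stratum names, sizes, and the 10% fraction are all hypothetical.

```python
import random

# Hypothetical population partitioned into two strata (made-up labels).
strata = {
    "freshmen": [f"F{i}" for i in range(60)],
    "seniors": [f"S{i}" for i in range(40)],
}

rng = random.Random(7)  # fixed seed so the sketch is repeatable
fraction = 0.10         # assumed: sample 10% of each stratum

# Draw a separate simple random sample from each stratum.
stratified = {
    name: rng.sample(members, round(len(members) * fraction))
    for name, members in strata.items()
}

for name, chosen in stratified.items():
    print(name, chosen)
```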


bias

if the sample doesn't reflect the population


selection bias

the sample we pick reflects our bias


participation bias

people get to decide whether to be part of the study


observational study

researchers observe or measure characteristics but do not attempt to influence or modify them


experimental study

apply a treatment to some or all of the sample members and then look to see if it has any effects


case-control study

a combination of observational and experimental: we don't conduct the experiment, we just observe a natural one and then ask questions about it. The people are already engaging in a specific behavior


control group

group in experiment that does not receive treatment


treatment group

group in experiment that does receive treatment


placebo

given to control group (no active ingredient, sugar pill)


placebo effect

when people believe they are being treated and so feel (or believe) they are getting better; sometimes there is actual physical improvement


single-blind experiment

participants don't know if they are getting the placebo, but the experimenters do


double-blind

neither the participants nor the experimenters know who is getting the placebo; only a third party knows


margin of error

an attempt to measure and describe how far off our estimate might be. If we add and subtract the MOE from a sample stat, we get the confidence interval


confidence interval

where the true population parameter probably falls.
Ex. given 1.5% MOE, 34% of all women drivers... Take 34 + 1.5 and 34 - 1.5, which gives you a range of 32.5% to 35.5%. This range is the confidence interval
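The arithmetic in the example above, as a minimal Python sketch. The function name confidence_interval is a hypothetical helper, not a library call.

```python
def confidence_interval(sample_stat, margin_of_error):
    """Return (low, high) by subtracting and adding the margin of error."""
    return sample_stat - margin_of_error, sample_stat + margin_of_error

# Values from the example: a 34% sample result with a 1.5% margin of error.
low, high = confidence_interval(34.0, 1.5)
print(f"{low}% to {high}%")  # 32.5% to 35.5%
```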

Qualitative Data

qualities, characteristics, adjectives, descriptions, ex. months, colors


Quantitative Data

Quantities, numbers, amounts


Frequency

counts: not the answers themselves, just how many times each answer was given (turn tallies into frequencies)


Cumulative Frequency

a running total: add each frequency to the sum of all the frequencies before it


relative frequency

the percent of the total number in the sample: take the frequency, divide by the total, n, then move the decimal two places to the right to make a percent
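A short Python sketch that builds frequency, cumulative frequency, and relative frequency from raw answers. The survey data (number of siblings reported by 10 people) is made up for illustration.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical survey answers: number of siblings (made-up data).
responses = [0, 1, 1, 2, 0, 1, 3, 1, 2, 1]

frequency = Counter(responses)             # tally each answer
n = sum(frequency.values())                # total number in the sample

values = sorted(frequency)                 # answers in increasing order
counts = [frequency[v] for v in values]    # frequency of each answer
cumulative = list(accumulate(counts))      # running total of the frequencies
relative = [100 * c / n for c in counts]   # percent of the total

print("value  freq  cum  rel")
for v, f, c, r in zip(values, counts, cumulative, relative):
    print(f"{v:>5} {f:>5} {c:>4} {r:>4.0f}%")
```

The last cumulative frequency always equals n, and the relative frequencies add up to 100%.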


scatterplot

the graphical representation of, and visual test for, correlation


Correlation

a number from -1 to +1 that measures how strong a relationship is between two factors
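A sketch of how such a number can be computed, assuming the usual Pearson correlation coefficient (these notes do not name a specific formula). The data and the helper name pearson_r are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables move together around their means.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_movement / (spread_x * spread_y)

hours = [1, 2, 3, 4, 5]            # hypothetical study hours
scores = [52, 60, 71, 80, 88]      # rises as hours rise
absences = [10, 8, 6, 4, 2]        # falls by exactly 2 for each extra hour

print(pearson_r(hours, scores))    # close to +1: strong positive relationship
print(pearson_r(hours, absences))  # essentially -1: perfect negative relationship
```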


distribution

the values taken on by the variable and the frequency of these values


mean

the average: the sum of all values divided by the total number of values


median

the middle value (or average of two middle values)


mode

most common value


outlier

data value that is much higher or lower than almost all other values (can distort mean, but not median or mode)
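A quick Python check of mean, median, and mode using the standard statistics module, with one outlier added to made-up data to show which measures move.

```python
import statistics

values = [2, 3, 3, 3, 4]         # made-up data
with_outlier = values + [45]     # same data plus one extreme value

for label, data in [("original", values), ("with outlier", with_outlier)]:
    print(label,
          "mean:", statistics.mean(data),
          "median:", statistics.median(data),
          "mode:", statistics.mode(data))
```

In this data set the mean jumps from 3 to 10, while the median and mode both stay at 3.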


symmetric

left half is mirror image of right half


left skewed

values are more spread out on left side, distorts mean to the left


right skewed

values are more spread out on right side, distorts mean to the right
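A small Python sketch of how skew pulls the mean away from the median. Both data sets are invented for illustration.

```python
import statistics

right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]         # long tail on the right
left_skewed = [1, 17, 18, 18, 18, 19, 19, 20]    # long tail on the left

for label, data in [("right skewed", right_skewed), ("left skewed", left_skewed)]:
    print(label,
          "mean:", statistics.mean(data),
          "median:", statistics.median(data))
```

For the right-skewed data the mean lands above the median; for the left-skewed data it lands below.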


variation

describes how widely data values are spread out about the center of distribution


Peak

the mode on a graph


Bimodal

two equal peaks


5-number summary

min
lower quartile
median
upper quartile
max
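A sketch of computing the five values in Python. It assumes one common classroom convention: the quartiles are the medians of the lower and upper halves, leaving out the middle value when the count is odd. Other conventions exist and can give slightly different quartiles. The function name and data are hypothetical.

```python
import statistics

def five_number_summary(data):
    """Return (min, lower quartile, median, upper quartile, max).

    Assumes at least two values. Quartiles are medians of the lower and
    upper halves, excluding the middle value when the count is odd.
    """
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]            # values below the median position
    upper = values[half + n % 2:]    # values above it
    return (
        values[0],
        statistics.median(lower),
        statistics.median(values),
        statistics.median(upper),
        values[-1],
    )

print(five_number_summary([7, 1, 3, 9, 5, 11, 13]))  # (1, 3, 7, 11, 13)
```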

Variance

Total sum from the squared deviation column, divided by n - 1


Normal Distribution

Bell-Shaped Curve


Standard Deviation

a measure of how far data values are spread around the mean of a data set
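A step-by-step Python sketch of the variance and standard deviation entries, mirroring the deviation-column approach: find each value's deviation from the mean, square it, total the squares, and divide by n - 1 (the sample convention). The data is made up.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up values
n = len(data)
mean = sum(data) / n

# Deviation column: how far each value is from the mean.
deviations = [x - mean for x in data]

# Squared deviation column, then its total.
squared_deviations = [d ** 2 for d in deviations]
total = sum(squared_deviations)

variance = total / (n - 1)        # sample variance
std_dev = math.sqrt(variance)     # standard deviation is its square root

print("variance:", variance)
print("standard deviation:", std_dev)

# Cross-check against the standard library's sample formulas.
print(statistics.variance(data), statistics.stdev(data))
```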
