82 Cards in this Set

  • Front
  • Back
  • 3rd side (hint)

Design

Plan how to obtain the data

Description

Summarize the data with graphs and numerical summaries

Inference

Use data from a random and representative sample to draw conclusions about the population of interest

Parameter

Numerical summary of a population

Statistic

Numerical summary of a sample

Subjects

Persons, animals, or objects in our study/experiment

Variables

The characteristics that we measure on each subject

Population

All subjects of interest

Sample

Subjects for whom we have data

Random Sampling

Each member of the population has the same chance of being included in the sample (so the sample is representative of the population)

Categorical Variables

Summarize with counts and percentages


Graphs: bar charts and pie charts

Quantitative Variables

Takes on numerical values


Graphs: dotplots, histograms, stemplots, and box plots


Measures of center: mean, median, mode


Measures of spread: range, IQR, standard deviation

Discrete Quantitative

Takes on only a finite list of possible values, such as a count (ex: year, absences)

Continuous Quantitative

Has an infinite number of possible values that form an interval (ex: exam scores)

Dotplot

Back (Definition)

Stemplot

Back (Definition)

Histogram

Back (Definition)

Boxplot

Back (Definition)

Bar Chart

Back (Definition)

Pie Chart

Back (Definition)

Histogram Shapes

Back (Definition)

Mean

The average of all the observations

Median

Observation “right in the middle” of the ordered data: M

Mode

Most frequently occurring value

Range

Range = maximum - minimum

Variance

Average squared deviation from the mean: s^2

Standard Deviation

Square root of the variance: s
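
The measures of center and spread on the cards above can be checked on a small data set. This is a minimal sketch using Python's standard `statistics` module; the data values are hypothetical, and it assumes the sample variance s^2 divides the summed squared deviations by n - 1 (the usual convention for a sample).

```python
import statistics

# Hypothetical sample, chosen only for illustration.
data = [2, 4, 4, 5, 7, 8]

mean = statistics.mean(data)          # average of all the observations
median = statistics.median(data)      # middle of the ordered data
mode = statistics.mode(data)          # most frequently occurring value
data_range = max(data) - min(data)    # maximum - minimum
variance = statistics.variance(data)  # sample variance s^2 (divides by n - 1)
std_dev = statistics.stdev(data)      # s = square root of the variance

print(mean, median, mode, data_range, variance, std_dev)
```

Here the squared deviations from the mean of 5 sum to 24, so s^2 = 24 / 5 = 4.8 and s is about 2.19.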

Empirical Rule

In any bell-shaped and symmetric distribution you will find approx:


-68% of the observations within one sdev of the mean


-95% of the observations within 2 sdev of the mean


-99.7% of the observations within 3 sdev of the mean
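
For an exactly normal distribution, the three percentages in the empirical rule can be reproduced from the error function. The sketch below assumes a normal model (real bell-shaped data will only match approximately); `prob_within` is a helper name introduced here for illustration.

```python
import math

def prob_within(k: float) -> float:
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sdev of the mean: {prob_within(k):.4f}")
```

The three values round to 68%, 95%, and 99.7%.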

Quartiles

Divide the data set into four quarters

IQR

Measures the spread of the central 50% of the data


IQR = Q3 - Q1

Five number summary

Minimum, Q1, median, Q3, maximum
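
A quick way to see the quartiles, IQR, and five number summary together is to compute them on a small data set. The data below are hypothetical. Textbooks and software use slightly different quartile conventions, so this sketch, which uses the `inclusive` method of `statistics.quantiles`, may differ from a hand calculation that takes the median of each half.

```python
import statistics

# Hypothetical data set, sorted for readability.
data = sorted([7, 1, 15, 4, 9, 3, 20, 6, 12])

# Cut points that divide the data into four quarters.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1  # spread of the central 50% of the data
five_number_summary = (min(data), q1, median, q3, max(data))

print(five_number_summary, iqr)
```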

Z-Score

Fill in….

Scatterplot

Back (Definition)

Explanatory Variable

X axis

Response Variable

Y axis

DOTS (Scatterplot)

Direction


Outliers


Trend


Strength

Correlation

-The direction and strength of the straight-line (linear) relationship between x and y


-Represented by r


-r is always between -1 and 1 (no units)


-Interpretation: strong/weak, positive/negative


-Outliers can have a strong effect on r
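
The correlation r can be computed directly from the deviations of x and y from their means. This is a minimal sketch with hypothetical data; `correlation` is a helper written for this example, not a library function.

```python
import math

def correlation(xs, ys):
    """Correlation r between paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
print(round(r, 4))
```

For these values r is about 0.77, a fairly strong positive association. Because r has no units, rescaling x or y (for example, inches to centimeters) leaves it unchanged.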

Regression Equation

y(hat) = a+bx

B (slope of regression line)

Average change in y for one unit change in x

A (y-intercept of regression line)

Y-intercept: expected value of y when x=0. Only interpret if x=0 makes sense and is close to values of x observed in data

Residuals

Prediction errors for each observation


Residual = y (observed y) - y(hat) (predicted y)

Least Squares Method

Finds the line that minimizes the sum of the squared residuals

R^2

R^2= (r)^2


Proportion of the variability in y that is explained by the regression on x
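
The regression cards above fit together in one small calculation: the least squares slope b and intercept a, the residuals y - y(hat), and R^2. The sketch below uses hypothetical data and the standard least squares formulas b = Sxy / Sxx and a = mean(y) - b * mean(x); the variable names are illustrative.

```python
# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

b = sxy / sxx            # slope: average change in y per one-unit change in x
a = mean_y - b * mean_x  # y-intercept: predicted y when x = 0

predicted = [a + b * x for x in xs]                 # y(hat) = a + bx
residuals = [y - yh for y, yh in zip(ys, predicted)]  # observed - predicted

sum_sq_residuals = sum(e ** 2 for e in residuals)   # the quantity least squares minimizes
r_squared = 1 - sum_sq_residuals / syy              # proportion of variability explained

print(a, b, r_squared)
```

For these values the fitted line is y(hat) = 2.2 + 0.6x and R^2 = 0.6, which matches r^2 = 36/60 computed from the same sums.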

Cautions

Influential Outlier: points that have an x value far away from the rest


Correlation (or Association): does not imply causation


Extrapolation: using the regression line to predict y for x values outside the observed range, which assumes the existing trend continues and may not be justified


Simpson’s Paradox: a lurking variable can reverse the association between two categorical variables in a contingency table

Contingency Tables

-Both explanatory and response variables are categorical


-Display counts (frequencies) on the table


-Compute % to determine association
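
As a rough illustration of "compute % to determine association", the sketch below turns the counts in each row of a contingency table into percentages. The table, its category names, and the `row_percentages` helper are all hypothetical.

```python
# Hypothetical counts: rows are the explanatory variable, columns the response.
table = {
    "treatment": {"improved": 30, "not improved": 20},
    "placebo": {"improved": 15, "not improved": 35},
}

def row_percentages(counts_by_row):
    """Convert each row of counts into percentages of that row's total."""
    result = {}
    for row, counts in counts_by_row.items():
        total = sum(counts.values())
        result[row] = {col: 100 * count / total for col, count in counts.items()}
    return result

percentages = row_percentages(table)
for row, pcts in percentages.items():
    print(row, pcts)
```

If the row percentages differ noticeably between groups (here 60% improved versus 30%), the two variables appear associated; if they were roughly equal, they would not.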

Experiments

The researcher assigns subjects to certain experimental treatments

Observational studies

The researcher does nothing to the subjects except observe x and y; no treatments are assigned

Experimental unit

Subjects involved in the experiment

Treatments

Experimental conditions given to the subjects

Random Phenomenon

Individual outcomes are unpredictable, but a distinct predictable pattern emerges after many outcomes

Probability

Long-run relative frequency
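
The idea of probability as a long-run relative frequency can be seen in a simulation. This sketch flips a simulated fair coin; the fixed seed and the function name are choices made for this example only.

```python
import random

def relative_frequency_of_heads(num_flips: int, seed: int = 0) -> float:
    """Proportion of heads in num_flips simulated fair coin flips."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads / num_flips

for num_flips in (10, 1_000, 100_000):
    print(num_flips, relative_frequency_of_heads(num_flips))
```

With few flips the proportion can be far from 0.5, but it settles near 0.5 as the number of flips grows.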

Independent Trials

The outcome of one trial is not affected by the outcome of another

Sample Space

The set of all possible outcomes

Event

An outcome or group of outcomes; a subset of the sample space

Biased samples

Systematically favor certain outcomes, not representative of the population of interest

Margin of error

Approximately 1/square root of n, where n is the sample size
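
The 1/square root of n approximation shows how slowly the margin of error shrinks as the sample grows. A small sketch, with hypothetical sample sizes and an illustrative function name:

```python
import math

def approx_margin_of_error(n: int) -> float:
    """Approximate margin of error for a sample of size n."""
    return 1 / math.sqrt(n)

for n in (100, 400, 1600):
    print(n, f"{approx_margin_of_error(n):.1%}")
```

Quadrupling the sample size only halves the margin of error: about 10% at n = 100, 5% at n = 400, and 2.5% at n = 1600.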

Placebo

Dummy treatment

Blind study

Subjects do not know if they receive treatment or placebo

Double blind study

Neither the subjects nor those in contact with the subjects know who gets the treatment or the placebo

Randomization

Use a mechanical (chance-based) method to select subjects and assign them to treatments

Replication

Number of experimental units that get each treatment

Cross-sectional studies

Sample surveys that just want to take a snapshot of the population at the current time

Case-control studies

Retrospective studies (backward looking) in which we match each case (positive outcome) with a control (negative outcome) and then ask questions about the explanatory variable

Prospective studies

Forward-looking studies that follow subjects into the future

Complement of an event

The rest of the sample space, written as A^c


P(A^c)=1-P(A)

Disjoint events A and B

P(A or B)= P(A) + P(B)

Conditional probability

P(A|B) = P(A and B) / P(B)

Independent events A and B

If two events are independent, knowledge about one event tells us nothing about the other event


Definition: P(A|B) = P(A)


Multiplication rule: P(A and B) = P(A) x P(B)
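
The complement, disjoint, conditional, and independence rules can all be checked on one roll of a fair six-sided die. This is a sketch with events chosen for illustration; it assumes equally likely outcomes, and `prob` is a helper defined here.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def prob(event):
    """Probability of an event (a subset of the sample space), outcomes equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}     # roll is even
B = {1, 2, 3, 4}  # roll is at most 4
C = {1}           # roll is 1
D = {2}           # roll is 2 (C and D are disjoint)

p_not_a = 1 - prob(A)                          # complement: P(A^c) = 1 - P(A)
p_c_or_d = prob(C) + prob(D)                   # disjoint: P(C or D) = P(C) + P(D)
p_a_given_b = prob(A & B) / prob(B)            # conditional: P(A|B) = P(A and B) / P(B)
independent = prob(A & B) == prob(A) * prob(B)  # multiplication rule holds?

print(p_not_a, p_c_or_d, p_a_given_b, independent)
```

Here P(A|B) equals P(A) = 1/2, so knowing the roll is at most 4 tells us nothing about whether it is even: A and B are independent.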

Discrete Random Variables

Finite number of possible values


Prob distribution: list, graph or formula with all possible values of X and their probabilities


Population mean: μ (mu) = sum of x · P(x) over all possible values x
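
The population mean of a discrete random variable is a probability-weighted sum. The sketch below uses a hypothetical distribution: X is the number of heads in two flips of a fair coin.

```python
# Probability distribution as a list of possible values and their probabilities.
distribution = {0: 0.25, 1: 0.50, 2: 0.25}

total_probability = sum(distribution.values())  # should equal 1

# Population mean: sum of x * P(x) over all possible values.
mu = sum(x * p for x, p in distribution.items())

print(total_probability, mu)
```

The mean is 1 head, even though individual trials give 0, 1, or 2.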

Continuous Random Variables

-Infinite number of possible values


-Probabilities are areas under a density curve (smooth) with a total area of 1


-Assign probabilities to intervals, not individual values of X

Normal Probability Distributions

- Bell-shaped curves, indexed by their mean and standard deviation


-Follows empirical rule
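
Probabilities for a normal distribution are areas under its curve, so they are found for intervals rather than single values. A minimal sketch using `statistics.NormalDist` from the Python standard library; the mean of 100 and standard deviation of 15 are hypothetical.

```python
from statistics import NormalDist

# Hypothetical normal distribution indexed by its mean and standard deviation.
scores = NormalDist(mu=100, sigma=15)

# Area between 85 and 115 (within one standard deviation of the mean).
p_between = scores.cdf(115) - scores.cdf(85)

# Area above 130 (more than two standard deviations above the mean).
p_above_130 = 1 - scores.cdf(130)

print(round(p_between, 4), round(p_above_130, 4))
```

The first area is about 0.68, in line with the empirical rule, and the second is about 0.023.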

Binomial distribution

-Each of n trials can have two possible outcomes: success or failure


-Probability of success is the same for each trial, p, and the trials are independent


-Binomial Random Variable X counts the number of successes


Mean: μ (mu) = np


Standard deviation: square root of np(1-p)
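
The binomial probabilities, mean, and standard deviation can be computed from n and p. This sketch uses the standard binomial formula P(X = k) = C(n, k) p^k (1 - p)^(n - k), which the card does not state; n = 10 and p = 0.5 are hypothetical values, and `binomial_pmf` is a helper written for the example.

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k): probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.5  # hypothetical: 10 flips of a fair coin

mean = n * p                          # mu = np
std_dev = math.sqrt(n * p * (1 - p))  # square root of np(1 - p)
probabilities = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(mean, round(std_dev, 4), round(binomial_pmf(5, n, p), 4))
```

The probabilities over k = 0, 1, ..., n add up to 1, and the most likely count here is the mean, 5 successes.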

P(at least one)

1-P(none)
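
When the trials are independent, P(none) is the product of the individual failure probabilities, which makes "at least one" easy to compute. A sketch under that independence assumption; the example (at least one six in four rolls of a fair die) and the function name are illustrative.

```python
from fractions import Fraction

def prob_at_least_one(p, n):
    """P(at least one success in n independent trials) = 1 - P(none)."""
    p_none = (1 - p) ** n  # every trial fails
    return 1 - p_none

# Hypothetical example: at least one six in four rolls of a fair die.
p_six = prob_at_least_one(Fraction(1, 6), 4)

print(p_six, float(p_six))
```

Here P(none) = (5/6)^4 = 625/1296, so the probability of at least one six is 671/1296, a little over one half.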