30 Cards in this Set

  • Front
  • Back

First things you should ask yourself before solving a statistics problem...

1. Who - the individuals, how many.


2. What - the variables, exact definitions of them, and the unit of measurement.


3. When


4. Where


5. Why - what purpose does the data have?

Categorical variable

Places an individual into one of several groups, or categories.

Quantitative variable

Takes numerical values for which arithmetic operations such as adding and averaging make sense (usually recorded with a unit of measurement).

Distribution

The distribution of a variable tells us what values it takes and how often it takes these values.

Distribution of a Categorical Variable

Lists the categories and gives either the count or the percent of individuals who fall into each category.
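
As a rough illustration of the card above, the sketch below tabulates a categorical variable as counts and percents. The `majors` data and the helper name `categorical_distribution` are made up for this example and are not part of the original cards.

```python
from collections import Counter

# Hypothetical data: the category each individual falls into.
majors = ["Biology", "Math", "Biology", "History", "Math", "Biology", "Art", "Math"]


def categorical_distribution(values):
    """Return {category: (count, percent)} for a categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {category: (n, 100 * n / total) for category, n in counts.items()}


dist = categorical_distribution(majors)
for category, (count, percent) in sorted(dist.items()):
    print(f"{category:8s} count={count}  percent={percent:.1f}%")
```

The percents always add up to 100 because every individual falls into exactly one category.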

Variability

Spread

Outlier

Falls outside the overall pattern

One way to describe the center of a distribution is by its...

Midpoint (order the observations, then cross them off from both ends, smallest and largest, until you reach the middle)

Skewed Right

Look at notes

Skewed Left

Look at notes

Time plot

A time plot of a variable plots each observation against the time at which it was measured.




*Always put time on the horizontal scale of your plot and the variable you are measuring on the vertical scale.

Trend

The overall pattern on a time plot: a long-term upward or downward movement over time.

Cross-sectional data

-A histogram displays

Most common measure of center

Mean

Mean

To find the mean of a set of observations, add their values and divide by the number of observations.
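
A minimal sketch of that recipe, using made-up observations (`scores` is a hypothetical data set, not from the cards):

```python
def mean(observations):
    """Add the values and divide by the number of observations."""
    if not observations:
        raise ValueError("mean requires at least one observation")
    return sum(observations) / len(observations)


scores = [4, 8, 15, 16, 23, 42]  # hypothetical observations
x_bar = mean(scores)
print(x_bar)  # 108 / 6 = 18.0
```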

Median

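The formal version of the midpoint: half the observations are smaller than the median and the other half are larger.

- Arrange the observations from smallest to largest. If the number of observations is odd, the median is the observation in the very middle; if it is even, the median is the mean of the two middle observations.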
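
The procedure can be sketched as follows. The card spells out only the odd case; this sketch assumes the usual convention that, for an even number of observations, the median is the mean of the two middle values.

```python
def median(observations):
    """Midpoint of the observations after sorting them."""
    ordered = sorted(observations)
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one observation")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single observation in the very middle.
        return ordered[mid]
    # Even count (assumed convention): average the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```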

Quartiles

The first and third quartiles mark out the middle half of the data.

First quartile

Q1: the value that marks off the first quarter (25%) of the ordered observations

Second quartile

50% (the median)

Third quartile

75%

To calculate the quartiles

1. Arrange the observations in increasing order and locate the median, M, in the ordered list of observations.




2. The first quartile, Q1, is the median of the observations whose position in the ordered list is to the left of the location of the overall median.




3. The third quartile, Q3, is the median of the observations whose position in the ordered list is to the right of the location of the overall median.
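
The three steps above can be sketched in code. This is one reading of the rule: when the number of observations is odd, the overall median is left out of both halves. The function names are illustrative only, and note that statistical software often uses slightly different quartile conventions.

```python
def _median_of_sorted(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(observations):
    """Return (Q1, M, Q3) using the median-of-halves rule."""
    ordered = sorted(observations)  # step 1: increasing order
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]  # positions left of the median
    # Positions right of the median; skip the middle value when n is odd.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return _median_of_sorted(lower), _median_of_sorted(ordered), _median_of_sorted(upper)


print(quartiles([13, 2, 9, 5, 7, 11, 3]))    # (3, 7, 11)
print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))   # (2.5, 4.5, 6.5)
```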

Five number summary

Consists of the smallest observation, the first quartile, the median, the third quartile, and the largest observation, written in order from smallest to largest.
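
A short sketch of the five-number summary, assuming the median-of-halves rule for the quartiles (the overall median is excluded from both halves when the count is odd). The data values are made up.

```python
from statistics import median


def five_number_summary(observations):
    """Return (min, Q1, M, Q3, max) in order from smallest to largest."""
    ordered = sorted(observations)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + (n % 2):]  # skip the middle value when n is odd
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


data = [13, 2, 9, 5, 7, 11, 3, 40]  # hypothetical observations
print(five_number_summary(data))  # (2, 4.0, 8.0, 12.0, 40)
```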

Box Plot

A box plot is a graph of the five-number summary.




- a central box spans the quartiles Q1 and Q3


- a line in the box marks the median M


- lines extend from the box out to the smallest and largest observations.




(best used for side by side comparison)


*Outliers are marked as dots

The interquartile range (IQR)

IQR = Q3 - Q1

*Used in the rule of thumb for identifying outliers

To identify outliers (the 1.5 x IQR rule)

An observation is a suspected outlier if it falls more than 1.5 x IQR above the third quartile or below the first quartile.

Lower cutoff: X = Q1 - (1.5 x IQR)

Upper cutoff: Y = Q3 + (1.5 x IQR)
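
The rule can be sketched as below, with the two cutoffs computed from Q1 and Q3. The quartiles here use the median-of-halves rule (an assumption about which quartile convention is intended), and the data are invented for the example.

```python
from statistics import median


def suspected_outliers(observations):
    """Apply the 1.5 x IQR rule; return (outliers, (lower_cutoff, upper_cutoff))."""
    ordered = sorted(observations)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[half + (n % 2):])  # skip the middle value when n is odd
    iqr = q3 - q1
    lower_cutoff = q1 - 1.5 * iqr
    upper_cutoff = q3 + 1.5 * iqr
    flagged = [x for x in ordered if x < lower_cutoff or x > upper_cutoff]
    return flagged, (lower_cutoff, upper_cutoff)


data = [13, 2, 9, 5, 7, 11, 3, 40]  # hypothetical observations
outliers, cutoffs = suspected_outliers(data)
print(outliers, cutoffs)  # [40] (-8.0, 24.0)
```

Here Q1 = 4 and Q3 = 12, so IQR = 8 and the cutoffs are 4 - 12 = -8 and 12 + 12 = 24; only 40 falls outside them.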

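Standard deviation (s) and Variance (s^2)

$$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$$

or more compactly

$$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

Standard deviation (s) is the square root of the variance (s^2).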
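
A minimal sketch of the calculation, assuming the usual sample definition: the variance is the sum of squared deviations from the mean divided by n - 1, and the standard deviation is its square root. The function names and data are illustrative.

```python
import math


def sample_variance(observations):
    """s^2: sum of squared deviations from the mean, divided by n - 1."""
    n = len(observations)
    if n < 2:
        raise ValueError("variance requires at least two observations")
    x_bar = sum(observations) / n
    return sum((x - x_bar) ** 2 for x in observations) / (n - 1)


def sample_std(observations):
    """s: the square root of the variance."""
    return math.sqrt(sample_variance(observations))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations, mean = 5
s_squared = sample_variance(data)  # 32 / 7
s = sample_std(data)
print(round(s_squared, 4), round(s, 4))
```

The standard library's `statistics.variance` and `statistics.stdev` use the same n - 1 divisor and can serve as a cross-check.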

Degrees of freedom

The number n - 1 is called the degrees of freedom of the variance or standard deviation.




The number of values in the final calculation of a statistic that are free to vary. More generally, the number of independent ways in which a dynamic system can move without violating any constraint imposed on it.

s = 0

No variability (every observation has the same value)

So far, we have a choice between two descriptions of the center and variability of a distribution....

The five-number summary




or




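The mean (x-bar) and the standard deviation (s)


*The mean and the standard deviation are both sensitive to extreme observations; the median and quartiles in the five-number summary are much less affected by them.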
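
One way to see sensitivity to extreme observations is to compute the summaries with and without a single extreme value. The data below are invented for the comparison; this is an illustration, not a result from the cards.

```python
from statistics import mean, median, stdev

base = [2, 3, 5, 7, 9, 11, 13]   # hypothetical observations
with_extreme = base + [400]      # same data plus one extreme observation

for label, data in (("base", base), ("with extreme", with_extreme)):
    print(
        f"{label:13s} mean={mean(data):7.2f}  s={stdev(data):7.2f}  "
        f"median={median(data):5.1f}"
    )
```

In this example the mean and standard deviation jump sharply when the extreme value is added, while the median moves only from 7 to 8.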