• Shuffle
    Toggle On
    Toggle Off
  • Alphabetize
    Toggle On
    Toggle Off
  • Front First
    Toggle On
    Toggle Off
  • Both Sides
    Toggle On
    Toggle Off
  • Read
    Toggle On
    Toggle Off
Reading...
Front

Card Range To Study

through

image

Play button

image

Play button

image

Progress

1/22

Click to flip

Use LEFT and RIGHT arrow keys to navigate between flashcards;

Use UP and DOWN arrow keys to flip the card;

H to show hint;

A reads text to speech;

22 Cards in this Set

  • Front
  • Back

The degree to which a statistical model represents the data collected

The fit of the model

A hypothetical value that can be calculated for any data set (not necessarily actually observed in the data set)

The mean value

The deviance between the observed data and the model

The error in the model


(the bigger the number, the poorer the fit of the model)

Calculate the magnitude of deviance

Subtract the mean value from each of the observed values.


(For example, in a problem where the mean amount of friends for a lecturer is 2.6. And lecturer 1 had only 1 friend, so the difference is x1 - x = 1 - 2.6 = - 1.6. Our model OVERESTIMATES his popularity, it predicts he will have 2.6 friends but he only has 1)

A good measure of how well the mean represents the data (how well our model fits the data)

Sum of squared error

Variance

Sum of squared error (SS) / sample size minus 1 (n - 1)

Standard deviation

Variance gives an answer in united squared (because each error was squared in the calculation)



SO, the square root of the variance is needed. This is called the Standard Deviation - a measure of how well the mean represents the data.

Standard deviation: Size

Small standard deviations (relative to the value of the mean itself) indicate that data points are close to the mean



A large standard deviation (relative to the mean) indicates that the data points are distant from the mean

If we're looking at how well a model fits the data, then we generallly look at deviation from the model, we look at the sum of squared error

deviation = sum (observed - model)2

A graph of how many times each score occurs



(A graph plotting values of observations on the horizontal axis, with a bar showing how many times each value ocurred in the data set)

Frequency Distribution / Histogram

The score that occurs most frequently in the data set

The mode

The normal distribution

Frequency Distribution: Symmetry

Skewness.


Positively skewed - the frequent scores are clustered at the lower end and the tail points towards the higher / more positive scores


Negatively skewed - the frequent scores are clustered at the higher end of and the tail points towards the lower, more negative scores

Frequency Distribution: Pointyness

Kurtosis: the degree to which scores cluster in the tails of the distribution.



Platykurtic - many scores in the tails, and is therefore flat



Leptokurtic - thin in the tails, so look quite pointy

Standard Deviation & Shape of Distribution

As the standard deviation gets larger, the distribution gets wider

Probability distributions

Statisticians have identified several common distributions. For each one, they have worked out mathematical formulae that specify idealised versions of these distributions, called Probability Distributions. From one of these distributions, it is possible to calculate the probability of getting particular scores based on the frequencies with which a particular score occurs in a distribution with these common shapes.

Z Scores

Any data set (even if it has a mean of 567 and an SD of 52.98!) can be converted into a data set that has a mean of 0 and a standard deviation of 1

Z Scores: Equation

First, to centre the data at zero, we take each score and subtract it from the mean of all scores. Then, we divide the resulting score by the standard deviation to ensure all data have a standard deviation of 1



z = score - (mean of scores) / SD

95% of z scores lie between

- 1.96 and 1.96

99% of z scores lie between

- 2.58 and 2.58

99.9% of z scores lie between

- 3.29 and 3.29

How well a particular sample represents the population

Standard Error