27 Cards in this Set
14. Central Tendency

is a statistical measure that identifies a single score as representative of an entire distribution or set of data.


14. Three measures of Central Tendency (CT)

Mean, Mode, Median


14. Mean=
1. The mean is the standard measure of central tendency in statistics. It is the most frequently used.
2. The mean is not necessarily equal to any score in the data set.
3. The mean is the most stable measure from sample to sample.
4. The mean is very influenced by outliers; that is, the mean will be strongly influenced by the presence of extreme scores.
• The average of a set of scores.
• The most commonly used measure of CT.
• Notation: the mean of a population is symbolized as μ; the mean of a sample is symbolized as x̄ (x with a bar on top).
• The mean = the sum of all the scores divided by the number of scores: ΣX/N.
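A minimal Python sketch (hypothetical scores, not from the lecture) of computing the mean as ΣX/N, and of how strongly one extreme score pulls it:

```python
# Mean = sum of all the scores divided by the number of scores (ΣX/N).
scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set

mean = sum(scores) / len(scores)
print(mean)  # 5.0

# One extreme score pulls the mean strongly (property 4 above):
with_outlier = scores + [100]
print(sum(with_outlier) / len(with_outlier))  # 140/9 ≈ 15.6
```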

14. Median
• The median is the score that divides the distribution exactly in half; that is, half the scores are below the median and half are above the median.
• It is the precise midpoint.
• Notation: Mdn
• The computation of the median depends on whether there are an odd or an even number of observations. It also depends upon whether there are duplicate observations (e.g., 1 2 2 2 3 3 4).
1. If N is odd and there are no duplicates, then the median is the score that falls exactly in the middle of the scores (once you have ordered them).
2. If N is even and there are no duplicates, then the median is the average of the two middle scores. Again, you need to order them first.
3. If there are duplications of scores, then you need to use the median formula as discussed in class.
• The median is not sensitive to outliers.
• The median is the best measure of central tendency if the distribution is skewed.
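As a sketch of the odd-N and even-N cases (hypothetical data), Python's `statistics.median` applies the simple middle-score rules; note it does not use the interpolated "median formula" for duplicates discussed in class:

```python
import statistics

odd = [1, 3, 5, 7, 9]    # N odd: the middle score
even = [1, 3, 5, 7]      # N even: average of the two middle scores

print(statistics.median(odd))   # 5
print(statistics.median(even))  # 4.0

# The median is not sensitive to outliers:
print(statistics.median([1, 3, 5, 7, 900]))  # 5
```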

14. Mode=
• The mode is the most frequent score in a distribution.
• It is the "typical" value.
• In a frequency graph you can immediately see what the mode is, because it is the tallest bar, i.e., the score with the highest frequency.
• Notation: Mo
• The mode is the least stable measure from sample to sample.
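A one-line sketch with hypothetical frequency data; `statistics.mode` returns the most frequent score:

```python
import statistics

scores = [1, 2, 2, 3, 3, 3, 4]   # hypothetical data; 3 occurs most often
print(statistics.mode(scores))   # 3
```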

15. Range

Range=
• R = URL of highest score − LRL of lowest score (upper real limit minus lower real limit)
• Problem – the range is determined by only two scores
 o Not sensitive to the distribution
 o Very sensitive to outliers
• Getting an average deviation
 o Using absolute values
 o Using the sum of squares, SS (formulas given in class)
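A sketch with hypothetical scores; for simplicity this uses the plain max − min version of the range rather than the real-limits (URL − LRL) definition on the card:

```python
scores = [3, 5, 8, 10, 12]      # hypothetical data

r = max(scores) - min(scores)   # simple range, ignoring real limits
print(r)  # 9

# Determined by only two scores, so a single outlier changes it drastically:
print(max(scores + [100]) - min(scores + [100]))  # 97
```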

15. Variance
• This is probably the most important statistical idea in this class.
Variance – mean squared deviation (s² or σ²)


15. Standard deviation = square root of the variance (s or σ)

Standard deviation =
the "typical" distance of a set of scores from the mean.
• Computing SS, variance, and st. dev. – to be discussed in class.
 o Computational vs. definitional formula
 o Variance and st. dev. can only be computed for interval and ratio data, not nominal or ordinal.
 o Variance and st. dev. are always greater than or equal to zero.
• Properties of the standard deviation:
 1. It describes how variable or spread out the scores are.
 2. It is the typical or approximate average distance from the mean.
 3. If it is small, then scores are clustered close to the mean; if it is large, they are scattered far from the mean.
 4. It is very influenced by extreme scores.
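A sketch of the definitional formula with hypothetical data, treating the scores as a whole population (σ² = SS/N); a sample estimate would divide SS by df = n − 1 instead:

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical population of N = 8 scores
n = len(scores)
mean = sum(scores) / n              # 5.0

ss = sum((x - mean) ** 2 for x in scores)  # SS: sum of squared deviations
variance = ss / n                          # population variance, SS/N
sd = math.sqrt(variance)                   # standard deviation

print(variance, sd)  # 4.0 2.0
```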

15. Measures of Variability

Variance, Range, Standard Deviation


16. Central Limit Theorem

• The Central Limit Theorem states that for any population with mean = μ and standard deviation = σ, the distribution of sample means from samples of size n will be approximately normally distributed with mean = μ and standard deviation = σ/√n as n goes to infinity.
• Note: the CLT holds best if either 1) the original population has a normal distribution or 2) n > 30.
• What does this all mean?
 o No matter what the shape of the original distribution, the distribution of sample means taken from that population will be normally distributed.
 o μx̄ = μ
 o σx̄ = σ/√n
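A small simulation sketch (hypothetical setup): starting from a decidedly non-normal uniform population, the mean of many sample means lands near μ and their spread lands near σ/√n:

```python
import random
import statistics

random.seed(42)  # reproducible run

# A non-normal population: uniform over the digits 0..9.
population = list(range(10))
pop_mean = statistics.mean(population)   # 4.5
pop_sd = statistics.pstdev(population)   # about 2.87

n = 30
sample_means = [statistics.mean(random.choices(population, k=n))
                for _ in range(5000)]

# Mean of the sample means approximates mu; their st. dev. approximates sigma/sqrt(n).
print(statistics.mean(sample_means), pop_mean)
print(statistics.stdev(sample_means), pop_sd / n ** 0.5)
```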

Type I error
o When you reject the null hypothesis when it is actually true.
o You say the treatment has an effect, when in reality it doesn't.
• Type I errors are considered to be worse than Type II errors.
• The probability of a Type I error is equal to our alpha level.

Type II error

o When you fail to reject the null hypothesis when it is actually false.
o You say that there is no effect of the treatment, when in reality there is.

Null hypothesis

o States that there is NO treatment effect. In other words, the new population is the same as the old.
o Ho: μ(treatment group) = μ(original population)
o E.g., Ho: μ(stimulated babies) = μ

Alternative Hypothesis

o States that the treatment DID have an effect. In other words, the two populations are different (nondirectional alternative hypothesis).
o H1: μ(treatment group) ≠ μ(original population)
o E.g., H1: μ(stimulated babies) ≠ μ
o There are also two possible directional alternative hypotheses:
 H1: μ(stimulated babies) < μ
 H1: μ(stimulated babies) > μ
o In class, you should always use the nondirectional alternative hypothesis.

Power (definition)

the ability of the test to correctly reject the null hypothesis when the null hypothesis is false. It is the sensitivity of the test to find real differences when they exist.
o So power is related to Type II error – if the test is not “powerful”, you are more likely to make a Type II error. 

Power (4 factors)

1. Size of the treatment effect – more extreme treatments make for more powerful tests.
2. Alpha level – the bigger the alpha level, the more power.
3. One- vs. two-tailed tests – one-tailed tests are more powerful.
4. Sample size – the larger the sample size, the greater the power.
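The sample-size factor can be sketched by simulation (a hypothetical two-tailed z-test with σ = 1 and a medium treatment effect, not a procedure from the lecture): the same effect is detected far more often with a larger n.

```python
import math
import random
import statistics

random.seed(7)  # reproducible run

def simulated_power(n, effect, crit=1.96, sims=2000):
    """Fraction of simulated two-tailed z-tests (sigma = 1) that reject H0."""
    rejections = 0
    for _ in range(sims):
        sample = [random.gauss(effect, 1) for _ in range(n)]
        z = statistics.mean(sample) / (1 / math.sqrt(n))  # (xbar - 0) / (sigma/sqrt(n))
        if abs(z) > crit:
            rejections += 1
    return rejections / sims

p_small = simulated_power(10, 0.5)
p_large = simulated_power(50, 0.5)
print(p_small, p_large)  # larger sample size -> greater power
```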

Between vs. Within subjects designs

•Between subjects – the participant is in one and only one level of the IV, so the data in one group is completely independent of the data in the other group
• Within subjects – the participant is in every level of the IV, so the scores from one level of the IV are related to the other level.
• These two designs require different statistical procedures. Ch. 10 focuses on the between subjects situation. In Ch. 11, we will cover within subjects designs.

Assumptions of the independent groups t-test

1. Normality
2. Independence of observations
3. Homogeneity of variance – the two populations from which we have sampled have equal variances.
 • σ1² = σ2² = σ²
 • So we can drop the subscript and simply call it σ².
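A sketch of the pooled-variance independent groups t with hypothetical data; pooling the two SS values into one variance estimate is exactly where the homogeneity assumption is used:

```python
import math
import statistics

group1 = [4, 5, 6, 7, 8]   # hypothetical treatment scores
group2 = [1, 2, 3, 4, 5]   # hypothetical control scores

n1, n2 = len(group1), len(group2)
m1, m2 = statistics.mean(group1), statistics.mean(group2)
ss1 = sum((x - m1) ** 2 for x in group1)
ss2 = sum((x - m2) ** 2 for x in group2)

sp2 = (ss1 + ss2) / (n1 + n2 - 2)      # pooled variance (equal-variance assumption)
se = math.sqrt(sp2 / n1 + sp2 / n2)    # estimated standard error of the difference
t = (m1 - m2) / se

print(t)  # 3.0, evaluated with df = n1 + n2 - 2 = 8
```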

Advantages for a within subjects design (compared to between subjects design.)

• Error variability is reduced so the power of the test is increased
• Can control for potentially confounding variables.
• Is more economical in terms of the number of participants needed.
• Note: the first two advantages also apply to a matched-pairs design.

Carryover effects

occurs when being in one treatment causes changes in the second treatment.
o Three examples: learning, boredom, subject bias caused by guessing the hypothesis.
o Remedy for carryover effects – counterbalancing (I will discuss this more later).

Counterbalancing – is a way to remove the problem of carryover effects

• Carryover effects can be thought of as “order effects”, such that the order of administration has an effect on the DV. In effect, order becomes a confounding variable.
• With counterbalancing, half the participants get the treatment in one order, and half get the other order. o This removes the confounding variable of “order”, by spreading the effect evenly over the two conditions. o But, then order becomes a disturbance variable, adding extra unsystematic variability to the study. • A disturbance variable is one that can cause changes in the DV but does not vary systematically with the IV. o Disturbance variables make it more difficult to find a significant effect (incr. Type II error) o It is best to keep all disturbance variables constant when possible. • It is better to have a disturbance variable than a confounding variable. 

Assumptions of ANOVA

Assumptions of ANOVA – the assumptions are the same as for an independent groups t-test:
• Independence of observations
• Normality
• Homogeneity of variance – the populations from which we have sampled have equal variances.

Comparison of ttest and ztest

• So far with hypothesis testing, we have been using z-tests, for which we must know σ. However, most often we don't know enough about the population and don't know exactly what σ is.
• If we don't know σ, we can estimate it using our sample data by computing s.
• Recall: s = √(SS/df) = √(SS/(N − 1))
• So, if we have an estimate of σ, then we can estimate σx̄ using sx̄. sx̄ is called the estimated standard error of the mean and can be used in place of σx̄ when σ is not known: sx̄ = s/√n
• Using the estimated standard error, we can compute t. But now, since σ is estimated, the result is no longer exact: t = (x̄ − μ)/sx̄
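A numerical sketch of these steps with hypothetical sample data: compute s from SS/df, then the estimated standard error, then t.

```python
import math
import statistics

sample = [10, 12, 9, 11, 13, 8, 12, 11, 10, 14]  # hypothetical scores
mu = 10                                          # population mean under H0

n = len(sample)
xbar = statistics.mean(sample)            # 11.0
ss = sum((x - xbar) ** 2 for x in sample) # sum of squared deviations
s = math.sqrt(ss / (n - 1))               # s = sqrt(SS / df)
sx = s / math.sqrt(n)                     # estimated standard error of the mean
t = (xbar - mu) / sx

print(round(t, 3))  # 1.732, compared against the t-distribution with df = 9
```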

Characteristics of the tstatistic
• It is used to test hypotheses about μ when σ is not known. • It is sort of a substitute for z, with the major difference being that t uses an estimate of the standard error in the denominator that is based on the sample data. 
• How well does t approximate z?
 o This depends on the sample size – the larger the sample, the better s estimates the population σ.
 o So, the larger the df, the closer t will be to z.
• What does the t-distribution look like compared to the z-distribution?
 o It is bell-shaped.
 o It has a mean of 0.
 o Its shape depends on df. In general, the t-distribution is more variable than the z-distribution. The fewer the df, the flatter and more spread out the t-distribution is and the fatter its tails are. If there are many df (over 30 or 60 or so), then the t-distribution looks almost exactly like the standard normal distribution. If df = infinity, the two distributions are identical.

• Why is the t-distribution more spread out than the z-distribution?

o In z – the only value that changes from sample to sample is the sample mean.
o In t – the sample mean also varies, but the estimated standard error does too, as it is based on sample data.
o So, for t, there is extra variability in both the numerator and the denominator, leading to a more variable distribution overall.
o As n gets larger, the estimate of the standard error becomes more accurate, decreasing the variability in the denominator and making t more like z.

How do you calculate the probability of obtaining a given score in a normal distribution?

• Write out the probability equation.
• Draw a picture – draw a normal distribution, mark the mean, and shade in the area that corresponds to the question being asked.
• Calculate the z-score(s) and look up the probability values.
• We will go through some examples in class.
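As a sketch of the last step with hypothetical numbers (μ = 100, σ = 15, like IQ-style scores): compute the z-score, then get the normal CDF value that a z-table would give, here via the standard `math.erf` identity.

```python
import math

def prob_below(x, mu, sigma):
    """P(X < x) for a normal distribution: z-score, then the normal CDF via erf."""
    z = (x - mu) / sigma
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical example: P(score < 115) when mu = 100, sigma = 15, so z = 1.
print(round(prob_below(115, 100, 15), 4))  # 0.8413, the familiar z-table value
```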

2 ways to describe the shape of a distribution

o Skewness – the degree to which the distribution departs from being symmetrical.
 Positively skewed – tail extends in the positive direction.
 Negatively skewed – tail extends in the negative direction.