131 Cards in this Set
- Front
- Back
Statistics
|
Methods of measuring variables and organizing and analyzing data.
|
|
2 Types of Statistics
|
Descriptive: Describe set of data
Inferential: make inferences about entire pop from sample data |
|
4 Scales of measurement
|
Nominal, ordinal, interval, ratio
|
|
Nominal Data
|
Unordered categories in which data may fall: gender, dx
|
|
Ordinal Data
|
Provides info about the ordering of categories, but no indication of how much more or less one category is in relation to the other - ranks, surveys
|
|
Interval Data
|
Numbers are scaled at equal distances, but the scale has no absolute zero point. Addition and subtraction can be performed, but mult and division cannot. Ex. IQ and temp - you can't say twice as smart
|
|
Ratio Data
|
Identical to interval but has an absolute zero point and can be mult/div. Ex. money, time, distance, height, wt, freq of bx per hour
|
|
3 Types of Descriptive Statistics
|
1. Frequency Distributions
2. Measures of Central Tendency
3. Measures of Variability |
|
Frequency Distributions
|
Summary of data - indicates the number of cases that fall w/in a score/range.
Graphically displayed as a table, polygon, or bar graph (histogram)
Cumulative Frequency (cf): total number of obs that fall at or below a category or score. |
|
Histogram (bar graph)
|
X-axis (abscissa): Scores/categories horiz
Y-axis (ordinate): Frequency or occurrence - vertical |
|
Normal Distribution
|
Bell shaped - half below/above mean. large pop. greatest number fall at/close to mean - ht, wt, IQ, SAT scores, musical ability
|
|
Skewed Distribution
|
Negatively skewed: Easy test - most scores above the mean
Positively skewed: Hard test - most scores below the mean
Named for the direction of the tail: +/- |
|
Measures of Central Tendency
|
Single numbers that provide info on set of data
1. Measures of central tendency (mean, mode, median)
2. Measures of dispersion (variability) around the average |
|
Mean
|
Average of scores and most useful in advanced stats, but sensitive to extreme values (skewed data) and can be misleading.
Symbols: M or X̄ (X-bar) |
|
Median
|
Middle value of data when ordered from low to high. Symbol: Md. Odd N: the middle number; even N: the mean of the 2 middle numbers.
Less sensitive to extreme values (skewed data) |
|
Mode
|
Most frequent value, can be more than one (multimodal or bimodal)
|
|
Relationship bw mean, median, mode
|
Depends on distribution shape.
Normal: all equal
Positively skewed: mean greater than median, median greater than mode
Negatively skewed: mode greater than median, median greater than mean
Mean is sensitive to extreme scores and will be pulled toward the tail |
|
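The mean/median/mode ordering above can be checked with Python's statistics module. A minimal sketch, using made-up positively skewed scores (not values from the cards):

```python
import statistics

# Hypothetical positively skewed data: one extreme high score pulls the mean
# toward the right tail
scores = [2, 3, 3, 4, 5, 6, 14]

mean = statistics.mean(scores)      # pulled toward the tail
median = statistics.median(scores)  # middle value, less sensitive to extremes
mode = statistics.mode(scores)      # most frequent value

# Positively skewed distribution: mode < median < mean
print(mode, median, mean)  # 3 4 5.285714285714286
```

Swapping the extreme score to the low end would reverse the ordering, as the negatively skewed card describes.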
Measures of Variability: 3 Types
|
How spread out scores are.
1. Range
2. Variance
3. Standard deviation |
|
Range
|
Difference bw highest and lowest score (scores 1 to 3: range = 2)
General description only; limited by extreme scores, no info on distribution freq |
|
Variance
|
1. Measure of how scores disperse around the mean
2. Measure of variability of a distribution
3. Measure of variability that many statistical tests use in their formulas
4. Equal to the square of the standard deviation |
|
Standard Deviation
|
Square root of the variance - the expected deviation from the mean of a score chosen at random (+/-).
Higher sd = scores deviate more from the mean
Normal distrib: used to calc the % of scores that fall in a range, +/- cutoff scores |
|
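The variance/standard deviation relationship (sd = square root of variance) can be sketched with the statistics module. The data set here is illustrative, not from the cards:

```python
import statistics

# Hypothetical population of 8 scores (mean = 5)
scores = [2, 4, 4, 4, 5, 5, 7, 9]

var = statistics.pvariance(scores)  # population variance: mean squared deviation
sd = statistics.pstdev(scores)      # standard deviation = square root of variance

print(var, sd)  # 4.0 2.0
```

Note the module distinguishes population (`pvariance`/`pstdev`) from sample (`variance`/`stdev`) formulas; the sample versions divide by N-1.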
Transformed scores
|
Transform raw scores to another unit of measurement - to increase interpretability of raw scores and compare to scores in rest of distrib
z-scores, T-scores, stanines, percentile ranks |
|
Z-scores
|
Standard scores - raw scores stated in standard deviation terms.
Subtract the mean from the score and divide by the standard deviation. Measure of how many sd a raw score is from the mean |
|
Z-scores
|
Adv: Permit comparisons across diff measures and tests
Shape of distrib doesn't change - linear transformation |
|
T-scores
|
Based on 10-pt intervals with a mean of 50; every 10 pts is 1 sd
z = +1.0 → T = 60 |
|
Stanine
|
9 equal intervals 1-9. Mean of 5 and sd of 2
|
|
Percentile Ranks
|
% of indiv scoring below you
Flat distrib: each rank has the same number of scores
Changes shape of distrib: nonlinear transformation |
|
The Standard Deviation Curve
|
1) Normal distrib: 68% of z-scores fall bw -1 and +1
2) Normal distrib: 95% fall bw z = -2 and +2
3) Normal distrib: z-score of +1 = percentile rank of 84, the cutoff for the top 16%; z = -1 is PR 16 and the cutoff for the bottom 16%
4) z = +2 = PR 98 and cutoff for top 2%
If a distrib is normal and you know its mean and sd, you can determine how many people fall in a range |
|
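The 68/95 percentages and the PR cutoffs on this card can be reproduced from the normal curve itself using `math.erf` (a standard closed form for the normal CDF); a sketch:

```python
import math

def normal_cdf(z):
    """Proportion of a normal distribution falling at or below a given z-score."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Proportion between -1 and +1 sd: about 68%
print(round(normal_cdf(1) - normal_cdf(-1), 2))  # 0.68
# z = +1 -> percentile rank of about 84 (cutoff for the top 16%)
print(round(normal_cdf(1) * 100))                # 84
# z = +2 -> percentile rank of about 98 (cutoff for the top 2%)
print(round(normal_cdf(2) * 100))                # 98
```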
Percentile Rank and raw scores in normal distrib
|
Greater range of PRs in the middle of the distrib than at either extreme - at the extremes there are fewer people to jump over.
|
|
Z-score Formula
|
z = (X - M) / sd, where X = raw score, M = mean, sd = standard deviation
|
|
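The z-score formula above is a one-liner in code; this sketch uses the IQ card's values (mean 100, sd 15):

```python
def z_score(raw, mean, sd):
    """z = (X - M) / sd: number of sd a raw score falls from the mean."""
    return (raw - mean) / sd

# IQ examples: mean 100, sd 15
print(z_score(115, 100, 15))  # 1.0  (one sd above the mean)
print(z_score(70, 100, 15))   # -2.0 (two sd below the mean)
```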
IQ: Mean: 100 and SD 15
|
Of 1000 people, 680 (68%) will obtain scores of 85-115
950 (95%) will obtain scores of 70-130 |
|
Test mean of 60 and sd of 5. Select top 150 of 1000. Where is the cutoff score?
|
Top 150 of 1000 = top 15%. Remember the top 16% begins at z = +1, so convert a z-score of +1 to a raw score: 60 + 5(1 sd) = 65
|
|
Inferential Statistics
|
Allow researchers to draw conclusions about pop based on results from samples
|
|
Sampling Error
|
The diff bw the sample value (statistic) and the pop value (parameter)
|
|
Error of the mean (sampling error)
|
Diff bw statistic mean and parameter mean
|
|
Standard Error of the Mean
|
Extent to which a sample mean can be expected to deviate from the pop mean
SE mean = sd / √N (N = size of sample)
SE = 10 / √25 = 10/5 = 2 (sample mean expected to deviate 2 pts above or below the pop mean)
Error becomes smaller with a larger sample bc it is closer to the pop size
Inverse relat: as sample size increases, standard error decreases (and vice versa) |
|
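The SE formula and the inverse relationship with sample size can be sketched directly; the first call reproduces the card's example (sd = 10, N = 25):

```python
import math

def standard_error(sd, n):
    """SE of the mean = sd / sqrt(N); shrinks as the sample grows."""
    return sd / math.sqrt(n)

print(standard_error(10, 25))   # 2.0  (card's example)
# Same sd, larger sample -> smaller error (inverse relationship)
print(standard_error(10, 400))  # 0.5
```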
Statistical Hypothesis Testing
|
Must change question into a statistical hypothesis
|
|
Null v. Alternative Hypothesis
|
In a test of significance there are 2 competing hyp.
Null: no diff, means are equal, IV has no effect on DV
Alternative: experimental hyp, IV affects DV, means not equal
When one is rejected the other is accepted (and vice versa) |
|
Null Hypothesis in Population Parameters
|
Ho: μ1 = μ2
|
|
Alternative Hypothesis in Population Parameters
|
H1: μ1 ≠ μ2
|
|
One tailed v. two tailed hypothesis
|
Alternative hyp:
One-tailed: goes in one direction (increase or decrease)
Two-tailed: can go in either direction |
|
Statistical decision making
|
When using a test of signif there are 4 possible outcomes: a diff bw means exists, it does not exist, or the decision about either is incorrect
|
|
4 Types of outcomes in test of significance
|
1. Retain a true null - no effect/correct
2. Reject a false null - IV effect/correct - the goal; the probability of this decision is called Power
3. Type I error
4. Type II error |
|
Type 1 error and Alpha Level
|
Null rejected when it is actually true.
Probability of making a Type I error is the alpha level (α or p) - the p-value/level of significance
Alpha is set prior (.01 or .05): willing to accept a 5% chance of rejecting a true null |
|
Type II (Beta) Error and Power
|
Failure to reject the null when it's false.
Power: probability of not making a Type II error - how well the stat test will detect a true difference bw means - not known in advance |
|
Factors that affect power
|
1. Sample size: larger = more power
2. Alpha: as the level increases, power increases
3. Directional v. non-directional tests: one-tailed more powerful than two-tailed
4. Magnitude of the population difference: greater diff = more power |
|
Parametric Tests
|
Interval and ratio data: t-test and ANOVA. Assumptions:
1. Normal distribution
2. Homogeneity of variance (equal dispersion of scores around the mean)
3. Independence of obs: scores w/in a group not affected by each other
If assumptions are not met, the test can yield misleading results. Robust to mild violations of 1 and 2. |
|
Nonparametric Tests
|
Ordinal and nominal scale: chi-square, Mann-Whitney U
Don't make assumptions/distribution-free, but less powerful
Can be used with interval/ratio data that didn't meet assumptions by converting the data to ranks
One assumption: random selection |
|
The deviation of a sample statistic from a parameter of the pop from which the sample was drawn
|
Sampling error
|
|
The probability of rejecting a true null hypothesis
|
Alpha
|
|
The probability of retaining a false null hypothesis
|
Beta
|
|
The probability of rejecting a false null hypothesis
|
Power
|
|
Standard error of the mean is
|
Directly proportional to the sd (both increase together) and inversely related to the square root of the sample size (larger sample, smaller error)
|
|
When a statistical test lacks power, this means that
|
the probability of obtaining stat signif will be low - there is a high prob of making a Type II error (a false null is retained). The test will not yield stat signif (an effect) when it should
|
|
Alpha
|
probability of rejecting the null when it's true
|
|
Low Power
|
Unlikely to detect an effect of an IV, even if one is present, null will be retained
|
|
Steps in using stat tests
|
1. Null and alt hyp are stated
2. Data collected
3. Data analyzed w/ a test: formula yields a stat value (t, F); the test depends on the data and the number of IVs/DVs/groups
4. Stat value compared to a critical value, which depends on 1. alpha and 2. degrees of freedom. If it exceeds the critical value, the null is rejected |
|
Parametric Tests
|
t-test
One-way ANOVA
Factorial ANOVA
MANOVA |
|
t-Test
|
Tests hyp about 2 diff means (can't be used with more than 2).
3 types: one sample (df = N-1), independent samples (groups not related, df = N-2), correlated samples (related, matched, or pre/post; df = N-1, where N = number of pairs of scores)
If the two means are stat diff: null rejected |
|
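The independent-samples t on this card can be sketched with only the standard library, using the pooled-variance formula (the two score lists are made-up illustrative data, not from the cards):

```python
import math
import statistics

def independent_t(group1, group2):
    """Pooled-variance t for two independent samples; df = N - 2."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    # Pooled variance weights each sample variance by its degrees of freedom
    sp2 = ((n1 - 1) * statistics.variance(group1) +
           (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    t = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2  # t statistic and degrees of freedom

t, df = independent_t([5, 6, 7, 8], [1, 2, 3, 4])
print(round(t, 2), df)  # compare |t| to the critical value for df at alpha
```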
ANOVA
|
2+ groups compared (with only 2, a t-test is simpler).
Q: What is the prob that the means are from the same pop?
Yields: F ratio. If F is stat signif, the means are signif diff and the null is rejected
Does not indicate which groups differ |
|
Derivation of the F-Ratio
|
F: Comparison bw two estimates of variance:
1. B/w groups: degree the groups differ from one another (diff bw the means)
2. W/in groups (error): degree subjects differ from each other (diff bw scores). |
|
Derivation of the F ratio
|
Null is true if the two estimates of variance are about the same.
If the IV has an effect, b/w-group variance is signif greater than w/in-group variance (diff bw group means exceeds the diff due to error/indiv diff). |
|
Steps in calculating an F ratio
|
1. Sum of Squares
2. Degrees of Freedom
3. Mean Square
4. Calculate F ratio
5. Compare to critical value
6. Retain/reject null
7. Post hoc tests |
|
Sum of squares
|
Measure of variability of a set of data
SSb and SSw
Formula is not nec for exam |
|
Degrees of freedom
|
dfB = k-1 (k = number of groups)
dfW = N-k (N = number of subjects) |
|
Mean Square
|
Estimates of b/w- and w/in-group variance
MSB = SSb/dfB
MSW = SSw/dfW |
|
F-ratio formula
|
F = MSB/MSW
|
|
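The SS → df → MS → F pipeline from the cards above can be sketched end to end for a one-way ANOVA (the three groups here are made-up illustrative scores):

```python
import statistics

def one_way_f(groups):
    """F = MSB / MSW for a one-way ANOVA (sketch; handles unequal group sizes)."""
    all_scores = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_scores)
    k, n = len(groups), len(all_scores)
    # Between-group SS: how far each group mean sits from the grand mean
    ssb = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group SS: spread of scores around their own group mean (error)
    ssw = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    msb = ssb / (k - 1)  # dfB = k - 1
    msw = ssw / (n - k)  # dfW = N - k
    return msb / msw

f = one_way_f([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(f, 2))  # 27.0 -- compare to the critical value at dfB=2, dfW=6
```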
Last step in ANOVA
|
Compare the F ratio to the critical value at a signif level of .05 or .01. If F is higher than the critical value, then the null is rejected.
|
|
Post Hoc Tests for ANOVA
|
Done if F is signif, to determine which means differ.
Pairwise: bw 2 means
Complex: combined means
More post hoc comparisons = more likely Type I error (experiment-wise error) |
|
Types of Post Hoc tests
|
1. Scheffe: most conservative, greatest protection from Type I, increases Type II (missing a true effect)
2. Tukey Honestly Significant Difference: protection from Type I when only using pairwise |
|
Pairwise or complex comparisons "a priori"
|
use these tests instead of ANOVA when expected differences in means are stated in advance
|
|
Other forms of ANOVA
|
One-way ANOVA: One IV and more than 2 independent groups
Factorial ANOVA: two or more IVs
MANOVA: 2 or more DVs |
|
Other forms of ANOVA
|
One-way ANOVA for repeated measures: all subjects receive all levels of the IV, or for more than 2 matched groups.
ANCOVA: analysis of covariance: adjust DV to control effects of extraneous variables |
|
Factorial ANOVA
|
More than 1 IV, 2 IV (two way), 3 IV (three way)
Assess main effect: one IV by itself
Assess interaction effect: effect of an IV at different levels of another IV |
|
Factorial ANOVA
|
Assess main effects by examining diffs bw marginal means
Assess interaction effects by examining cell means
When there is an interaction effect, interpret the main effects with caution bc they may not generalize to all levels of the other IV
No interaction: cell means across or down move in the same direction (parallel lines); crossing (nonparallel) lines indicate an interaction |
|
Factorial F-ratio
|
2-way: 3 F ratios
3-way: 7 F ratios |
|
Variations of the Factorial ANOVA
|
Factorial ANOVA for repeated measures/or matched groups: all levels of IV given to a single group.
Mixed ANOVA or split-plot ANOVA: One bw subj IV and one repeated meas variable (w/in subj) |
|
Multivariate Analysis of Variance (MANOVA)
|
2+ DVs and 1+ IVs. Reduces Type I error compared with running multiple ANOVAs or factorial ANOVAs.
|
|
Chi-Square Test
|
Nonparametric - nominal data, survey questions - used when frequencies or number of subjects within each category are given
|
|
Chi-square statistic
|
χ2: indicates whether obtained frequencies in a set of categories differ from those expected under the null.
df = C-1 (single sample), C = number of categories
df = (C-1)(R-1) (multiple sample), R = number of levels of the 2nd variable |
|
Chi-square Caution
|
To avoid misleading results:
1. All obs must be indep
2. Each obs must be classifiable into only one category
3. Percentages of obs w/in categories can't be compared |
|
Calculating expected frequencies
|
Single sample: divide the total number of subj by the number of cells
Multiple sample: fe = (column total) X (row total) / total N |
|
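The expected-frequency rule and the chi-square statistic can be sketched together. This uses the shoppers-and-entrances example from later in the deck (100 shoppers, 4 entrances, so fe = N / C = 25); the observed counts are made up for illustration:

```python
def chi_square(observed, expected):
    """Chi-square statistic: sum of (fo - fe)^2 / fe across categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# Single-sample case: null of equal use -> fe = N / C = 100 / 4 = 25 per entrance
observed = [30, 20, 28, 22]               # hypothetical counts
expected = [100 / 4] * 4
x2 = chi_square(observed, expected)
print(x2)  # df = C - 1 = 3; compare to the critical value at alpha
```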
Mann-Whitney U Test
|
Rank ordered data w/ 2 indep groups
1. Data is rank-ordered, or 2. starts as interval/ratio but assumptions not met
Sub for: t-test for indep samples |
|
Wilcoxon Matched Pairs Test
|
When 2 correlated groups are being compared using rank ordered data
Sub for: t-test for correlated samples |
|
Kruskal-Wallis Test
|
More than 2 groups compared - an ANOVA for ranked data
Sub for: one-way ANOVA |
|
Correlation and the Correlation Coefficient
|
Correl: relat bw 2 or more variables
Coeff: number that ranges bw -1.00 and +1.00. Tells you 2 things about the relat:
Strength (absolute value: 1 = perfect, 0 = no relat)
Direction (pos = same direction, neg = inverse direction) |
|
Scattergram
|
Graph with X horiz(abscissa) and Y vert (ordinate)
|
|
Correlation and causality
|
High correl does not mean the variables have a causal relat, but a causal relat does mean they are correl
|
|
Pearson r
|
Pearson Product moment
Most used; a measure for ratio and interval data
Assumes the relationship is linear (not curvilinear) and homoscedastic (dispersion of scores is uniform across the scattergram; heteroscedastic = non-uniform scatter)
Will be highest when the full range of scores is used. |
|
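Pearson r can be sketched from the definitions on these cards (mean cross-product of deviations, scaled by the two sd's); the data sets are made-up perfect linear relationships to show strength and direction:

```python
import statistics

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sx, sy = statistics.pstdev(x), statistics.pstdev(y)
    n = len(x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n * sx * sy)

r_pos = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])  # perfect positive
r_neg = pearson_r([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])   # perfect inverse
print(r_pos, r_neg)  # approximately 1.0 and -1.0 (floating-point rounding)
```

Squaring r gives the coefficient of determination from the next card: r = .80 means 64% of the variability in one measure is accounted for by the other.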
Interpretation of Pearson r
|
Coefficient of determination: squaring the correl coeff indicates the percentage of variability in one measure that is accounted for by the other measure
|
|
Point Biserial and Biserial Coeff
|
Point-biserial: one continuous and one true dichotomous variable (income and gender) - rpb
Biserial coeff: 2 continuous variables w/ one made artificially dichotomous (exam scores split high/low, and income) - rb |
|
Phi and Tetrachoric Coeff
|
Phi: 2 true dichotomous variables - φ
Tetrachoric: 2 artificially dichotomized variables - rt |
|
Contingency
|
2 nominally scaled items, each with more than 2 categories (fathers' and sons' eye color)
|
|
Spearman's Rho
|
Rank order correl coeff, ordinally ranked (rank perf on 2 tests, 2 tests correl) - rs
|
|
Eta
|
Nonlinear relationships, where the pattern is U or inverted-U
η |
|
Regression
|
If 2 variables are correl, estimate the value of one variable (criterion) based on the value of the other (predictor).
Regression equation: plug the predictor score into the equation
Error is expected bc the correl is not perfect (±1) |
|
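Plugging a predictor score into a least-squares regression equation can be sketched as follows; the hours/scores data are hypothetical illustrative values:

```python
import statistics

def regression_line(x, y):
    """Least-squares slope and intercept for predicting y (criterion) from x (predictor)."""
    mx, my = statistics.mean(x), statistics.mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y)) /
             sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical study-hours (predictor) vs. test-score (criterion) data
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]
b, a = regression_line(hours, scores)
predicted = a + b * 6  # plug a new predictor score into the equation
print(round(b, 2), round(a, 2), round(predicted, 1))  # slope, intercept, prediction
```

The slope/intercept pair minimizes the total squared error, which is the least squares criterion from the next card.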
Regression Assumptions
|
Linear relat - line of best fit or regression line: the line that passes closest to the dots in the scattergram; the higher the correl, the closer the line to the dots and the better the prediction
Regression line determined by the least squares criterion: the line with the least amount of error |
|
Regression Assumptions
|
Error scores (diff bw predicted and actual criterion scores) are normally distrib w/ a mean of 0, and the relat w/ the criterion is homoscedastic
|
|
Regression as a substitute for ANOVA
|
Can substitute for a one-way ANOVA.
Dummy codes are used in the regression equation |
|
Multiple Correlation and Multiple Regression
|
Mult correl (Mult R): the relat bw 2 or more predictors and 1 criterion - the higher the Mult R, the stronger the relat.
Mult regres: mult predictors used to estimate scores on 1 criterion |
|
Multiple Correlation and Multiple Regression
|
1. Mult R (predictive power) is highest when predictors have a high correl with the criterion and low correls with each other (multicollinearity = overlapping predictors that provide no new info)
2. Mult R is never lower than the highest simple correl; adding redundant predictors adds little
3. Mult R can never be negative
4. R squared, the "coeff of mult determination": the proportion of variance in the criterion accounted for by the combo of predictors |
|
Stepwise mult regression
|
Used if you have a large number of predictors and want to use smaller subset bc less costly, avoid multicollinearity, increase predictive power
|
|
2 Types of stepwise mult regres
|
Forward: start with one predictor and add more until no change is seen
Backward: start with all predictors and remove them; stop when there is a signif decrease in Mult R |
|
Canonical Correlation
|
Mult criterion and mult predictors
|
|
Discriminant Function Analysis
|
Scores on 2 or more variables are combined to determine whether they can be used to predict which criterion group a person will belong to
Assumptions: multivariate normal distrib, homogeneity of variance and covariance |
|
Differential Validity
|
Each predictor has a diff correl w/ each criterion
An IQ test has low diff validity - it likely correls w/ many criterion measures, so it is not useful in placing indiv into groups |
|
Logistic Regression
|
Makes predictions about which group a person belongs to but does not carry the same assumptions; chosen over discrim funct analy when assumptions aren't met
Data can be continuous (ratio and interval) and categorical (nominal/rank); primarily used with dichotomous criteria (2 groups)
Predicted value is bw 0 and 1
Polytomous logistic regression: used with more than 2 groups |
|
Multiple Cutoff
|
ID diff cutoff scores on a series of predictors. Must score at or above each cutoff; scoring below any = failed
|
|
Partial Correlation
|
Partialling out the effect of a 3rd variable (suppressor variable) that contributes to the relationship
Zero-order correl: ignores all possible variables that could contribute to the relationship |
|
Structural Equation Modeling
|
Variety of techniques based on correls bw mult variables. Assumption: linear relations
1. Specify a causal model involving many diff variables - depicted in a path diagram
2. Conduct stat analy: correls bw all pairs
3. Interpret results: consistency w/ the model, best fit to the data |
|
Path Analysis v. LISREL
|
Path: one-way causal paths, observed variables
LISREL: one- and two-way causal paths, observed and latent variables |
|
Trend Analysis
|
Tests the effect in a repeated measures design with quantitative variables/groups - the trend of change (break points/bends) in Y over time, if signif
Linear - no bends
Quadratic - one
Cubic - 2
Quartic - 3 |
|
To determine how much variability in one measure is explained by variability in another
|
one squares the correl coeff (coefficient of determination)
|
|
Theoretical Sampling Distribution
|
Pop distrib: Using every score in pop
Sample distrib: using a sample set of scores; less variable bc not using the full range
Sampling distrib: using all possible sample values - sampling w/ replacement: obtain a sample, record it, return it, obtain a new sample. |
|
Central Limit Theorem
|
1. As sample size increases the shape of the sampling distrib of means approaches normal shape, even if scores not norm distrib
2. Mean of the sampling distrib is equal to the pop mean, and it is less variable than the pop
The SD of the sampling distrib tells how much a sample mean will deviate from the pop mean |
|
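Both CLT claims can be demonstrated by simulation: draw many samples (with replacement, as the sampling-distribution card describes) from a deliberately non-normal population. A sketch with made-up exponential-like data:

```python
import random
import statistics

random.seed(0)
# Population: heavily skewed (exponential-like), far from normal
population = [random.expovariate(1.0) for _ in range(10_000)]

# Sampling distribution of the mean: many samples of 50, drawn with replacement
sample_means = [statistics.mean(random.choices(population, k=50))
                for _ in range(2_000)]

# CLT: the mean of the sampling distrib ~= the pop mean, and its spread
# (the standard error) is far smaller than the population SD
print(round(statistics.mean(population), 2))
print(round(statistics.mean(sample_means), 2))
print(statistics.pstdev(sample_means) < statistics.pstdev(population))  # True
```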
Robustness of a Statistical Test
|
Robust: the rate of false rejection of the null (Type I error rate) is not increased by violations of assumptions (normal distrib and homogeneity of variance) - true of parametric tests
|
|
Time-series Analysis
|
One DV is meas mult times before and after a treatment.
Independence of obs is violated for parametric tests (t-test) bc means across measurements will be related. The correl of obs at given lags is autocorrelation |
|
Bayes Theorem
|
Formula for obtaining a special type of conditional probability (incorporating additional data - base rates).
Ex. the probability that an 85 y/o who tests pos for Alz actually has it - must use the base rate |
|
Meta-Analysis
|
Analyzing a group of indep studies with a common concept
Results of each study are used as "scores"
Statistic: effect size (size of the IV effect) - positive if tx is beneficial
Adv: yields an overall effect size despite conflicting results
Disadv: bias in selection of studies; ignores interactions (loss of info) |
|
In a normal distrib of scores, a T-score of 60 is approx equal to a percentile rank of
|
84
T is a standard score w/ a mean of 50 and sd of 10. 60 = +1 sd, and +1 sd = PR of 84 |
|
How many people would score bw 400 and 600 on a test w/ a mean of 500 and an sd of 100 (n=1000)
|
680
Convert to z-scores: 400 = -1 and 600 = +1. 68% fall bw -1 and +1, and 68% of 1000 = 680. |
|
Which scores distributions will likely have the least variability
|
A distrib of sample of means from the pop
|
|
Non parametric tests are used when
|
shape of distrib is unknown
|
|
In a oneway ANOVA, the null hyp is
|
Pop means are equal
|
|
N=400, mean is 50 and sd is 10, what is the standard error of the mean
|
.50
Formula for the standard error of the mean is sd divided by the square root of the sample size: 10 / √400 = 10/20 = .50 |
|
The number of cases falling bw a percentile rank of 11 and 20 will be _____ the number of cases falling bw 41 and 50.
|
The same as. The distrib of percentile ranks is flat, so equal intervals contain equal numbers of cases
|
|
One group tested pre and post treatment would use what stat test, df (N=40)?
|
t-test for correlated samples (same group tested twice).
39 (N-1) |
|
In determining whether shoppers equally likely to use east, north, south, west entrances, which stat test, amount in each cell (N=100)
|
Chi-square - compares freq of obs w/in categories
Expected frequency in each cell: 100/4 = 25 (equal frequency under the null) |
|
Assoc bw intell and happiness, uses mult meas, what stat analy
|
Canonical correl - correl mult predictors w/ mult criterion meas
|
|
Robust
|
If assumptions are not met, parametric tests remain robust as long as the groups' sample sizes are equal
|
|
T-score of 70 percentile rank
|
98 - 2 sd above mean of 50
|