62 Cards in this Set
- Front
- Back
What is the difference between parametric and nonparametric tests?
|
Parametric tests require interval-level measurement, normality, and equal variances; nonparametric tests do not make these assumptions.
|
|
Chi-Square - Research question
|
Do observed frequencies systematically differ from expected frequencies? If yes, significant.
|
|
Chi-Square uses what level of measurement?
|
Non-interval.
IVs involve discrete categories; the DV is in the form of frequencies, proportions, probabilities, or percentages |
|
One-way Chi-Square: null
|
observed distribution of frequencies = expected distribution of frequencies
|
|
One-way Chi-Square: assumptions
|
each observation falls into only one cell
observations are independent
expected frequency for each category is not less than 5 (df > 2) or 10 (df = 1) |
|
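Supplementary (not part of the card set): a minimal one-way chi-square sketch in Python with SciPy; the six-category counts are made-up illustration data.

```python
from scipy.stats import chisquare

# Hypothetical counts for a six-category variable (e.g. die rolls).
observed = [18, 22, 20, 25, 15, 20]

# With no f_exp argument, chisquare assumes equal expected frequencies
# (here 120 / 6 = 20 per category), matching the one-way null above.
stat, p = chisquare(observed)
# stat = sum((O - E)^2 / E); a large p means observed ≈ expected.
```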
Two-way Chi Square: purpose
|
Used when we have two IVs; determines whether the IVs are independent of one another
|
|
Two-way Chi Square: null
|
variables are independent in populations
|
|
Two-way Chi Square: assumptions
|
• Each observation falls into one cell
• Observations independent
• Expected frequency for each cell is not less than 5 (df > 2) or less than 10 (df = 1) |
|
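Supplementary: a two-way chi-square on a contingency table in Python with SciPy; the 2x3 counts are hypothetical.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x3 contingency table (two IVs: rows and columns).
table = np.array([[30, 20, 10],
                  [20, 30, 40]])

# Null: row and column variables are independent in the population.
stat, p, dof, expected = chi2_contingency(table)
# dof = (rows - 1) * (cols - 1); expected = row total * col total / N
```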
Fisher’s Exact Test
|
Given the marginal values (held constant), how unusual are the values in the table? Can compute the probability (what % of the time) of producing the observed values in the matrix.
|
|
McNemar's Test
|
nominal data, repeated measures, 2x2 contingency table. Determines whether the proportion of “success” changes over time. Ignores the diagonal and looks at the cells that indicate change (effectively a one-way chi-square). Are row and column frequencies equal?
|
|
Kappa
|
measure of agreement among judges. What proportion of the time do judges agree? Takes into account agreement due to chance. Assumes mutually exclusive, categorical data (judges put people into categories). Diagonal = judges see applicants exactly the same way.
|
|
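Supplementary: a hand-rolled Cohen's kappa from an agreement matrix, following the chance-correction idea on the card; the counts are hypothetical.

```python
import numpy as np

# Hypothetical 2x2 agreement matrix: rows = judge A, columns = judge B.
# Diagonal cells are cases where both judges chose the same category.
agreement = np.array([[20, 5],
                      [10, 15]])
n = agreement.sum()

p_o = np.trace(agreement) / n  # observed proportion of agreement
# chance agreement: sum over categories of row marginal * column marginal
p_e = (agreement.sum(axis=1) * agreement.sum(axis=0)).sum() / n**2
kappa = (p_o - p_e) / (1 - p_e)  # agreement corrected for chance
```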
Correlation: basic definition
|
Class of statistics that measure the relationship between variables. All range from -1.0 to +1.0. df = n - 2
|
|
Correlation: purpose
|
explain variance, predict score.
|
|
Correlation vs. covariance
|
Covariance is an index of the tendency for two variables to vary together, measured in the original units. Difficult to interpret.
Correlation is measured in standardized units so variables are on the same scale. |
|
Correlation: null hypothesis
|
ρ(xy) = 0 (correlation in population = 0)
|
|
Non Nil Null
|
ρxy ≠ 0
When ρ is assumed to be zero, the sampling distribution is normal. A correlation can’t go beyond one, so as ρ shifts away from zero the sampling distribution becomes skewed and non-normal. |
|
Fisher's Z
|
Convert r to z scores. Distributed as z; compare to the critical z value.
Use when:
• testing a non-nil null
• comparing two observed correlations
• averaging correlations |
|
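Supplementary: comparing two observed correlations via Fisher's z in Python; the r values and sample sizes are made-up.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical observed correlations from two independent samples.
r1, n1 = 0.60, 50
r2, n2 = 0.30, 60

z1, z2 = np.arctanh(r1), np.arctanh(r2)    # Fisher's r-to-z transform
se = np.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))  # SE of the difference in z's
z = (z1 - z2) / se
p = 2 * norm.sf(abs(z))                    # two-tailed test against z
```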
When will correlation be maximum?
|
Correlation will be maximum when both x and y are normally distributed and range is not restricted.
|
|
Correlation types: Pearson's r/Pearson's product moment correlation
|
2 continuous interval level variables, assume independent, normally distributed
|
|
Correlation types: Point Biserial r
|
(true) dichotomous variable x interval variable
|
|
Correlation types: Biserial r
|
artificial dichotomous variable x interval variable
|
|
Correlation types: Phi coefficient
|
two truly dichotomous variables
|
|
Correlation types: Spearman's rank order correlation/rho
|
two ordinal variables, compare ranks on each variable for each case, to the degree that the ranks are different, move away from 1.0.
|
|
Correlation types: Tetrachoric correlations
|
two dichotomous variables, at least one artificial
|
|
Correlation types: polychoric correlation
|
more than 2 levels, polytomous variables (i.e. likert scale), treat as ordinal level, but assume underlying distribution normal (measuring a latent interval variable)
|
|
Correlation types: gamma
|
ordinal data, based on concept of concordant and discordant pairs
|
|
Correlation types: Kendall's tau-b and tau-c
|
similar to gamma, but also accounts for tied pairs (which gamma excludes), so it is generally smaller in magnitude
|
|
Correlation types: Cramer's V
|
nominal variables with more than two levels
|
|
Linear Regression: purpose
|
use relationship to predict one variable from the other
minimize mistakes by finding "line of best fit" |
|
Linear Regression: equation of a line
|
y = a + bx
a is the constant (y-intercept, B0)
b is the slope (regression weight, B1)
Regress y onto x |
|
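Supplementary: fitting the line y = a + bx by least squares in NumPy, using the textbook formulas b = cov(x, y) / var(x) and a = mean(y) - b * mean(x); the data are toy values.

```python
import numpy as np

# Toy data for the regression of y onto x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)  # slope (B1)
a = y.mean() - b * x.mean()                          # intercept (B0)
y_hat = a + b * x                                    # predicted values
```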
Linear Regression
Beta Weight |
β1, regression weight in standardized units
|
|
Linear Regression
Evaluating Output: R |
correlation between x and y
always positive |
|
Linear Regression
Output: R^2 |
percent variance in y that can be explained given x
|
|
Linear Regression
Output: ANOVA |
statistical test of model fit, is R different from zero?
|
|
Linear Regression
Output: t-tests |
tests if regression weight significantly different from zero
|
|
Linear Regression
Output: standard error of estimate |
magnitude of residuals/errors. Large = poor fit. If x and y unrelated, best guess of y is mean of y. On average, will be 1 SD off. As soon as relationship between x and y is not zero, SEy will decrease (estimate becomes more accurate). By using regression equation, how big are the mistakes you are now making? Compared to SDy, how much have you improved?
• Function of correlation and number of cases
• Larger r2 and larger N, smaller standard error |
|
Linear Regression
Output: regression outliers |
Misfit, compare observed and predicted y for each case, how close is estimate? Can remove extreme cases from model and identify problematic cases.
|
|
Linear Regression
Output: Adjusted R^2 |
sample is not a perfect representation of the population, so R is always inflated; adjusted R^2 takes into account sampling error and the number of predictors, estimating R^2 in the population. Always lower than R^2.
|
|
Linear Regression
Output: Sums of Squares |
SS total is the total variance in the system to be explained
SS regression is the variance explained by the regression model
SS residual is whatever variance is left unexplained |
|
Linear Regression: assumptions
|
• Homoscedasticity
• Linearity
• Independence
• Normality |
|
Homoscedasticity
|
when we run regression, we assume the standard error is a constant term up and down the regression line; errors don’t vary across the range of predictions (same accuracy of prediction)
• Violation: at low levels of x, no variability in y; at higher levels, more variability in y |
|
Part-whole correlations (item-total correlations)
|
in measurement, we frequently correlate an item response with the total test score. The raw correlation would be problematic: the total score contains the item being correlated, so you are partly correlating the item with itself, overestimating the correlation (the correlation of anything with itself is 1.0).
|
|
Regression to the mean
|
if a variable is extreme on its first measurement, it will tend to be closer to the average on the second measurement. More extreme scores on one variable are connected to less extreme scores on the other; this is often seen as causal, but it is not. It is a characteristic of less-than-perfect data/correlation, and the only time it would not happen is if the correlation were perfect. It is a methodological artifact.
|
|
Nonparametric Tests: Mann-Whitney U
|
Similar to independent samples t-test, treats dependent variable as ordinal. Can’t talk about means.
|
|
MW-U: null
|
distribution of scores in the two populations from which samples were drawn are identical.
|
|
MW-U: requirements
|
1 IV with two levels and case appears in only one group
|
|
MW-U Rank sums test
|
Compare ranks of scores across 2 independent samples
• Order from lowest to highest, add ranks together
• If null is true, the rank sum of each group should be equal. Ex. Group 1 = , Group 2 = 35;
• U is defined as the number of Group 2 scores (because ranked higher) preceding each Group 1 score
• What is the probability associated with this? Based on all possible orderings.
• Output: < 20 cases, report exact significance; > 20 cases, asymptotic significance |
|
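Supplementary: a Mann-Whitney U run in Python with SciPy; the ordinal scores for the two groups are hypothetical.

```python
from scipy.stats import mannwhitneyu

# Hypothetical ordinal scores for two independent groups.
group1 = [3, 4, 2, 6, 2, 5]
group2 = [9, 7, 5, 10, 6, 8]

# Null: the two population distributions are identical.
u, p = mannwhitneyu(group1, group2, alternative='two-sided')
# U1 + U2 = n1 * n2; SciPy reports U for the first sample.
```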
Wald-Wolfowitz test
|
looks for runs in the data. More runs = less significant; fewer runs = more likely the groups are extreme (distinct)
|
|
Wilcoxon T
|
nonparametric analogue of the paired (dependent) samples t-test (treats the DV as ordinal)
|
|
Wilcoxon T rank sums test
|
compute a difference score and rank these differences, evaluate rank sum of positive differences in comparison to rank sum of negative differences. If positive or negative change is equally likely, sums will be close. T is defined as the smaller of the two numbers
|
|
Wilcoxon T requirements
|
one IV with two levels, case appears in both levels
|
|
Wilcoxon T null
|
sum of positive ranks will equal sum of negative ranks.
|
|
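Supplementary: a Wilcoxon signed-rank test in Python with SciPy; the paired before/after scores are made-up.

```python
from scipy.stats import wilcoxon

# Hypothetical paired scores (same cases measured twice).
before = [10, 12, 9, 15, 11, 13, 8, 14]
after = [12, 15, 10, 18, 14, 17, 9, 16]

# T = smaller of the positive and negative rank sums of the differences.
t_stat, p = wilcoxon(before, after)
```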
Kruskal-Wallis one-way ANOVA (KWANOVA)
|
Direct extension of the Mann-Whitney U to cases with > 2 IV groups
Compare rank sums across multiple groups
Test statistic = H, distributed as chi-square |
|
KWANOVA null
|
all group rank sums are equal
|
|
KWANOVA requirements
|
one IV with > 2 levels, cases in only one group, minimum number of cases in each group > 6
|
|
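Supplementary: a Kruskal-Wallis test in Python with SciPy; the three groups of scores are hypothetical (n = 7 each, satisfying the > 6 cases-per-group guideline above).

```python
from scipy.stats import kruskal

# Hypothetical scores for one IV with three independent groups.
g1 = [2.9, 3.0, 2.5, 2.6, 3.2, 2.8, 3.1]
g2 = [3.8, 2.7, 4.0, 2.4, 3.3, 3.5, 3.6]
g3 = [2.8, 3.4, 3.7, 2.2, 2.0, 3.9, 4.1]

h, p = kruskal(g1, g2, g3)  # H is evaluated against chi-square, df = k - 1
```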
Friedman's test
|
> 2 repeated measures, similar to repeated-measures ANOVA, distributed as chi-square
Compare mean ranks; significant means the mean rank across time periods is changing, NOT the mean of the data. |
|
Resampling
|
Used when comparing two means drawn from populations that are known to be non-normal (shouldn’t be computing a t-test or a correlation), when the needed test doesn’t exist, or when we don’t have confidence in the necessary assumptions
Need to compute our own distribution |
|
Sampling with replacement
|
the probability of choosing any value is always the same; the two sample values are independent (what we get on the first draw doesn’t affect the second), covariance = 0
|
|
Sampling without replacement
|
two sample values aren’t independent, covariance ≠ 0
|
|
Permutation/randomization test
|
treat our data as the population; redistribute the observed data without replacement to test the hypothesis of no effect. With two groups, compare them after each redistribution and build a distribution of the statistic.
Null: no difference between groups; drawn from the same population |
|
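Supplementary: a hand-rolled permutation test for a difference in means, following the redistribute-without-replacement idea above; the data and iteration count are illustrative.

```python
import numpy as np

# Hypothetical scores for two groups; the observed mean difference is
# compared against differences from repeatedly reshuffling the pooled data.
rng = np.random.default_rng(0)
a = np.array([4.1, 5.0, 6.2, 5.5, 4.8])
b = np.array([2.9, 3.5, 3.1, 4.0, 2.7])
observed = a.mean() - b.mean()

pooled = np.concatenate([a, b])
n_iter = 10_000
count = 0
for _ in range(n_iter):
    perm = rng.permutation(pooled)  # redistribute without replacement
    diff = perm[:len(a)].mean() - perm[len(a):].mean()
    if abs(diff) >= abs(observed):
        count += 1
p = count / n_iter  # share of reshuffles at least as extreme as observed
```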
Bootstrapping
|
sampling with replacement, what is variability within group, how stable is statistic? Helps establish confidence intervals.
|
|
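Supplementary: a percentile bootstrap (sampling with replacement) for a 95% confidence interval around a sample mean; the data and the number of resamples are arbitrary illustration choices.

```python
import numpy as np

# Hypothetical sample; resample it with replacement many times and look
# at the variability of the statistic (here, the mean) across resamples.
rng = np.random.default_rng(42)
sample = np.array([4.1, 5.0, 6.2, 5.5, 4.8, 5.9, 4.4, 5.1])

boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(5000)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])  # 95% confidence interval
```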
Monte Carlo
|
repeated sampling from population, trying to understand how sensitive statistical tests are, i.e. to departures from normality
|