62 Cards in this Set
- Front
- Back
What is the difference between parametric and nonparametric tests?
|
Parametric tests require interval-level measurement, normality, and equal variances; nonparametric tests do not make these assumptions.
|
|
Chi-Square - Research question
|
Do observed frequencies systematically differ from expected frequencies? If yes, significant.
|
|
Chi-Square uses what level of measurement?
|
Non-interval.
IVs involve discrete categories; the DV is in the form of frequencies, proportions, probabilities, or percentages |
|
One-way Chi-Square: null
|
observed distribution of frequencies = expected distribution of frequencies
|
|
One-way Chi-Square: assumptions
|
each observation falls into only one cell
observations are independent
expected frequency for each category is not less than 5 (df > 2) or 10 (df = 1) |
|
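Supplementary (not part of the card set): a minimal one-way chi-square sketch in Python with SciPy; the six-category counts are made-up illustration data.

```python
from scipy.stats import chisquare

# Hypothetical counts for a six-category variable (e.g. die rolls).
observed = [18, 22, 20, 25, 15, 20]

# With no f_exp argument, chisquare assumes equal expected frequencies
# (here 120 / 6 = 20 per category), matching the one-way null above.
stat, p = chisquare(observed)
# stat = sum((O - E)^2 / E); a large p means observed ≈ expected.
```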
Two-way Chi Square: purpose
|
Used when we have two IVs; determines whether the IVs are independent of one another
|
|
Two-way Chi Square: null
|
variables are independent in populations
|
|
Two-way Chi Square: assumptions
|
• Each observation falls into one cell
• Observations independent
• Expected frequency for each cell is not less than 5 (df > 2) or less than 10 (df = 1) |
|
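Supplementary: a two-way chi-square on a contingency table in Python with SciPy; the 2x3 counts are hypothetical.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x3 contingency table (two IVs: rows and columns).
table = np.array([[30, 20, 10],
                  [20, 30, 40]])

# Null: row and column variables are independent in the population.
stat, p, dof, expected = chi2_contingency(table)
# dof = (rows - 1) * (cols - 1); expected = row total * col total / N
```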
Fisher’s Exact Test
|
Given the marginal values (held constant), how unusual are the values in the table? Can compute the probability (what % of the time) of producing the observed values in the matrix.
|
|
McNemar's Test
|
nominal data, repeated measures, 2x2 contingency table. Determines whether the proportion of “success” changes over time. Ignores the diagonal and looks at the cells that indicate change (effectively a one-way chi-square). Are row and column frequencies equal?
|
|
Kappa
|
measure of agreement among judges. What proportion of the time do judges agree? Takes into account agreement due to chance. Assumes mutually exclusive, categorical data (judges put people into categories). Diagonal = judges see applicants exactly the same way.
|
|
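Supplementary: a hand-rolled Cohen's kappa from an agreement matrix, following the chance-correction idea on the card; the counts are hypothetical.

```python
import numpy as np

# Hypothetical 2x2 agreement matrix: rows = judge A, columns = judge B.
# Diagonal cells are cases where both judges chose the same category.
agreement = np.array([[20, 5],
                      [10, 15]])
n = agreement.sum()

p_o = np.trace(agreement) / n  # observed proportion of agreement
# chance agreement: sum over categories of row marginal * column marginal
p_e = (agreement.sum(axis=1) * agreement.sum(axis=0)).sum() / n**2
kappa = (p_o - p_e) / (1 - p_e)  # agreement corrected for chance
```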
Correlation: basic definition
|
Class of statistics that measure the relationship between variables. All range from -1.0 to +1.0. df = n - 2
|
|
Correlation: purpose
|
explain variance, predict score.
|
|
Correlation vs. covariance
|
Covariance is an index of the tendency for two variables to vary together, measured in the original units. Difficult to interpret.
Correlation is measured in standardized units so variables are on the same scale. |
|
Correlation: null hypothesis
|
ρ(xy) = 0 (correlation in population = 0)
|
|
Non Nil Null
|
ρxy ≠ 0
When ρ is assumed to be zero, the sampling distribution is normal. A correlation can’t go beyond one, so as ρ shifts away from zero the sampling distribution becomes skewed and non-normal. |
|
Fisher's Z
|
Convert r to z scores. Distributed as z; compare to the critical z value.
Use when:
• testing a non-nil null
• comparing two observed correlations
• averaging correlations |
|
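Supplementary: comparing two observed correlations via Fisher's z in Python; the r values and sample sizes are made-up.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical observed correlations from two independent samples.
r1, n1 = 0.60, 50
r2, n2 = 0.30, 60

z1, z2 = np.arctanh(r1), np.arctanh(r2)    # Fisher's r-to-z transform
se = np.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))  # SE of the difference in z's
z = (z1 - z2) / se
p = 2 * norm.sf(abs(z))                    # two-tailed test against z
```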
When will correlation be maximum?
|
Correlation will be maximum when both x and y are normally distributed and range is not restricted.
|
|
Correlation types: Pearson's r/Pearson's product moment correlation
|
2 continuous interval level variables, assume independent, normally distributed
|
|
Correlation types: Point Biserial r
|
(true) dichotomous variable x interval variable
|
|
Correlation types: Biserial r
|
artificial dichotomous variable x interval variable
|
|
Correlation types: Phi coefficient
|
two truly dichotomous variables
|
|
Correlation types: Spearman's rank order correlation/rho
|
two ordinal variables, compare ranks on each variable for each case, to the degree that the ranks are different, move away from 1.0.
|
|
Correlation types: Tetrachoric correlations
|
two dichotomous variables, at least one artificial
|
|
Correlation types: polychoric correlation
|
more than 2 levels, polytomous variables (i.e. likert scale), treat as ordinal level, but assume underlying distribution normal (measuring a latent interval variable)
|
|
Correlation types: gamma
|
ordinal data, based on concept of concordant and discordant pairs
|
|
Correlation types: Kendall's tau-b and tau-c
|
similar to gamma, but also accounts for tied pairs (which gamma excludes), so it is generally smaller in magnitude
|
|
Correlation types: Cramer's V
|
nominal variables with more than two levels
|
|
Linear Regression: purpose
|
use relationship to predict one variable from the other
minimize mistakes by finding "line of best fit" |
|
Linear Regression: equation of a line
|
y = a + bx
a is the constant (y-intercept, B0)
b is the slope (regression weight, B1)
Regress y onto x |
|
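Supplementary: fitting the line y = a + bx by least squares in NumPy, using the textbook formulas b = cov(x, y) / var(x) and a = mean(y) - b * mean(x); the data are toy values.

```python
import numpy as np

# Toy data for the regression of y onto x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)  # slope (B1)
a = y.mean() - b * x.mean()                          # intercept (B0)
y_hat = a + b * x                                    # predicted values
```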
Linear Regression
Beta Weight |
β1, regression weight in standardized units
|
|
Linear Regression
Evaluating Output: R |
correlation between x and y
always positive |
|
Linear Regression
Output: R^2 |
percent variance in y that can be explained given x
|
|
Linear Regression
Output: ANOVA |
statistical test of model fit, is R different from zero?
|
|
Linear Regression
Output: t-tests |
tests if regression weight significantly different from zero
|
|
Linear Regression
Output: standard error of estimate |
magnitude of residuals/errors. Large = poor fit. If x and y unrelated, best guess of y is mean of y. On average, will be 1 SD off. As soon as relationship between x and y is not zero, SEy will decrease (estimate becomes more accurate). By using regression equation, how big are the mistakes you are now making? Compared to SDy, how much have you improved?
• Function of correlation and number of cases
• Larger r2 and larger N, smaller standard error |
|
Linear Regression
Output: regression outliers |
Misfit, compare observed and predicted y for each case, how close is estimate? Can remove extreme cases from model and identify problematic cases.
|
|
Linear Regression
Output: Adjusted R^2 |
sample is not a perfect representation of the population, so R is always inflated; adjusted R^2 takes into account sampling error and the number of predictors, estimating R^2 in the population. Always lower than R^2.
|
|
Linear Regression
Output: Sums of Squares |
SS total is the total variance in the system to be explained
SS regression is the variance explained by the regression model
SS residual is whatever variance is left unexplained |
|
Linear Regression: assumptions
|
• Homoscedasticity
• Linearity
• Independence
• Normality |
|
Homoscedasticity
|
when we run regression, we assume the standard error is a constant term up and down the regression line; errors don’t vary across the range of predictions (same accuracy of prediction)
• Violation: at low levels of x, no variability in y; at higher levels, more variability in y |
|
Part-whole correlations (item-total correlations)
|
in measurement, we frequently correlate an item response with the total test score. The raw correlation would be problematic: the total score contains the item being correlated, so you are partly correlating the item with itself, overestimating the correlation (the correlation of anything with itself is 1.0).
|
|
Regression to the mean
|
if a variable is extreme on its first measurement, it will tend to be closer to the average on the second measurement. More extreme scores on one variable are connected to less extreme scores on the other; this is often seen as causal, but it is not. It is a characteristic of less-than-perfect data/correlation, and the only time it would not happen is if the correlation were perfect. It is a methodological artifact.
|
|
Nonparametric Tests: Mann-Whitney U
|
Similar to independent samples t-test, treats dependent variable as ordinal. Can’t talk about means.
|
|
MW-U: null
|
distribution of scores in the two populations from which samples were drawn are identical.
|
|
MW-U: requirements
|
1 IV with two levels and case appears in only one group
|
|
MW-U Rank sums test
|
Compare ranks of scores across 2 independent samples
• Order from lowest to highest, add ranks together
• If null is true, the rank sum of each group should be equal. Ex. Group 1 = , Group 2 = 35;
• U is defined as the number of Group 2 scores (because ranked higher) preceding each Group 1 score
• What is the probability associated with this? Based on all possible orderings.
• Output: < 20 cases, report exact significance; > 20 cases, asymptotic significance |
|
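Supplementary: a Mann-Whitney U run in Python with SciPy; the ordinal scores for the two groups are hypothetical.

```python
from scipy.stats import mannwhitneyu

# Hypothetical ordinal scores for two independent groups.
group1 = [3, 4, 2, 6, 2, 5]
group2 = [9, 7, 5, 10, 6, 8]

# Null: the two population distributions are identical.
u, p = mannwhitneyu(group1, group2, alternative='two-sided')
# U1 + U2 = n1 * n2; SciPy reports U for the first sample.
```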
Wald-Wolfowitz test
|
looks for runs in the data. More runs = less significant; fewer runs = more likely the groups are extreme (distinct)
|
|
Wilcoxon T
|
nonparametric analogue of the paired (dependent) samples t-test (treats the DV as ordinal)
|
|
Wilcoxon T rank sums test
|
compute a difference score and rank these differences, evaluate rank sum of positive differences in comparison to rank sum of negative differences. If positive or negative change is equally likely, sums will be close. T is defined as the smaller of the two numbers
|
|
Wilcoxon T requirements
|
one IV with two levels, case appears in both levels
|
|
Wilcoxon T null
|
sum of positive ranks will equal sum of negative ranks.
|
|
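Supplementary: a Wilcoxon signed-rank test in Python with SciPy; the paired before/after scores are made-up.

```python
from scipy.stats import wilcoxon

# Hypothetical paired scores (same cases measured twice).
before = [10, 12, 9, 15, 11, 13, 8, 14]
after = [12, 15, 10, 18, 14, 17, 9, 16]

# T = smaller of the positive and negative rank sums of the differences.
t_stat, p = wilcoxon(before, after)
```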
Kruskal-Wallis one-way ANOVA (KWANOVA)
|
Direct extension of the Mann-Whitney U to cases with > 2 IV groups
Compare rank sums across multiple groups
Test statistic = H, distributed as chi-square |
|
KWANOVA null
|
all group rank sums are equal
|
|
KWANOVA requirements
|
one IV with > 2 levels, cases in only one group, minimum number of cases in each group > 6
|
|
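Supplementary: a Kruskal-Wallis test in Python with SciPy; the three groups of scores are hypothetical (n = 7 each, satisfying the > 6 cases-per-group guideline above).

```python
from scipy.stats import kruskal

# Hypothetical scores for one IV with three independent groups.
g1 = [2.9, 3.0, 2.5, 2.6, 3.2, 2.8, 3.1]
g2 = [3.8, 2.7, 4.0, 2.4, 3.3, 3.5, 3.6]
g3 = [2.8, 3.4, 3.7, 2.2, 2.0, 3.9, 4.1]

h, p = kruskal(g1, g2, g3)  # H is evaluated against chi-square, df = k - 1
```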
Friedman's test
|
> 2 repeated measures, similar to repeated-measures ANOVA, distributed as chi-square
Compare mean ranks; significant means the mean rank across time periods is changing, NOT the mean of the data. |
|
Resampling
|
Used when comparing two means drawn from populations that are known to be non-normal (shouldn’t be computing a t-test or a correlation), when the needed test doesn’t exist, or when we don’t have confidence in the necessary assumptions
Need to compute our own distribution |
|
Sampling with replacement
|
the probability of choosing any value is always the same; the two sample values are independent (what we get on the first draw doesn’t affect the second), covariance = 0
|
|
Sampling without replacement
|
two sample values aren’t independent, covariance ≠ 0
|
|
Permutation/randomization test
|
treat our data as the population; redistribute the observed data without replacement to test the hypothesis of no effect. With two groups, compare them after each redistribution and build a distribution of the statistic.
Null: no difference between groups; drawn from the same population |
|
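Supplementary: a hand-rolled permutation test for a difference in means, following the redistribute-without-replacement idea above; the data and iteration count are illustrative.

```python
import numpy as np

# Hypothetical scores for two groups; the observed mean difference is
# compared against differences from repeatedly reshuffling the pooled data.
rng = np.random.default_rng(0)
a = np.array([4.1, 5.0, 6.2, 5.5, 4.8])
b = np.array([2.9, 3.5, 3.1, 4.0, 2.7])
observed = a.mean() - b.mean()

pooled = np.concatenate([a, b])
n_iter = 10_000
count = 0
for _ in range(n_iter):
    perm = rng.permutation(pooled)  # redistribute without replacement
    diff = perm[:len(a)].mean() - perm[len(a):].mean()
    if abs(diff) >= abs(observed):
        count += 1
p = count / n_iter  # share of reshuffles at least as extreme as observed
```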
Bootstrapping
|
sampling with replacement, what is variability within group, how stable is statistic? Helps establish confidence intervals.
|
|
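Supplementary: a percentile bootstrap (sampling with replacement) for a 95% confidence interval around a sample mean; the data and the number of resamples are arbitrary illustration choices.

```python
import numpy as np

# Hypothetical sample; resample it with replacement many times and look
# at the variability of the statistic (here, the mean) across resamples.
rng = np.random.default_rng(42)
sample = np.array([4.1, 5.0, 6.2, 5.5, 4.8, 5.9, 4.4, 5.1])

boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(5000)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])  # 95% confidence interval
```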
Monte Carlo
|
repeated sampling from population, trying to understand how sensitive statistical tests are, i.e. to departures from normality
|