46 Cards in this Set
- Front
- Back
- 3rd side (hint)
When assessing a test's split-half reliability, what effect will increasing the heterogeneity of the sample (S's) have?
|
Increase the reliability
|
|
|
What is eta squared
|
A measure of effect size describing the amount of variability in the criterion accounted for by the predictor
|
|
|
What is Cohen's d
|
Cohen's d is a measure of effect size when comparing the mean difference between two groups
|
|
|
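The computation on this card can be sketched in a few lines of Python (the `cohens_d` helper name and the example groups are mine, and the pooled-SD form of the denominator is assumed):

```python
from statistics import mean, stdev

def cohens_d(group1, group2):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = stdev(group1), stdev(group2)
    pooled_sd = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return (mean(group1) - mean(group2)) / pooled_sd

# Two groups whose means differ by 1 and whose SDs are both 2:
print(cohens_d([2, 4, 6], [1, 3, 5]))  # 0.5
```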
Cohen's d is the effect-size measure for a two-group mean comparison; the analogous effect-size measure for ANOVA/regression is _________
|
eta squared
|
|
|
Professor Sharp argues against raising the legal drinking age from 18 to 21 on the grounds that doing so will only encourage 18-, 19-, and 20-year-olds to drink. Apparently, Sharp is familiar with which of the following theories:
individuation, psychological reactance, inoculation, expectancy-value |
reactance
|
|
|
How do you calculate a CI for a S's score on a test
|
To calculate a CI you need:
1. the S's score
2. the SEM of the test
3. the level of confidence desired
e.g., for a test w/ SEM of 5, if a S scored 105:
68% CI = within 1 SEM: 105 +/- 5 = 100-110
95% CI = within 2 SEM: 105 +/- (2x5) = 95-115
99% CI = within 3 SEM: 105 +/- (3x5) = 90-120 |
|
|
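The arithmetic above is simple enough to sketch in Python (the `score_ci` helper is hypothetical; it just applies score +/- n x SEM):

```python
def score_ci(score, sem, n_sems):
    """CI around an observed test score: score +/- n_sems * SEM
    (1 SEM ~ 68%, 2 SEM ~ 95%, 3 SEM ~ 99%)."""
    return (score - n_sems * sem, score + n_sems * sem)

print(score_ci(105, 5, 1))  # (100, 110) -> 68% CI
print(score_ci(105, 5, 2))  # (95, 115)  -> 95% CI
print(score_ci(105, 5, 3))  # (90, 120)  -> 99% CI
```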
What is the formula for SEM (the standard error of measurement)?
|
SEM = SDx[sqrt(1-rxx)], where SDx = SD of the test scores and rxx = the test's reliability coefficient
(Don't confuse it with the standard error of the mean, sigma/sqrt N)
|
|
|
What formula is
SDy[sqrt(1-r2xy)], where SDy = SD of criterion (Y) scores and rxy = correlation coefficient (an estimate of criterion validity) |
Standard Error of Estimate
It is used to create CIs for estimates of performance on a criterion (Y variable) derived from a regression equation involving the predictor (X variable) score.
CI = Yhat +/- SEest x (# of SEest assoc. w/ desired CI)
68% CI = +/- 1 SEest
95% CI = +/- 2 SEest
99% CI = +/- 3 SEest |
|
|
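A sketch of the SEest formula and the CI it yields (the helper names `se_estimate` and `prediction_ci` are mine; SDy = 10 and rxy = .6 are made-up values):

```python
import math

def se_estimate(sd_y, r_xy):
    """Standard Error of Estimate: SDy * sqrt(1 - rxy^2)."""
    return sd_y * math.sqrt(1 - r_xy**2)

def prediction_ci(y_hat, sd_y, r_xy, n_sees):
    """CI around a predicted criterion score: Yhat +/- n_sees * SEest."""
    see = se_estimate(sd_y, r_xy)
    return (y_hat - n_sees * see, y_hat + n_sees * see)

print(round(se_estimate(10, 0.6), 6))      # 8.0
lo, hi = prediction_ci(100, 10, 0.6, 2)
print(round(lo), round(hi))                # 84 116 -> ~95% CI
```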
What is criterion related validity
|
Criterion Related Validity is the extent to which a predictor (X) reliably estimates performance on a criterion (Y) using a regression equation
|
|
|
What is construct validity
How is it assessed? |
Construct validity refers to the extent to which a predictor measures the concept it is designed to measure.
Assess it using factor analysis or multitrait-multimethod analysis |
|
|
What are the 2 types of Construct validity?
|
1. Discriminant validity
2. Convergent validity
Assess it using: 1. Factor Analysis 2. Multitrait-Multimethod Matrix |
Construct Validity = DC FM
Construction of the Antenna towers on Wisconsin; DC reception of FM radio stations |
|
What are the 2 types of Criterion Validity
How do you assess them |
predictive and concurrent
Predictive validity is assessed by calculating the Standard Error of Estimate: SEest = SDy[sqrt(1-r2xy)], where SDy = SD of criterion scores (Y scores) and rxy = validity coefficient (r2xy indicates how much of the variance in Y is predicted by variability in X) |
Criterion Validity is PC
|
|
Reliability of a test= sum of what two things
|
Communality + specificity
where communality = the proportion of the test's variance explained by factors it shares with other measures, and specificity = systematic variance unique to the test. Reliability = communality + specificity = 1 - error variance |
|
|
What is Content Validity?
For what kind of tests is it important |
Content validity is important for:
1. Mastery tests (how much does a S know about a specific domain - like the EPPP)
2. Achievement tests (same as mastery?)
3. Tests of the KSAs needed for a particular job (use a Job Analysis to determine the KSAs)
4. Job samples
It can be evaluated by asking "experts" to examine the test and see if it seems representative of the knowledge in the domain, AND by testing whether it discriminates between novices and experts in the field |
Content Validity =
1. Discrimination between experts and novices 2. Experts' evaluation of the test 3. Correlation w/ established measures. Content Validity = DEC |
|
What are the 3 mnemonics for the 3 types of validity
|
Construct DCFM towers
Criterion is PC Content = ED and content validity is often all that elementary and secondary mastery tests are about |
Think about the progression of my time at AU
Construct(ion) of DCFM towers on the way in; Criterion = PC (the criterion for being liked on college campuses is: are you PC?); ED content = what was the content of my education at AU? |
|
What is the nature of the relationship between a test's reliability and validity
|
A test's validity coefficient cannot exceed the square root of its reliability coefficient
|
Memorize it!
Write out the formula to help decode difficult wording! Hints: 1. the reliability coefficient (rxx) is interpreted as an already-squared index (a proportion of variance) 2. the relationship involves taking the square root of something |
|
In terms of percentile ranks, where do 1sd, 2sd and 3sd above the mean fall?
|
84
97.7
99.9 |
different from normal distribution
|
|
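These percentile ranks come straight from the normal CDF, which Python's stdlib can verify:

```python
from statistics import NormalDist

# Percentile rank of scores 1, 2, and 3 SDs above the mean of a normal curve
for z in (1, 2, 3):
    print(z, round(NormalDist().cdf(z) * 100, 1))
# 1 -> 84.1, 2 -> 97.7, 3 -> 99.9
```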
If the assumptions for a One-Way ANOVA are violated, what test could you use instead
|
Kruskal-Wallis
|
|
|
What are the 3 assumptions when using a parametric test?
|
1. NORMAL POPULATION DISTRIBUTION (violation not too serious as long as you compensate by using a sample size >= 25-30)
2. HOMOSCEDASTICITY - the variances of the populations are equal (if you violate this assumption, it's usually not too serious)
3. INDEPENDENT OBSERVATIONS - very serious: violation increases the likelihood of Type I and Type II errors |
|
|
What 2 assumptions do Parametric and Non-parametric tests share?
|
1. Random Selection from Population
2. Independent Observations |
|
|
If you have ordinal data w/ 2 independent groups, what test do you use to see if there are stat. sig. differences between the groups?
|
Mann-Whitney U
|
|
|
If you have Nominal Data, what test do you use to compare cell frequencies?
|
Chi-Square
|
|
|
If the assumption of normality is violated for a one-way ANOVA, what non-parametric test could you use instead?
|
Kruskal-Wallis
|
|
|
Non-parametric test to compare 2 independent groups?
|
Mann-Whitney U
|
U looks like two lines of data if you eliminate the bottom
|
|
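A toy sketch of how the U statistic itself is counted (the `mann_whitney_u` helper and the data are mine; a real analysis would use a stats package for the p-value):

```python
def mann_whitney_u(x, y):
    """U = number of (xi, yj) pairs where xi beats yj (ties count 0.5);
    report the smaller of the two groups' U values."""
    u_x = sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)
    return min(u_x, len(x) * len(y) - u_x)

# Complete separation between the groups gives the most extreme U of 0:
print(mann_whitney_u([7, 8, 9], [1, 2, 3]))  # 0.0
```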
Non-parametric test to compare 2 dependent (repeated-measures) groups?
|
Wilcoxon Matched-Pairs Signed-Rank Test
|
|
|
What test would you substitute for an independent measures t-test if the assumption of normality were violated or the data were ordinal?
|
Mann-Whitney U
|
|
|
What test would you substitute for a repeated measures t-test (or RM ANOVA w/ 1 IV) if the data were Ordinal or if the assumption of normality was violated?
|
Wilcoxon Matched-Pairs Signed-Rank Test
|
|
|
If you want to use 2 or more continuous predictor variables (IV's) to predict performance on a nominal criterion (e.g. Yes/No or DSM-IV TR Dx.), what test should you use?
|
Discriminant Function Analysis
|
|
|
If you want to examine the relationship between 2 or more IV's (continuous and/or discrete) and a dichotomous Y variable, what type of test should you use?
|
Logistic Regression
|
|
|
If you want to test a pre-defined theory about causal relationship(s) between two variables, and the relationship goes one way (X causes Y, but not vice versa), what test (type of modeling) should you use?
|
Path Analysis
|
Think of Life Path - it goes only one way
|
|
For what is the Spearman-Brown formula used?
|
correction for split-half reliability calculation
|
Split half - Spearman Brown - alliterative,
rhythm of saying them together |
|
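The Spearman-Brown prophecy formula is simple enough to sketch (the `spearman_brown` name is mine; `factor=2` is the split-half correction):

```python
def spearman_brown(r, factor=2.0):
    """Predicted reliability when test length is multiplied by `factor`:
    n*r / (1 + (n-1)*r). factor=2 corrects a split-half correlation."""
    return factor * r / (1 + (factor - 1) * r)

# A half-test correlation of .60 implies a full-length reliability of .75:
print(round(spearman_brown(0.6), 2))  # 0.75
```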
If you are assessing inter-item consistency (reliability), and the variable is continuous, you would use:
|
Cronbach's alpha
|
e.g. score on MOCI for MA Thesis
|
|
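Cronbach's alpha can be sketched from its variance formula (the `cronbach_alpha` helper and the tiny dataset are mine):

```python
from statistics import pvariance

def cronbach_alpha(items):
    """items: one list of scores per item (same subjects in each).
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(items)
    totals = [sum(subject_scores) for subject_scores in zip(*items)]
    return k / (k - 1) * (1 - sum(pvariance(i) for i in items) / pvariance(totals))

# Two perfectly parallel items -> alpha = 1.0
print(cronbach_alpha([[1, 2, 3], [1, 2, 3]]))  # 1.0
```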
If you are assessing inter-item consistency and the variable is dichotomous (e.g. yes/no), what formula (the special case of Cronbach's alpha for dichotomous items) would you use?
|
Kuder Richardson-20
|
Kuder-Richardson 20 has a 2 in it and the DV on this kind of test has only 2 possible answers
|
|
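KR-20 replaces alpha's item variances with p*q for 0/1 items; a sketch (the `kr20` helper and the data are mine):

```python
from statistics import pvariance

def kr20(items):
    """items: one list of 0/1 responses per item (same subjects in each).
    KR-20 = k/(k-1) * (1 - sum of p*q over items / variance of total scores)."""
    k = len(items)
    totals = [sum(subject) for subject in zip(*items)]
    pq = sum((sum(i) / len(i)) * (1 - sum(i) / len(i)) for i in items)
    return k / (k - 1) * (1 - pq / pvariance(totals))

# Two items that everyone either passes or fails together -> KR-20 = 1.0
print(kr20([[1, 1, 0, 0], [1, 1, 0, 0]]))  # 1.0
```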
To assess inter-rater reliability, what two things could you use?
|
1. simple percent agreement
2. Kappa statistic |
|
|
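Kappa corrects simple percent agreement for agreement expected by chance; a sketch (the `cohens_kappa` helper and the ratings are mine):

```python
def cohens_kappa(rater1, rater2):
    """kappa = (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    chance = sum((rater1.count(c) / n) * (rater2.count(c) / n)
                 for c in set(rater1) | set(rater2))
    return (observed - chance) / (1 - chance)

r1 = ["yes", "yes", "no", "no"]
r2 = ["yes", "no", "no", "no"]
print(cohens_kappa(r1, r2))  # 0.5: 75% raw agreement, 50% expected by chance
```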
How can you increase a test's reliability?
|
1. Increase homogeneity of test items
2. Increase heterogeneity of subjects taking the test
3. Increase test length |
|
|
What are the three types of Validity
|
CRITERION (use regression, Standard Error of Estimate)
1. Predictive 2. Concurrent
CONTENT (ask an expert what s/he thinks)
1. Expert's opinion 2. Does it discriminate between experts and novices in the field?
CONSTRUCT
Two Types: 1. Convergent 2. Discriminant
Two Ways to Assess: 1. Factor Analysis 2. Multi-trait, Multi-method |
Three C's w/ two types of validity each
1. Content (experts' opinion)
2. Construct: convergent validity, discriminant validity; factor analysis; multi-trait, multi-method
3. Criterion: Predictive Validity (regression, Standard Error of Estimate), Concurrent Validity |
|
Validity of a Predictor is limited by its
|
Reliability
Validity <= (sqrt Reliability) |
Think about the formula for calculating the Standard Error of Estimate (for Confidence Intervals when predicting a score on the Criterion from the predictor using a regression equation)
SEest = SDy[sqrt(1-r2xy)] |
|
Validity of Predictor is less than or equal to the sqrt
and is less than or = to: |
The validity of a predictor (rxy) is less than or equal to the square root of its reliability:
rxy <= sqrt(rxx), and is less than or equal to the square root of the product of the reliability of the predictor and the reliability of the criterion: rxy <= sqrt(rxx x ryy) |
|
|
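The ceiling relationship is one line of arithmetic (the `max_validity` name is mine; reliabilities of .81 and 1.0 are made-up values):

```python
import math

def max_validity(reliability_x, reliability_y=1.0):
    """Upper bound on the validity coefficient: rxy <= sqrt(rxx * ryy)."""
    return math.sqrt(reliability_x * reliability_y)

# A predictor with reliability .81 can correlate at most .9 with a
# perfectly reliable criterion:
print(max_validity(0.81))  # 0.9
```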
If you want to calculate the correlation between two variables which are both continuous, which of the following would you use?
Biserial, Point Biserial, Rho, Pearson r |
Pearson r
|
|
|
If you want to calculate the correlation between 2 variables and one is continuous and the other is a true dichotomy, what test would you use?
|
Point Biserial
|
If something is pointed, it is specific
|
|
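The point-biserial is just Pearson r with the true dichotomy coded 0/1; a sketch (the `pearson_r` helper and the data are mine):

```python
from statistics import mean, pstdev

def pearson_r(x, y):
    """Pearson correlation: covariance / (sd_x * sd_y)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

# Point-biserial: code the true dichotomy (e.g. fail/pass) as 0/1
scores = [10, 12, 14, 20, 22, 24]
passed = [0, 0, 0, 1, 1, 1]
print(round(pearson_r(scores, passed), 3))  # 0.951
```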
If you want to calculate the relationship between two rank-ordered variables, which of the following would you use?
Pearson r, Biserial, Point Biserial, rho |
rho
|
|
|
If you want to calculate the relationship between two variables, one of which is continuous and the other of which is an artificial dichotomy, which would you use?
|
Biserial
|
|
|
In Item Response Theory, when one creates an Item Characteristic Curve (ICC), what is on the X-axis and what is on the Y-axis
|
X-axis: S's ability level (low to high)
Y-axis: % of S's who get item right |
|
|
When using ICC, what does the slope indicate?
|
Discriminative ability for S's high and low in the ability being assessed (a sharp slope discriminates well, a flat slope does not)
|
|
|
When using ICC, how do you know what the chance of guessing correctly is?
|
Y-intercept
|
|
|
When using ICC, how do you determine how difficult an item is?
|
at what level of ability (where on the X-axis) do 50% of S's get the item right
|
|