
21 Cards in this Set

  • Front
  • Back
Kappa Statistic - Think of...
Inter-Rater Reliability
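As a rough illustration of how a kappa statistic corrects agreement for chance, here is a minimal sketch of Cohen's kappa for two raters. The function name and the sample ratings are hypothetical, and the sketch assumes both raters rated the same cases with categorical labels and that chance agreement is below 1.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(rater_a)
    # Observed proportion of cases on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical ratings: the raters agree on 3 of 4 cases.
print(cohens_kappa(["y", "y", "n", "n"], ["y", "n", "n", "n"]))
```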
Item Difficulty:
p = # who got the item correct / # who answered the item

p = .50 is ideal in general; .75 is ideal for true/false items (midway between chance and 1.0)
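The ratio above can be sketched in a few lines. Both function names are hypothetical, and the second helper assumes the common rule of thumb that the target difficulty sits midway between the chance success rate and 1.0, which reproduces the .75 figure for true/false items.

```python
def item_difficulty(responses):
    """Proportion of examinees who answered the item correctly (p)."""
    if not responses:
        raise ValueError("need at least one response")
    return sum(1 for r in responses if r) / len(responses)


def chance_adjusted_target(n_options):
    """Hypothetical helper: midpoint between chance success and 1.0."""
    chance = 1 / n_options
    return (1 + chance) / 2


# Two of four examinees answered correctly.
print(item_difficulty([True, True, False, False]))
# Target difficulty for a two-option (true/false) item.
print(chance_adjusted_target(2))
```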
"D" Item Discrimination:
Ranges from -1.0 to +1.0; the item's ability to discriminate between high and low scorers.
Approaches +1.0 when only the high group gets the item correct; approaches -1.0 when only the low group does.
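One common way to compute D is the proportion correct in the high-scoring group minus the proportion correct in the low-scoring group. The sketch below assumes that upper-minus-lower form and assumes the two groups have already been formed; the function name is hypothetical.

```python
def discrimination_index(upper_correct, lower_correct):
    """D = proportion correct in the upper group minus the lower group.

    Each argument is a list of booleans (True = answered correctly).
    """
    if not upper_correct or not lower_correct:
        raise ValueError("both groups need at least one examinee")
    p_upper = sum(1 for r in upper_correct if r) / len(upper_correct)
    p_lower = sum(1 for r in lower_correct if r) / len(lower_correct)
    return p_upper - p_lower


# Only the high group answers correctly: D reaches its maximum.
print(discrimination_index([True, True, True], [False, False, False]))
```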
True +
False +

True -
False -
T+ = Hi Predict, Hi Crit
F+ = Hi Predict, Low Crit

T- = Low Predict, Low Crit
F- = Low Predict, Hi Crit

Criterion: True/False
Predictor: Positive/negative
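The four outcomes can be sketched as a small lookup: the predictor decides positive versus negative, and agreement with the criterion decides true versus false. The function name and boolean inputs are hypothetical.

```python
def classify_outcome(predictor_high, criterion_high):
    """Label a case as a true/false positive/negative.

    predictor_high: True if the predictor score is above its cutoff.
    criterion_high: True if the criterion score is above its cutoff.
    """
    sign = "positive" if predictor_high else "negative"
    truth = "true" if predictor_high == criterion_high else "false"
    return f"{truth} {sign}"


# Low on the predictor but high on the criterion: a missed case.
print(classify_outcome(False, True))
```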
Examples of Norm Referenced:

Z-Score (mean = 0, SD = 1)

T-Score (mean = 50, SD = 10)
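A minimal sketch of both conversions, assuming the norm group's mean and standard deviation are known. The function names and the example values (a raw score of 115 on a scale with mean 100 and SD 15) are illustrative only.

```python
def z_score(raw, mean, sd):
    """Standard score with mean 0 and SD 1."""
    return (raw - mean) / sd


def t_score(raw, mean, sd):
    """Standard score with mean 50 and SD 10."""
    return 50 + 10 * z_score(raw, mean, sd)


# One SD above the norm-group mean.
print(z_score(115, 100, 15), t_score(115, 100, 15))
```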
Extent that test items contribute to goals of testing.

Ways to get it:
AKA Consistency:

Alternate Forms (e.g., Forms A & B of the WJ-III)
Split-Half (Spearman-Brown)
Coefficient Alpha (KR-20)
Inter-Rater (Kappa Stat)
Std Error Measurement:
For measurement error:
CI around the test score.
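A sketch of the usual calculation, assuming the standard formula SEM = SD * sqrt(1 - reliability) and a normal-theory interval. The function names, the default z of 1.96 (roughly a 95% interval), and the example values are assumptions for illustration.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD of the test scores * sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1 - reliability)


def score_confidence_interval(score, sd, reliability, z=1.96):
    """Confidence interval around an obtained test score."""
    sem = standard_error_of_measurement(sd, reliability)
    return (score - z * sem, score + z * sem)


# Hypothetical test: SD = 15, reliability = .84, obtained score = 100.
print(standard_error_of_measurement(15, 0.84))
print(score_confidence_interval(100, 15, 0.84))
```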
Std Error Estimate:
Error for predicting criterion from predictor.

CI around criterion.
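A parallel sketch for prediction error, assuming the standard formula SEE = SD of the criterion * sqrt(1 - r squared), where r is the predictor-criterion correlation. The function names and example values are hypothetical.

```python
import math


def standard_error_of_estimate(sd_criterion, validity):
    """SEE = SD of the criterion * sqrt(1 - validity coefficient squared)."""
    return sd_criterion * math.sqrt(1 - validity ** 2)


def predicted_criterion_interval(predicted, sd_criterion, validity, z=1.96):
    """Confidence interval around a predicted criterion score."""
    see = standard_error_of_estimate(sd_criterion, validity)
    return (predicted - z * see, predicted + z * see)


# Hypothetical criterion: SD = 10, predictor-criterion correlation = .60.
print(standard_error_of_estimate(10, 0.6))
```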
2 Types of Construct Validity:
CV: Extent to which the test measures the hypothetical construct (e.g., intelligence).

Convergent: High correlations with other measures of the same construct.

Discriminant: Low correlations with measures of different, unrelated constructs.
Criterion Referenced Interpretation:
Interpret scores by pre-specified std.

% correct
Regression Equation
Expectancy Table
Content Validity:
Extent test samples domain of info/knowledge/skill it claims to.

Determined by expert judgment.
Important for achievement tests and job samples.

Cf. construct validity.
Classical Testing Theory:
Variability reflects combo of TRUE SCORE DIFFERENCES and EFFECTS of ERROR (measurement, etc)

Thus: Reliability = proportion of score variability due to true score differences (e.g., a reliability of .80 means 80% true score, 20% error).
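In symbols, the classical model can be written as below. This is the standard textbook form, assuming error is uncorrelated with the true score; the symbols (X for observed score, T for true score, E for error) are notation introduced here, not taken from the card.

```latex
X = T + E, \qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2, \qquad
r_{xx} = \frac{\sigma_T^2}{\sigma_X^2}
```

With a reliability of .80, 80% of the observed score variance is true score variance and the remaining 20% is error variance.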
For test item with discrimination index (D) of +1:

a. high achievers more correct than low
b. low more correct than high
c. low and high to be equally correct
d. low and high to be equally incorrect.
High get more (all) correct.
Internal Consistency on Dichotomous:
a. Spearman-Brown
b. Kappa Stat
c. KR-20
d. coefficient of concordance
KR-20: Best for dichotomous (t/f, right/wrong, etc)
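For reference, a minimal sketch of the KR-20 calculation on right/wrong (0/1) item scores. It assumes the population variance of total scores in the denominator (some texts use the sample variance instead); the function name and the small data set are hypothetical.

```python
from statistics import pvariance


def kr20(item_matrix):
    """KR-20 for dichotomous items.

    item_matrix: one row per examinee, one 0/1 entry per item.
    """
    n_people = len(item_matrix)
    n_items = len(item_matrix[0])
    # Variance of examinees' total scores (population form, an assumption).
    total_variance = pvariance([sum(row) for row in item_matrix])
    # Sum over items of p * q, where p = proportion correct and q = 1 - p.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_matrix) / n_people
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_variance)


# Four hypothetical examinees, three items.
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(data))
```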
To make coefficient alpha stronger:
a. add more similar items
b. add more heterogeneous items
c. use true/false
d. all of the above
Add more similar items to increase coefficient alpha. E.g., asking more questions about US history gives a more reliable measure of that knowledge than a ten-question test.
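The effect of adding comparable items can be estimated with the Spearman-Brown prophecy formula. The sketch below assumes the added items are parallel to the existing ones; the function name and example values are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length.

    length_factor: new length / old length (2 means doubling the test).
    """
    k = length_factor
    return (k * reliability) / (1 + (k - 1) * reliability)


# Doubling a test whose reliability is .60.
print(spearman_brown(0.6, 2))
```

With a length factor of 2, the same formula is the split-half correction.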
Multitrait-Multimethod Matrix: For validity you want:
a. MM low and HH high
b. MH high, HM low
c. MM high, HM low
d. MM High, HH low
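MH High, HM Low (monotrait-heteromethod high = convergent validity; heterotrait-monomethod low = discriminant validity)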
Preventing Criterion Contamination:
a. keep raters independent from each other
b. make sure they don't have the predictor scores
c. make them aware of possible biases through training
Make sure they don't have the predictor scores.
Oblique Rotation for FA if:
a. assessing construct validity of single trait test
b. the researcher believes the constructs in the analysis are correlated
c. determining factorial validity
d. determining reliability
Oblique: looking for correlations between constructs measured by the test.
Orthogonal: uncorrelated, independent

Oblique: Correlated
Shared variability
Calculated by squaring the factor loading.
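As a small worked example, squaring a loading gives the proportion of variance a test shares with a factor. The second helper sums squared loadings across factors, which assumes the factors are orthogonal (uncorrelated). Both function names and the loadings are hypothetical.

```python
def shared_variance(loading):
    """Proportion of a test's variance shared with one factor."""
    return loading ** 2


def communality(loadings):
    """Variance explained by all factors together (orthogonal factors only)."""
    return sum(loading ** 2 for loading in loadings)


# A loading of .70 means about 49% shared variance with that factor.
print(shared_variance(0.7))
```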