21 Cards in this Set

  • Front
  • Back
Kappa Statistic - Think of...
Inter-Rater Reliability
Item Difficulty:
p = # who got the answer correct / # who answered the question

.50 is ideal; .75 is ideal for true/false items.
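A minimal sketch of that calculation (the response data below are made up; 1 = correct, 0 = incorrect):

```python
# Made-up item responses: 1 = correct, 0 = incorrect
responses = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]

# Item difficulty p = proportion of examinees who answered correctly
p = sum(responses) / len(responses)
print(f"item difficulty p = {p:.2f}")  # 0.70
```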
"D" Item Discrimination:
From -1.0 to 1.0, ability to discriminate between high and low.
High group gets them approach 1.0, low group gets them: approach -1.0
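A sketch of the usual upper-lower group computation, D = p(high) - p(low), with made-up groups:

```python
# Made-up responses (1 = correct) for the top and bottom scoring groups
high_group = [1, 1, 1, 1, 0]   # top scorers on the total test
low_group  = [0, 1, 0, 0, 0]   # bottom scorers

p_high = sum(high_group) / len(high_group)   # 0.80
p_low  = sum(low_group) / len(low_group)     # 0.20

# D = difference in proportion correct between the two groups
D = p_high - p_low
print(f"D = {D:.2f}")  # 0.60 -- positive: item favors high scorers
```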
True +
False +

True -
False -
T+ = Hi Predictor, Hi Criterion
F+ = Hi Predictor, Low Criterion

T- = Low Predictor, Low Criterion
F- = Low Predictor, Hi Criterion

True/False refers to the criterion outcome (was the prediction confirmed?);
Positive/Negative refers to the predictor decision.
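A small sketch that sorts hypothetical (predictor, criterion) score pairs into the four cells; the scores and cutoffs are assumptions for illustration:

```python
# Hypothetical (predictor, criterion) score pairs and cutoffs
cases = [(85, 90), (80, 40), (30, 35), (25, 88)]
PRED_CUT, CRIT_CUT = 50, 50   # assumed cut scores

for pred, crit in cases:
    positive = pred >= PRED_CUT    # predictor says "high"
    success = crit >= CRIT_CUT     # criterion actually high
    if positive and success:  label = "T+ (hi pred, hi crit)"
    elif positive:            label = "F+ (hi pred, low crit)"
    elif not success:         label = "T- (low pred, low crit)"
    else:                     label = "F- (low pred, hi crit)"
    print(pred, crit, label)
```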
Examples of Norm Referenced:
Percentile

Z-score (mean = 0, SD = 1)

T-score (mean = 50, SD = 10)
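A sketch of the standard score conversions (the raw score and norm-group statistics are made up):

```python
raw, mean, sd = 62, 50, 8    # made-up raw score and norm-group stats

z = (raw - mean) / sd        # z-score: mean 0, SD 1
t = 50 + 10 * z              # T-score: mean 50, SD 10
print(f"z = {z:.2f}, T = {t:.1f}")  # z = 1.50, T = 65.0
```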
Relevance:
Extent to which test items contribute to the goals of testing.
Reliability:

Ways to get it:
AKA Consistency:

Test-Retest
Alternate Forms (e.g., Forms A & B of the WJ-III)
Split-Half (Spearman-Brown)
Coefficient Alpha (KR-20)
Inter-Rater (Kappa statistic)
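As one illustration, a sketch of the split-half method with the Spearman-Brown correction (toy half-test scores; `statistics.correlation` requires Python 3.10+):

```python
from statistics import correlation

# Toy half-test scores: each examinee's total on odd vs. even items
odd_half  = [10, 14, 9, 16, 12, 8]
even_half = [11, 13, 10, 15, 12, 9]

r_half = correlation(odd_half, even_half)   # reliability of a half-length test

# Spearman-Brown: step the correlation up to full test length (k = 2)
r_full = (2 * r_half) / (1 + r_half)
print(f"half-test r = {r_half:.2f}, Spearman-Brown corrected = {r_full:.2f}")
```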
Std Error of Measurement:
Reflects measurement error:
CI around the obtained test score.
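The card doesn't spell out the formula; a minimal sketch assuming the standard one, SEM = SD * sqrt(1 - reliability), with made-up numbers:

```python
import math

sd, reliability, score = 15, 0.91, 110   # made-up SD, reliability, obtained score

# SEM = SD * sqrt(1 - reliability)  (standard CTT formula)
sem = sd * math.sqrt(1 - reliability)

# 95% confidence interval around the obtained score
lo, hi = score - 1.96 * sem, score + 1.96 * sem
print(f"SEM = {sem:.2f}, 95% CI = ({lo:.1f}, {hi:.1f})")
```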
Std Error of Estimate:
Error when predicting a criterion from a predictor.

CI around the predicted criterion score.
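Similarly, a sketch assuming the standard formula, SEE = SD(criterion) * sqrt(1 - r^2):

```python
import math

sd_y, r = 10, 0.60            # made-up criterion SD and predictor-criterion r
predicted_y = 75              # predicted criterion score from the regression line

# SEE = SD_y * sqrt(1 - r^2)  (standard formula)
see = sd_y * math.sqrt(1 - r**2)

# 95% confidence interval around the predicted criterion score
lo, hi = predicted_y - 1.96 * see, predicted_y + 1.96 * see
print(f"SEE = {see:.2f}, 95% CI = ({lo:.1f}, {hi:.1f})")
```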
2 Types of Construct Validity:
CV: Extent to which the test measures the hypothetical construct (e.g., intelligence).

Convergent: Correlates with other measures of the same construct.

Discriminant: Does NOT correlate with measures of different, unrelated constructs.
Criterion Referenced Interpretation:
Examples:
Interpret scores against a pre-specified standard.

% correct
Regression Equation
Expectancy Table
Content Validity:
Extent to which the test samples the domain of information/knowledge/skill it claims to measure.

Determined by expert judgment.
Important for achievement and job-sample tests.

Cf. construct validity.
Classical Test Theory:
Variability reflects a combination of TRUE SCORE DIFFERENCES and the EFFECTS of ERROR (measurement, etc.)

Thus: Reliability = the proportion of score variance due to true scores (e.g., reliability of .80 means 80% true score, 20% error).
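A tiny sketch of that decomposition (the variance components are assumed numbers):

```python
# Assumed variance components: observed = true + error (CTT)
true_var, error_var = 80.0, 20.0
observed_var = true_var + error_var

# Reliability = proportion of observed variance that is true-score variance
reliability = true_var / observed_var
print(f"reliability = {reliability:.2f}")  # 0.80
```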
For a test item with a discrimination index (D) of +1:

a. high achievers more correct than low
b. low more correct than high
c. low and high equally correct
d. low and high equally incorrect
a. The high group gets more (in fact, all) correct.
Internal Consistency on Dichotomous Items:
a. Spearman-Brown
b. Kappa statistic
c. KR-20
d. coefficient of concordance
c. KR-20: best for dichotomous items (true/false, right/wrong, etc.)
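A sketch of the KR-20 computation, (k/(k-1)) * (1 - sum(pq) / variance of total scores), on a made-up 5-examinee x 4-item matrix:

```python
# Rows = examinees, columns = items (1 = correct); data are made up
data = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
k, n = len(data[0]), len(data)

totals = [sum(row) for row in data]
mean_total = sum(totals) / n
var_total = sum((t - mean_total) ** 2 for t in totals) / n  # population variance

pq = 0.0
for j in range(k):
    p = sum(row[j] for row in data) / n   # proportion correct on item j
    pq += p * (1 - p)

kr20 = (k / (k - 1)) * (1 - pq / var_total)
print(f"KR-20 = {kr20:.2f}")
```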
To make coefficient alpha stronger:
a. add more similar items
b. add more heterogeneous items
c. use true/false
d. all of the above
a. Add more similar items to increase coefficient alpha, i.e., ask more questions about the same domain: more questions about US history gives a more accurate test of that knowledge than a ten-question test would.
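The effect of lengthening with similar items can be quantified with the Spearman-Brown prophecy formula (a standard result; the starting reliability here is made up):

```python
def spearman_brown(r: float, k: float) -> float:
    """Predicted reliability when a test is lengthened by a factor of k
    with items similar to the originals (Spearman-Brown prophecy)."""
    return (k * r) / (1 + (k - 1) * r)

r = 0.60                              # made-up reliability of a 10-item test
for k in (1, 2, 3):                   # 10, 20, 30 similar items
    print(f"{10 * k} items -> r = {spearman_brown(r, k):.2f}")
# 10 items -> r = 0.60; 20 items -> 0.75; 30 items -> 0.82
```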
Multitrait-Multimethod (MTMM) Matrix: For validity you want:
a. MM low and HH high
b. MH high, HM low
c. MM high, HM low
d. MM high, HH low
b. MH (monotrait-heteromethod) high shows convergent validity; HM (heterotrait-monomethod) low shows discriminant validity.
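A toy check of the two key MTMM correlations (the values are made up):

```python
# Made-up correlations from a 2-trait x 2-method MTMM matrix
mono_trait_hetero_method = 0.65   # same trait, different methods (validity diagonal)
hetero_trait_mono_method = 0.20   # different traits, same method

# Convergent: same-trait correlations should be high;
# discriminant: different-trait correlations should be low.
ok = mono_trait_hetero_method > hetero_trait_mono_method
print("pattern supports construct validity" if ok else "pattern is problematic")
```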
Preventing Criterion Contamination:
a. keep raters independent from each other
b. make sure they don't have the predictor scores
c. make them aware of possible biases through training
d.
b. Make sure the criterion raters don't have the predictor scores.
Oblique Rotation for FA if:
a. assessing the construct validity of a single-trait test
b. you believe the constructs in the analysis are correlated
c. determining factorial validity
d. determining reliability
b. Oblique: used when looking for correlations between the constructs measured by the test.
Orthogonal
vs
Oblique
Orthogonal: uncorrelated, independent

Oblique: correlated
Shared variability:
calculated by squaring the factor loading
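A one-line sketch of that calculation (the loading is a made-up value):

```python
loading = 0.70                 # made-up factor loading for one variable

# Shared variability: proportion of the variable's variance explained by the factor
shared_variance = loading ** 2
print(f"shared variance = {shared_variance:.2f}")  # 0.49
```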