
### 21 Cards in this Set

• Front: Kappa Statistic, think of... / Back: Inter-rater reliability.
• Front: Item Difficulty / Back: # who got the answer correct divided by # who answered the question; .50 is ideal (.75 for true/false items).
• Front: "D", Item Discrimination / Back: Ranges from -1.0 to +1.0; the item's ability to discriminate between high and low scorers. Items the high group gets correct approach +1.0; items the low group gets correct approach -1.0.
• Front: True +, False +, True -, False - / Back: T+ = high predictor, high criterion; F+ = high predictor, low criterion; T- = low predictor, low criterion; F- = low predictor, high criterion. Criterion: true/false; predictor: positive/negative.
• Front: Examples of norm-referenced scores / Back: Percentile; z-score (mean 0, SD 1); T-score (mean 50, SD 10).
• Front: Relevance / Back: Extent to which test items contribute to the goals of testing.
• Front: Reliability, ways to get it / Back: AKA consistency: test-retest; alternate forms (e.g., the A & B forms of the WJ-III); split-half (Spearman-Brown); coefficient alpha (KR-20); inter-rater (kappa statistic).
• Front: Standard error of measurement / Back: For measurement error; a confidence interval around the test score.
• Front: Standard error of estimate / Back: Error in predicting the criterion from the predictor; a confidence interval around the criterion.
• Front: 2 types of construct validity / Back: Construct validity is the extent to which a test measures the hypothetical construct (e.g., intelligence). Convergent: correlates with other measures of the same construct. Discriminant: does not correlate with measures of different constructs.
• Front: Criterion-referenced interpretation, examples / Back: Interpret scores against a pre-specified standard: % correct; regression equation; expectancy table.
• Front: Content validity / Back: Extent to which the test samples the domain of information/knowledge/skill it claims to; determined by expert judgment; important for achievement and job-sample tests. Cf. construct validity.
• Front: Classical test theory / Back: Score variability reflects a combination of TRUE SCORE DIFFERENCES and the EFFECTS of ERROR (measurement error, etc.). Thus reliability = the proportion of true-score variance (e.g., .80 reliability means 80% true score, 20% error).
• Front: For a test item with a discrimination index (D) of +1, expect: a. high achievers more correct than low; b. low more correct than high; c. low and high equally correct; d. low and high equally incorrect / Back: a; the high group gets more (all) of them correct.
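Three of the formulas on these cards (item difficulty, the discrimination index D, and the standard error of measurement with its confidence interval) can be sketched in a few lines of Python; the response data and the score/SD/reliability values below are hypothetical.

```python
# Minimal sketch of three card formulas; all input data are made up.

def item_difficulty(responses):
    """p = # who got the answer correct / # who answered (1 = correct, 0 = wrong)."""
    return sum(responses) / len(responses)

def discrimination_index(high_group, low_group):
    """D = p(high group) - p(low group); ranges from -1.0 to +1.0."""
    return item_difficulty(high_group) - item_difficulty(low_group)

def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * (1 - reliability) ** 0.5

# One hypothetical item answered by 4 high achievers and 4 low achievers:
high = [1, 1, 1, 1]   # all high achievers correct
low  = [0, 0, 0, 0]   # all low achievers wrong
print(item_difficulty(high + low))       # 0.5  (the "ideal" difficulty)
print(discrimination_index(high, low))   # 1.0  (maximal discrimination)

# 95% CI around an observed score of 100 (SD 15, reliability .80):
e = sem(15, 0.80)
print(round(100 - 1.96 * e, 1), round(100 + 1.96 * e, 1))  # 86.9 113.1
```

The CI illustrates the card's point that the standard error of measurement is placed around the observed test score, whereas the standard error of estimate would be placed around a predicted criterion score.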
• Front: Internal consistency for dichotomous items: a. Spearman-Brown; b. kappa statistic; c. KR-20; d. coefficient of concordance / Back: c; KR-20 is best for dichotomous items (true/false, right/wrong, etc.).
• Front: To make coefficient alpha stronger: a. add more similar items; b. add more heterogeneous items; c. use true/false; d. all of the above / Back: a; add more similar items to increase coefficient alpha. E.g., ask more questions about US history and you get a more accurate test of that knowledge than from a ten-question test.
• Front: Multitrait-multimethod matrix; for validity you want: a. MM low and HH high; b. MH high, HM low; c. MM high, HM low; d. MM high, HH low / Back: b; monotrait-heteromethod correlations high, heterotrait-monomethod correlations low.
• Front: Preventing criterion contamination: a. keep raters independent from each other; b. make sure they don't have the predictor scores; c. make them aware of possible biases through training / Back: b; make sure raters don't have the predictor scores.
• Front: Oblique rotation in factor analysis is appropriate if: a. assessing the construct validity of a single-trait test; b. the constructs in the analysis are believed to be correlated; c. determining factorial validity; d. determining reliability / Back: b; oblique rotation looks for correlations between the constructs measured by the test.
• Front: Orthogonal vs. oblique rotation / Back: Orthogonal: factors uncorrelated, independent. Oblique: factors correlated.
• Front: Shared variability / Back: Calculated by squaring the factor loading.
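The KR-20 coefficient named on these cards can likewise be sketched directly from its formula, KR-20 = (k / (k - 1)) * (1 - sum(p*q) / variance of total scores); the 5-examinee, 4-item response matrix below is made up for illustration.

```python
# Minimal KR-20 sketch for dichotomous (0/1) items; data are hypothetical.

def kr20(item_matrix):
    """item_matrix: rows = examinees, columns = items scored 0/1."""
    n = len(item_matrix)        # number of examinees
    k = len(item_matrix[0])     # number of items
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n
    # Population variance of the total scores:
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    # Sum of p*q across items, where p = item difficulty, q = 1 - p:
    pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var_total)

# Five examinees, four true/false items:
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(data), 3))  # 0.8
```

Note how the "add more similar items" card falls out of the formula: increasing k raises k / (k - 1)'s companion term's weight on shared variance, so adding parallel items pushes the coefficient up, exactly as Spearman-Brown predicts for lengthening a test.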