

### 24 Cards in this Set

| Front | Back | 3rd side (hint) |
| --- | --- | --- |
| Reliability Coefficient | Measure of how much of the obtained score is true ability. Interpret directly (0.70 means 70% is true ability, 30% error). A good test should have a coefficient of at least 0.7. | A measure of how much score is actual..... |
| Classical Test Theory | Results are: 1. True score (ability): true variance. 2. Some error (fatigue...): error variance. | Two parts... |
| Reliability | Consistency. Establish reliability first (a test can be reliable but not valid). | Do you get the same results each time? |
| Validity | Accuracy. | Does the test measure what it is supposed to? |
| Validity cannot exceed... | The square root of reliability. | |
| Types of Reliability | 1. Test-retest reliability (coefficient of stability). 2. Alternate forms (considered the best but least used). 3. Internal consistency (compares the test against itself). | T-A-I |
| Types of Internal Consistency Reliability | 1. Split-half (split the test; the problem is restricted range). Can use the Spearman-Brown prophecy formula to make it like 2 tests. 2. Inter-item consistency (compare items on one test against each other in a systematic way). Can use Cronbach's alpha (compare each item individually against all the others) or Kuder-Richardson Formula 20 (a special version of Cronbach's alpha, used when the items are dichotomous: true/false or yes/no). | S-I |
| Kappa Coefficient | Inter-rater reliability. | IRR |
| Standard Error of Measurement | Based on the reliability coefficient. Try to get an idea of what a person's true ability is. Based on a person's single score but has the properties of a normal curve. The more reliable the test, the smaller the standard error of measurement. | Based on another coefficient, testing issue |
| Standard Error of the Mean | How well will the sample represent the population? | |
| It is best to have ___________ items and _____________ test takers for a test to be most reliable. | Homogeneous (items); heterogeneous (test takers). | |
| Content Validity | Based on expert judgment. Used for academic tests. Ex: the EPPP. | |
| Criterion-Related Validity | Outcome. Look at the relationship between predictor and outcome. Used most often in personnel psychology (predicting job performance, etc.). Two types: predictive validity (who will become schizophrenic? predicts future behavior) and concurrent validity (who is schizophrenic now? test results NOW). | OUTCOME, two types |
| Construct Validity | Cannot be directly defined. Two types: convergent (compare the new test with an established test that measures the same construct) and divergent (discriminant validity: you want your test to have nothing in common with a test of a different construct). | Cannot define, two types |
| Multitrait-Multimethod Matrix | If it's a single trait, a HIGH monotrait coefficient is needed to establish convergent validity. If the traits are heterogeneous, a low heterotrait coefficient is needed to establish divergent validity. | Single or heterogeneous traits / trait numbers / convergent or divergent validity |
| Face Validity | Does the test make sense to the people who are taking it? | Just what it says :) |
| Cross Validation | Give the test instrument again and again. Shrinkage may occur (the range of scores shrinks slightly when you initially cross-validate an instrument). | |
| Criterion-Related Validity | OUTCOME. Job performance in the future. | OUTCOME |
| Incremental Validity | Can we increase the number of correct decisions we are already making? | |
| Three things to establish Incremental Validity | 1. Base rate: moderate (number of decisions you are already making correctly). 2. Selection ratio: low (number of jobs available relative to number of applicants). 3. Validity coefficient: high validity between predictor and criterion. | B-S-V |
| Criterion-Referenced Scores | Do not compare the score to anyone else; just meeting a standard. Ex: driver's license test. | |
| Norm-Referenced Scores | Score is compared to other individuals. Two types: percentile ranks (not used as much now) and standard scores (transformed scores that allow you to compare). Exs: grade equivalents, z-scores. | |
| Floor Effect | A bunch of test takers at the bottom of the test range. Need to have enough easy items. | |
| Ceiling Effect | Need to have enough difficult items to discriminate between the best test takers. | |
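
Several cards above name formulas without writing them out. The sketch below is a minimal illustration in Python of four of them, using the standard textbook forms rather than anything stated on the cards: the Spearman-Brown prophecy formula, Cronbach's alpha (which reduces to KR-20 when items are scored 0/1), the standard error of measurement, and the ceiling that reliability places on validity. The function names and the sample numbers are hypothetical and chosen only for illustration.

```python
import math
from statistics import pvariance


def spearman_brown(r_half, factor=2.0):
    """Projected reliability when test length is multiplied by `factor`.

    With factor=2 this steps a split-half correlation up to full length.
    """
    return factor * r_half / (1 + (factor - 1) * r_half)


def cronbach_alpha(item_scores):
    """Cronbach's alpha for rows of item scores (one row per test taker).

    For dichotomous 0/1 items this gives the same value as KR-20.
    """
    k = len(item_scores[0])  # number of items
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability); shrinks as reliability rises."""
    return sd * math.sqrt(1 - reliability)


def max_validity(reliability):
    """Upper bound on a validity coefficient: sqrt(reliability)."""
    return math.sqrt(reliability)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(round(spearman_brown(0.60), 3))
    print(round(standard_error_of_measurement(15, 0.91), 2))
    print(round(max_validity(0.81), 2))
    print(round(cronbach_alpha([[1, 1], [0, 0], [1, 1], [0, 0]]), 2))
```

Population variance is used for both the item variances and the total-score variance; alpha depends only on their ratio, so using sample variance for both would give the same result.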