24 Cards in this Set

  • Front
  • Back
  • 3rd side (hint)
Reliability Coefficient
Proportion of obtained-score variance that reflects true ability
-Interpret directly (0.70 means 70% is true ability, 30% is error)
-A good test should have a coefficient of at least 0.70
A measure of how much score is actual.....
Classical Test Theory
Results are:
1. True Score (Ability)
-true variance
2. Some Error (Fatigue...)
-error variance
Two parts...
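As a rough illustration of the two parts above, the sketch below splits observed scores into a true-score part and an error part and then computes reliability as true variance divided by observed variance. The four true scores and errors are made-up numbers, chosen so that error is uncorrelated with true score; they are not from any real test.

```python
from statistics import pvariance

# Hypothetical data for four test takers (illustrative only).
true_scores = [40, 40, 60, 60]
errors = [-5, 5, -5, 5]  # chosen to be uncorrelated with the true scores

# Classical test theory: observed score = true score + error.
observed = [t + e for t, e in zip(true_scores, errors)]

true_var = pvariance(true_scores)    # true variance
error_var = pvariance(errors)        # error variance
observed_var = pvariance(observed)   # total (observed) variance

# Reliability coefficient = true variance / observed variance.
reliability = true_var / observed_var
print(true_var, error_var, observed_var, reliability)
```

With uncorrelated error, the observed variance is simply the sum of the true and error variances.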
Reliability
-Establish reliability first (Test can be reliable but not valid.)
-Consistency
Do you get the same results each time?
Validity
-Accuracy
Does test measure what it is supposed to?
Validity cannot exceed...
the square root of reliability
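A minimal sketch of that ceiling, assuming reliability is expressed as a coefficient between 0 and 1. The function name `max_validity` is made up for this example.

```python
import math

def max_validity(reliability: float) -> float:
    """Upper bound on a validity coefficient for a given reliability coefficient."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return math.sqrt(reliability)

# Example: a test with reliability 0.81 cannot have validity above about 0.9.
print(max_validity(0.81))
```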
Types of Reliability
1. Test-retest reliability (Coefficient of stability)
2. Alternate Forms (Considered the best but least used)
3. Internal Consistency (Compares test against itself)
T-A-I
Types of Internal Consistency Reliability
1. Split-Half (split the test in two and correlate the halves; problem is that each half is shorter than the full test, which lowers the estimate)
-can use Spearman-Brown Prophecy Formula to estimate reliability at full test length
2. Inter-Item Consistency (compare items on one test against each other in a systematic way)
-can use Cronbach's Alpha (compare items on test individually against all others systematically) or Kuder-Richardson Formula 20 (special case of Cronbach's Alpha; use when you have dichotomous test items such as true/false or yes/no)
S-I
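The formulas named on this card can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the function names and the tiny 0/1 data set are made up, and population variances are used throughout. For dichotomous (0/1) items, this alpha calculation gives the same number as Kuder-Richardson Formula 20.

```python
from statistics import pvariance

def spearman_brown(r_half, length_factor=2.0):
    """Project reliability to a test `length_factor` times as long.

    With the default factor of 2, this steps a split-half correlation
    up to an estimate for the full-length test.
    """
    return length_factor * r_half / (1 + (length_factor - 1) * r_half)

def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item scores per test taker."""
    k = len(rows[0])  # number of items
    if k < 2:
        raise ValueError("need at least two items")
    item_vars = [pvariance([row[i] for row in rows]) for i in range(k)]
    total_var = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical yes/no (1/0) answers: 4 test takers, 3 items.
answers = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(spearman_brown(0.6))      # split-half r of 0.6 stepped up to full length
print(cronbach_alpha(answers))  # equals KR-20 here because items are 0/1
```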
Kappa Coefficient
Measure of inter-rater reliability
IRR
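The card does not say which kappa is meant. As one common case, the sketch below assumes Cohen's kappa for two raters assigning categories: observed agreement corrected for the agreement expected by chance. The ratings are invented for illustration.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each rated the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must rate the same non-empty set of cases")
    n = len(rater_a)
    # Proportion of cases where the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of four cases.
rater_1 = ["yes", "yes", "no", "no"]
rater_2 = ["yes", "no", "no", "no"]
print(cohens_kappa(rater_1, rater_2))
```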
Standard Error of Measurement
-Based on reliability coefficient
-Try to get an idea of what a person's true ability is
-Based on a person's single score but has properties of a normal curve
-the more reliable the test, the smaller the standard error of measurement
based on another coefficient; testing issue
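A minimal sketch of the usual textbook formula, SEM = SD x sqrt(1 - reliability), plus a band around a single obtained score. The SD of 15, the reliability of 0.91, and the function names are assumptions for illustration only.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM from the test's standard deviation and reliability coefficient."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)

def score_band(observed_score, sem, z=1.0):
    """Band around one obtained score; z=1.0 is roughly a 68% band on a normal curve."""
    return (observed_score - z * sem, observed_score + z * sem)

sem = standard_error_of_measurement(sd=15, reliability=0.91)
print(sem)                     # about 4.5
print(score_band(100, 4.5))    # band of one SEM on each side of a score of 100
```

Raising reliability toward 1.0 drives the SEM toward 0, which matches the last bullet on the card.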
Standard Error of Mean
-How will sample represent population?
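For contrast with the standard error of measurement, the standard error of the mean is usually computed as SD divided by the square root of the sample size. A short sketch with made-up numbers:

```python
import math

def standard_error_of_mean(sd, n):
    """How far a sample mean is expected to fall from the population mean."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    return sd / math.sqrt(n)

# Hypothetical sample: SD of 15 with 25 people.
print(standard_error_of_mean(15, 25))
```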
It is best to have ___________ items and _____________ test takers for a test to be most reliable.
Homogeneous
Heterogeneous
Content Validity
Based on expert judgement
-academic tests
Ex: the EPPP
Criterion-Related Validity
Outcome
-look at relationship between predictor and outcome
-used most often in personnel psych (predicting job performance, etc)
-two types are predictive validity (who will become schizophrenic?, predicts future behavior) and concurrent validity (who is schizophrenic now?, test results NOW)
OUTCOME
-two types
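The relationship between predictor and outcome is commonly summarized as a correlation (the validity coefficient). The sketch below assumes a Pearson correlation and uses invented predictor and criterion scores.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between predictor scores x and criterion scores y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: selection test scores and later job-performance ratings.
test_scores = [1, 2, 3, 4, 5]
performance = [2, 4, 5, 4, 5]
print(pearson_r(test_scores, performance))
```

For predictive validity the criterion is collected later; for concurrent validity it is collected at the same time as the test. The calculation is the same.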
Construct Validity
Construct cannot be directly defined
-Two types are convergent (compare new test with an established test that measures the same construct) and divergent (also called discriminant validity: you want your test to have nothing in common with a test of a different construct)
Can not define
-two types
Multitrait-Multimethod Matrix
If it's a single trait measured by different methods (monotrait-heteromethod), need a HIGH coefficient to establish convergent validity
-If it's different traits (heterotrait), need a LOW coefficient to establish divergent validity
single or heterogeneous traits/trait numbers/convergent or divergent validity
Face Validity
Does the test make sense to the people who are taking it?
Just what it says :)
Cross Validation
Re-validate the test instrument on a new sample
-Shrinkage may occur (the validity coefficient tends to shrink slightly when you first cross-validate an instrument)
Criterion-related Validity
OUTCOME
-job performance in the future
OUTCOME
Incremental Validity
Can we increase the number of correct decisions we are already making?
Three things to establish Incremental Validity
1. base rate - moderate (number of decisions you are already making correctly)
2. selection ratio - need low selection ratio (number of jobs available to number of applicants)
3. validity coefficient - high validity on predictor and criterion
B-S-V
Criterion-Referenced Scores
-Do not compare score to anyone else, just meeting a standard
Ex: Driver's license test
Norm-Referenced Scores
-Score is compared to other individuals
-Two types: percentile ranks (not used as much now) and standard scores (transformed scores that allow you to compare)
Exs: Grade Equivalents, z-scores
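Both score types on this card can be sketched in a few lines. The norm group below is invented. Using the population standard deviation and counting ties as half are assumptions, since conventions vary.

```python
from statistics import mean, pstdev

def z_score(x, scores):
    """Standard score: distance from the norm-group mean in SD units."""
    return (x - mean(scores)) / pstdev(scores)

def percentile_rank(x, scores):
    """Percent of the norm group scoring below x, counting ties as half."""
    below = sum(s < x for s in scores)
    equal = sum(s == x for s in scores)
    return 100.0 * (below + 0.5 * equal) / len(scores)

# Hypothetical norm group.
norm_group = [50, 60, 70, 80, 90]
print(z_score(90, norm_group))
print(percentile_rank(70, norm_group))
```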
Floor Effect
-bunch of test takers at bottom of test range
-need to have enough easy items
Ceiling Effect
-need to have enough difficult items to discriminate between the best test takers