18 Cards in this Set

  • Front
  • Back
reliability
consistency; a test is reliable to the degree that it provides repeatable, consistent results.
Types: test-retest, alternate forms, internal consistency
validity
a test is valid to the degree that it measures what it purports to measure.
tests of maximum performance vs. typical performance
tell about examinee's best possible performance (achievement) vs. what examinee usually does/feels (personality).
power vs. mastery test
power test assesses the level of difficulty a person can attain (no time limit) vs. mastery test, which assesses whether a person attains a pre-established level of acceptable performance (pass/fail; often used for basic skills tests)
ipsative vs. normative measures
ipsative measure uses the self as the frame of reference, while normative measures provide the strength of each attribute compared to others
true score vs. measurement error
-classical test theory
-test reliable to degree that score reflects true score rather than error
-always some degree of error, no test perfectly reliable
reliability coefficient
-method of estimating test's reliability
-range from 0.0 to +1.0
-.90 means 90% of the variability in test scores is due to true score differences among examinees and 10% represents measurement error
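
As a brief sketch of where that interpretation comes from: in classical test theory an observed score is modeled as a true score plus error, and the reliability coefficient is the share of observed-score variance that is true-score variance. The symbols below (X, T, E, r_xx) are the conventional ones and do not appear on the cards themselves.

```latex
X = T + E, \qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2, \qquad
r_{xx} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}
```

Under this model, a coefficient of .90 corresponds to a true-score share of 90% and an error share of 10%.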
test-retest reliability
a.k.a. coefficient of stability
-affected by time factors, which are its source of error
-not typically recommended for reliability testing
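
A minimal sketch of how a coefficient of stability could be computed, assuming it is taken as the Pearson correlation between scores from two administrations of the same test. The helper name `pearson_r` and the scores are hypothetical, made up for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for five examinees tested on two occasions
time_1 = [10, 12, 15, 18, 20]
time_2 = [11, 13, 14, 19, 21]

stability = pearson_r(time_1, time_2)
print(round(stability, 2))
```

The same calculation applies to alternate forms reliability if the second list holds scores on the equivalent form instead of a retest.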
alternate forms reliability
a.k.a. coefficient of equivalence
-administering two equivalent forms of the test to the same group of examinees, then correlating the two sets of scores
-considered the best one to use, although the coefficient tends to be lower than test-retest (due to content differences & time passage)
internal consistency reliability
a.k.a. coefficient of internal consistency
-obtaining correlations among individual items
-methods: split-half, Cronbach's alpha, Kuder-Richardson
split-half reliability
dividing the test in two and correlating the halves as if they were two shorter tests
-Spearman-Brown formula corrects for the lower coefficient that results from halving the test length (shorter tests are less reliable)
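
The split-half procedure might be sketched as follows, assuming an odd/even item split and the two-half form of the Spearman-Brown correction, 2r / (1 + r). The function names and the item data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per examinee.

    Returns (half-test correlation, Spearman-Brown corrected estimate).
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    corrected = 2 * r_half / (1 + r_half)
    return r_half, corrected

# Hypothetical data: five examinees, six items scored 0/1
scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
]
r_half, corrected = split_half_reliability(scores)
print(round(r_half, 2), round(corrected, 2))
```

The corrected value estimates the reliability of the full-length test, so it is higher than the raw correlation between the halves whenever that correlation is positive.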
Cronbach's coefficient alpha vs. Kuder-Richardson
-both recommended over split-half
-indicate average degree of inter-item consistency
-Cronbach's used for tests with multiple-point scored items (e.g., Likert choices)
-Kuder-Richardson when test items are dichotomously scored (yes/no, T/F)
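
A rough sketch of both coefficients, assuming the textbook formulas for coefficient alpha and KR-20 (the cards name only "Kuder-Richardson"; KR-20 is assumed here). Function names and data are hypothetical. Population variances are used throughout, which is why the two functions agree on 0/1 data.

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Coefficient alpha; item_scores holds one list of item scores per examinee."""
    k = len(item_scores[0])
    item_var_sum = sum(pvariance(col) for col in zip(*item_scores))
    total_var = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)

def kr20(item_scores):
    """KR-20 for dichotomous (0/1) items, where p is the proportion passing each item."""
    n = len(item_scores)
    k = len(item_scores[0])
    p_values = [sum(col) / n for col in zip(*item_scores)]
    pq_sum = sum(p * (1 - p) for p in p_values)
    total_var = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - pq_sum / total_var)

# Hypothetical data: five examinees, six items scored 0/1
binary_scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
]
alpha = cronbach_alpha(binary_scores)
kr = kr20(binary_scores)
print(round(alpha, 3), round(kr, 3))
```

For a 0/1 item the variance equals p(1 - p), so KR-20 can be read as the special case of alpha for dichotomous scoring; `cronbach_alpha` also accepts multi-point item scores.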
major source of measurement error for internal consistency reliability coefficients?
-content sampling or item heterogeneity: the degree to which items differ in terms of content sampled
measures of internal consistency good for assessing what and bad for assessing what?
good: unstable traits
bad: speed tests (coefficient is inflated); use test-retest/alternate forms instead
interscorer (inter-rater) reliability
-calculating the correlation coefficient between the scores of two different raters
-kappa coefficient: measure of agreement between two judges who rate a set of objects using nominal scales
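
As an illustration of the kappa coefficient, here is a minimal sketch assuming the two-rater (Cohen's) form, kappa = (observed agreement - chance agreement) / (1 - chance agreement). The function name and the ratings are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters using nominal categories."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance, from each rater's category proportions
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of ten objects by two judges
judge_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
judge_2 = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]

kappa = cohens_kappa(judge_1, judge_2)
print(round(kappa, 2))
```

The judges agree on 8 of 10 objects, but kappa is lower than .80 because part of that agreement would be expected by chance alone.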
standard error of measurement
-how much error an individual test score can be expected to have
-used to construct confidence intervals
confidence intervals
-68% probability true score falls within ±1 standard error of measurement
-95%: within ±2 standard errors of measurement
-99%: within ±3 standard errors of measurement
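
A small sketch of how such intervals could be built, assuming the usual formula SEM = SD × sqrt(1 - reliability), which is not stated on the cards. The function names and the numbers (SD of 15, reliability of .91, observed score of 100) are hypothetical.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM from the test's standard deviation and its reliability coefficient."""
    return sd * sqrt(1 - reliability)

def confidence_interval(observed_score, sem, n_sems=1):
    """Band of n_sems standard errors on either side of the observed score."""
    return observed_score - n_sems * sem, observed_score + n_sems * sem

sem = standard_error_of_measurement(sd=15, reliability=0.91)
low, high = confidence_interval(100, sem, n_sems=2)
print(round(sem, 2), round(low, 1), round(high, 1))
```

With these values the SEM is about 4.5, so a band of two standard errors around an observed score of 100 runs from roughly 91 to 109.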
factors affecting reliability
-length of test (longer test, higher reliability)
-the more homogeneous the group taking the test, the lower the reliability
-floor/ceiling effects decrease reliability
-T/F < multiple choice: the easier it is to guess correctly, the lower the reliability
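
The effect of test length can be illustrated with the general Spearman-Brown prophecy formula, r_new = k·r / (1 + (k - 1)·r), where k is the factor by which the test is lengthened. This is a sketch that assumes the added items are comparable to the existing ones; the function name and starting reliability are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length by length_factor."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)

doubled = spearman_brown(0.70, 2)    # test made twice as long
halved = spearman_brown(0.70, 0.5)   # test cut to half its length
print(round(doubled, 2), round(halved, 2))
```

Starting from a reliability of .70, doubling the test raises the predicted coefficient to about .82, while halving it lowers the coefficient to about .54.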