18 Cards in this Set
- Front
- Back
reliability
|
consistency: a test is reliable to the degree that it provides repeatable, consistent results.
Types: test-retest, alternate forms, internal consistency |
|
validity
|
a test is valid to the degree it measures what it purports to measure.
|
|
tests of maximum performance vs. typical performance
|
maximum-performance tests show an examinee's best possible performance (e.g., achievement tests) vs. typical-performance tests show what the examinee usually does or feels (e.g., personality tests).
|
|
power vs. mastery test
|
power test assesses the level of difficulty a person can attain (no time limits) vs. mastery test assesses whether a person attains a pre-established level of acceptable performance (pass/fail; often used for basic skills tests)
|
|
ipsative vs. normative measures
|
ipsative measures use the self as the frame of reference, while normative measures give the strength of each attribute compared to other people
|
|
true score vs. measurement error
|
-classical test theory: observed score = true score + measurement error
-a test is reliable to the degree that scores reflect true score rather than error
-there is always some degree of error; no test is perfectly reliable |
|
reliability coefficient
|
-method of estimating a test's reliability
-ranges from 0.0 to +1.0
-e.g., a coefficient of .90 means 90% of the variability in test scores is due to true score differences among examinees and 10% is due to measurement error |
|
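The variance interpretation on the card above can be sketched in a few lines of Python. This is a minimal illustration under the classical test theory assumption that true score and error are uncorrelated, so observed variance is their sum; the function name and the variance values are hypothetical.

```python
def reliability_coefficient(true_variance: float, error_variance: float) -> float:
    """Proportion of observed-score variance that is true-score variance."""
    # Assumption: error is uncorrelated with true score, so variances add.
    observed_variance = true_variance + error_variance
    if observed_variance <= 0:
        raise ValueError("observed variance must be positive")
    return true_variance / observed_variance


# Hypothetical variances, chosen only to reproduce the .90 example.
r_xx = reliability_coefficient(true_variance=9.0, error_variance=1.0)
print(round(r_xx, 2))  # 0.9
```

With these made-up numbers, 90% of the score variance reflects true differences among examinees and 10% reflects measurement error.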
test-retest reliability
|
a.k.a. coefficient of stability
-affected by time factors, which are sources of error
-not typically recommended for reliability testing |
|
alternate forms reliability
|
a.k.a. coefficient of equivalence
-administering two equivalent forms of a test to the same group of examinees, then correlating the two sets of scores
-considered the best one to use, although the coefficient tends to be lower than test-retest (due to content differences & time passage)
-costly |
|
internal consistency reliability
|
a.k.a. coefficient of internal consistency
-obtaining correlations among individual items
-methods: split-half, Cronbach's alpha, Kuder-Richardson |
|
split-half reliability
|
dividing the test in two and correlating the halves as if they were two shorter tests
-Spearman-Brown formula corrects for the lower coefficient that results from halving the test length (shorter tests are less reliable) |
|
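The split-half procedure with the Spearman-Brown correction might look like the sketch below. It assumes an odd/even item split (one common choice; the card does not say how the test is divided), and the function names and item scores are made up for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(r, length_factor=2.0):
    """Predicted reliability when test length is multiplied by length_factor."""
    return (length_factor * r) / (1 + (length_factor - 1) * r)


def split_half_reliability(item_scores):
    """Correlate odd-item and even-item half scores, then correct for length."""
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, ...
    return spearman_brown(pearson_r(odd_half, even_half))


# Hypothetical data: 5 examinees (rows) by 4 items (columns), scored 0/1.
scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(split_half_reliability(scores), 2))
```

The corrected value is higher than the raw correlation between the halves, which is the point of the correction: each half is only half as long as the full test.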
Cronbach's coefficient alpha vs. Kuder Richardson
|
-both recommended over split-half
-indicate average degree of inter-item consistency
-Cronbach's alpha used for tests with multiple-point scored items (e.g., Likert choices)
-Kuder-Richardson used when test items are dichotomously scored (yes/no, T/F) |
|
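As a rough sketch of how the two coefficients relate, the code below computes Cronbach's alpha from item variances and KR-20 from item pass rates. It assumes KR-20 is the Kuder-Richardson formula meant here and uses population variances throughout; the function names and data are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Alpha for a matrix of examinees (rows) by items (columns)."""
    k = len(item_scores[0])
    item_variances = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def kr20(item_scores):
    """KR-20 for dichotomous (0/1) items; item variance is p * (1 - p)."""
    n = len(item_scores)
    k = len(item_scores[0])
    pass_rates = [sum(row[i] for row in item_scores) / n for i in range(k)]
    total_variance = pvariance([sum(row) for row in item_scores])
    pq_sum = sum(p * (1 - p) for p in pass_rates)
    return (k / (k - 1)) * (1 - pq_sum / total_variance)


# Hypothetical true/false data: 5 examinees by 4 items.
dichotomous = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(dichotomous), 2), round(kr20(dichotomous), 2))
```

On 0/1 data the two functions give the same value, since the variance of a dichotomous item equals p(1 - p); KR-20 can be read as alpha specialized to that case.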
major source of measurement error for internal consistency reliability coefficients?
|
-content sampling or item heterogeneity: the degree to which items differ in the content they sample
|
|
measures of internal consistency good for assessing what and bad for assessing what?
|
good: unstable traits
bad: speed tests (coefficient is spuriously inflated); use test-retest or alternate forms instead |
|
interscorer (inter-rater) reliability
|
-calculating the correlation coefficient between the scores of two different raters
-kappa coefficient: measure of agreement between two judges who rate a set of objects using nominal scales |
|
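A minimal sketch of the kappa coefficient for two raters follows, assuming Cohen's kappa is the variant the card refers to. It compares observed agreement with the agreement expected by chance from each rater's category frequencies; the function name and ratings are illustrative only.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters on nominal categories."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of objects")
    n = len(ratings_a)
    # Proportion of objects on which the raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category for everything
    return (observed - expected) / (1 - expected)


# Hypothetical nominal ratings of four objects by two judges.
judge_1 = ["a", "a", "b", "b"]
judge_2 = ["a", "b", "b", "b"]
print(cohens_kappa(judge_1, judge_2))  # 0.5
```

Here the judges agree on 75% of the objects, but 50% agreement would be expected by chance, so kappa is 0.5.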
standard error of measurement
|
-how much error an individual test score can be expected to have
-used to construct confidence intervals |
|
confidence intervals
|
-68% probability the true score falls within ±1 standard error of measurement of the obtained score
-95%: within ±2 standard errors of measurement
-99%: within ±3 standard errors of measurement |
|
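The two cards above can be tied together with a short sketch. It assumes the usual formula SEM = SD × sqrt(1 − reliability), which the cards do not state, and uses made-up values for the standard deviation, reliability, and obtained score.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM from the test's standard deviation and reliability coefficient."""
    return sd * sqrt(1 - reliability)


def confidence_interval(obtained_score, sem, n_sem=1):
    """Band of n_sem standard errors around the obtained score."""
    return (obtained_score - n_sem * sem, obtained_score + n_sem * sem)


# Hypothetical test: SD = 15, reliability = .91, obtained score = 100.
sem = standard_error_of_measurement(sd=15, reliability=0.91)
print(round(sem, 2))  # 4.5

for n_sem, level in [(1, "68%"), (2, "95%"), (3, "99%")]:
    low, high = confidence_interval(100, sem, n_sem)
    print(level, round(low, 1), round(high, 1))
```

With these illustrative numbers the 68% interval runs from about 95.5 to 104.5, and the intervals widen as the confidence level rises.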
factors affecting reliability
|
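-length of test: the longer the test, the higher the reliability
-group homogeneity: the more homogeneous the group taking the test, the lower the reliability (restricted range of scores)
-floor/ceiling effects decrease reliability
-item format: T/F is less reliable than multiple choice; the easier it is to guess correctly, the lower the reliability |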