Item analysis: Item Difficulty

p = (# of examinees passing the item) / (total # of examinees)
- p ranges from 0 to 1.0; larger values indicate easier items (p = 0 --> no one answered correctly; p = 1.0 --> everyone answered correctly)
- p = .5 is optimal, except for true/false tests, where p should be about .75

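A minimal sketch of the item difficulty calculation, assuming a hypothetical list of 0/1 item responses (1 = correct, 0 = incorrect):

```python
def item_difficulty(responses):
    """p = number of examinees passing the item / total number of examinees."""
    return sum(responses) / len(responses)

# hypothetical responses from 10 examinees; 7 answered the item correctly
responses = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
print(item_difficulty(responses))  # 0.7 -> a relatively easy item
```
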
Reliability Coefficient

- provides an estimate of true score variability (dependability)
- ranges from 0.0 (non-existent) to +1.0 (perfect)
- symbolized r_xx (r with an xx subscript)

Methods for Assessing Reliability: Test-Retest Reliability

- administer the same test to the same group on two occasions and correlate the two sets of scores
- appropriate for measuring attributes that are relatively stable over time (like aptitude, as opposed to mood)

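A minimal sketch of a test-retest reliability estimate, assuming hypothetical scores from two administrations of the same test to the same examinees:

```python
import numpy as np

time1 = np.array([88, 92, 75, 60, 95, 70, 82, 78])  # first administration
time2 = np.array([85, 94, 72, 65, 93, 68, 84, 80])  # same examinees, second administration

# test-retest reliability (coefficient of stability) = Pearson correlation of the two sets of scores
r_xx = np.corrcoef(time1, time2)[0, 1]
print(round(r_xx, 2))
```
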
Methods for Assessing Reliability: Alternate (Equivalent, Parallel) Forms Reliability (Coefficient of Equivalence)

- two equivalent forms of the test are administered to the same group, and the two sets of scores are correlated
- it's the most thorough method for estimating reliability

Methods for Assessing Reliability: Internal Consistency Reliability

(Not appropriate for speed tests)
- Split-Half: split the test in half and correlate the two halves; because reliability tends to decrease as the length of a test decreases, the Spearman-Brown prophecy formula is used to estimate the reliability of the full-length test
- Coefficient Alpha: administer the test once to a single group; it's the average reliability that would be obtained from all possible splits of the test; can be conservative; a variation of Coefficient Alpha known as the KR-20 can be used when items are scored dichotomously
- Inter-rater (inter-scorer, inter-observer) reliability: used when scoring depends on a rater's judgment; can use the kappa statistic or a correlation coefficient

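A minimal sketch of Coefficient Alpha (Cronbach's alpha), assuming a hypothetical matrix of dichotomously scored items (rows = examinees, columns = items):

```python
import numpy as np

def cronbach_alpha(scores):
    """alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of examinees' total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

scores = [[1, 1, 1, 0],
          [1, 0, 1, 1],
          [0, 0, 1, 0],
          [1, 1, 1, 1],
          [0, 0, 0, 0]]
print(round(cronbach_alpha(scores), 2))
```
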
Factors that Affect the Reliability Coefficient

- Test length: the longer the test, the larger the test's reliability coefficient (see the Spearman-Brown sketch below)
- Range of test scores: when examinees are heterogeneous, the range of scores is maximized, which increases the reliability coefficient
- Probability of answering correctly by guessing: the easier it is to answer items correctly by guessing, the lower the reliability coefficient

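A minimal sketch of the Spearman-Brown prophecy formula, which estimates how the reliability coefficient changes when test length changes (the values below are hypothetical):

```python
def spearman_brown(r_original, length_factor):
    """Estimated reliability when the test's length is multiplied by length_factor."""
    return (length_factor * r_original) / (1 + (length_factor - 1) * r_original)

# correcting a split-half coefficient of .70 back to the full-length test (factor of 2)
print(round(spearman_brown(0.70, 2), 2))  # ~0.82
# doubling the length of a test whose reliability is .50
print(round(spearman_brown(0.50, 2), 2))  # ~0.67
```
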
Standard Error of Measurement vs. Standard Error of Estimate

SEM
- an index of the amount of error that can be expected in obtained scores due to the unreliability of the test
- used to construct a confidence interval around an examinee's obtained test score
- requires the test's standard deviation and reliability coefficient: SEM = SD x sqrt(1 - r_xx)

SEE
- an index of error when predicting criterion scores from predictor scores

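A minimal sketch of the SEM and a 95% confidence interval around an obtained score, assuming a hypothetical test with SD = 15 and reliability = .91:

```python
import math

def sem(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1 - reliability)

sd, r_xx, obtained_score = 15, 0.91, 110
error = sem(sd, r_xx)                      # 15 * sqrt(0.09) = 4.5
lower = obtained_score - 1.96 * error      # 95% CI lower bound
upper = obtained_score + 1.96 * error      # 95% CI upper bound
print(round(error, 1), round(lower, 1), round(upper, 1))  # 4.5 101.2 118.8
```
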
When is content validity (achievement-type tests) or construct validity (intelligence tests) important?

- content validity: when test scores provide information on how much each examinee knows about a content domain
- construct validity: when test scores indicate each examinee's status with regard to the trait being measured

When is criterion-related validity (e.g., the SAT) important?

- when test scores (X) will be used to predict scores on some other measure (Y), and it is the scores on Y that are of most interest

Convergent and Discriminant Validity

- correlate test scores with scores on measures that do and do not purport to assess the same trait
- high correlations with measures of the same trait provide evidence of the test's CONVERGENT validity
- low correlations with measures of unrelated characteristics provide evidence of the test's DISCRIMINANT (DIVERGENT) validity
- the correlations are organized in the form of a multitrait-multimethod matrix (see the correlation sketch below)

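A minimal sketch of the correlational logic, assuming a hypothetical new anxiety test, an established anxiety measure (same trait), and a reading-speed measure (unrelated trait):

```python
import numpy as np

new_anxiety = np.array([12, 20, 15, 30, 25, 10, 18, 27])            # hypothetical new test
established_anxiety = np.array([14, 22, 13, 29, 24, 11, 20, 26])    # same trait, different measure
reading_speed = np.array([230, 250, 180, 210, 240, 200, 260, 220])  # unrelated characteristic

convergent = np.corrcoef(new_anxiety, established_anxiety)[0, 1]    # should be high
discriminant = np.corrcoef(new_anxiety, reading_speed)[0, 1]        # should be low
print(round(convergent, 2), round(discriminant, 2))
```
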
Monotrait-monomethod coefficients

- same trait, same method: these are the reliability coefficients (shown in parentheses in the matrix)

Monotrait-heteromethod coefficients

- same trait, different methods: when these coefficients are large, they provide evidence of convergent validity (shown in rectangles in the matrix)

Heterotrait-monomethod coefficients

- different traits, same method: when these coefficients are small, the test has discriminant validity (shown in ellipses in the matrix)

Heterotrait-heteromethod coefficients

- different traits, different methods: when these coefficients are small, they provide evidence of discriminant validity

Factor Analysis: Key Words: Communality

- the proportion of a test's variance that is accounted for by the factors identified in the analysis
- a test's reliability will always be at least as large as its communality

Factor Analysis: most important things to remember

- squaring a factor loading provides a measure of the variability the test shares with the factor
- when factors are orthogonal, a test's communality can be calculated by squaring and adding the test's factor loadings (see the sketch below)
- ORTHOGONAL = independent, uncorrelated factors
- OBLIQUE = dependent, correlated factors

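A minimal sketch of the communality calculation, assuming hypothetical orthogonal factor loadings of .60 and .50:

```python
def communality(loadings):
    """Sum of squared factor loadings (valid when the factors are orthogonal)."""
    return sum(loading ** 2 for loading in loadings)

# a test that loads .60 on Factor I and .50 on Factor II
h2 = communality([0.60, 0.50])
print(round(h2, 2))  # 0.61 -> the factors account for 61% of the test's variance
```
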
Criterion-Related Validity

- used for making predictions, e.g., predicting GPA from SAT scores
- the SAT is the predictor, while GPA is the criterion

Criterion-Related Validity: Concurrent vs. Predictive Validity

Concurrent:
- collect predictor data (SAT) and criterion data (GPA) at the same time
- gives a good, immediate estimate, e.g., when examinees are already students or already hired

Predictive:
- measure the criterion (GPA) some time after collecting the predictor data (SAT)
- used to actually predict future performance before hiring or accepting

If an exam question gives you the correlation coefficient for X and Y and asks how much variability in Y is explained by X...

- you need to square the correlation coefficient to obtain the correct answer

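A minimal worked example, assuming a hypothetical correlation of .60 between X and Y:

```python
r_xy = 0.60                 # hypothetical correlation between X and Y
print(round(r_xy ** 2, 2))  # 0.36 -> X explains 36% of the variability in Y
```
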
Incremental Validity

- the increase in correct decisions that can be expected if the predictor is used as a decision-making tool

Relationship between Reliability and Validity

- reliability --> consistency, the repeatability of your measurement
- validity --> accuracy, the strength of our conclusions, inferences, or propositions
- when a test has low reliability, it can't have a high degree of validity
- high reliability doesn't guarantee validity