41 Cards in this Set
 Front
 Back
Relevance

Refers to the extent to which test items contribute to achieving the stated goals of testing.
Factors that influence relevance: (1) Content appropriateness: does the item assess the domain it is designed to evaluate? (2) Taxonomic level: does the item reflect the appropriate cognitive level? (3) Extraneous abilities: does answering the item require abilities other than the one being measured?

Item Difficulty

p = number of examinees passing the item / total number of examinees. p ranges from 0 to 1.0. An average p of about .50 is optimal for most tests; about .75 for true/false tests (to offset guessing).
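A quick worked example with hypothetical numbers: if 40 of 50 examinees answer an item correctly,
\[ p = \frac{40}{50} = .80 \]
indicating a relatively easy item.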


Item Discrimination

The extent to which a test item discriminates between examinees who obtain low and high total test scores. D = U - L, where U and L are the proportions of the upper- and lower-scoring groups who pass the item. D ranges from -1.0 to +1.0. Most tests look for items with D of about .35 to .50 or higher.
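A hypothetical illustration: if .90 of the upper group and .40 of the lower group pass an item,
\[ D = U - L = .90 - .40 = .50 \]
which would count as a good discriminator by the rule of thumb above.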


Item Characteristic Curve (ICC)

In item response theory, a curve is constructed for each item that gives the probability of a correct response as a function of the examinee's ability level. Difficulty level, discrimination, and guessing are all reflected in the ICC.
An ICC provides up to three pieces of information about a test item: its difficulty (the position of the curve, left versus right); its ability to discriminate between high and low scorers (the slope of the curve); and the probability of answering the item correctly just by guessing (the Y-intercept).
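One standard way to formalize these three properties (an assumed illustration, not necessarily the model this card's source used) is the three-parameter logistic model, where b is difficulty (position), a is discrimination (slope), and c is the guessing parameter (the curve's lower asymptote):
\[ P(\theta) = c + \frac{1 - c}{1 + e^{-a(\theta - b)}} \]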

Classical Test Theory

X = T + E: an obtained test score (X) is the sum of the examinee's true score (T) and measurement error (E).
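Assuming true scores and error are uncorrelated, taking the variance of both sides decomposes observed-score variance, which is what the reliability coefficient in the next card expresses:
\[ \sigma_X^2 = \sigma_T^2 + \sigma_E^2, \qquad r_{xx} = \frac{\sigma_T^2}{\sigma_X^2} \]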


Reliability Coefficient

A correlation coefficient (ranging from 0 to 1.0) that indicates the proportion of variability in obtained scores that is due to true score differences rather than error. Interpreted directly; unlike other correlation coefficients, it does not need to be squared. Reliability increases when similar items are added, when the range of scores is unrestricted, and when guessing is reduced.


Test-Retest Reliability

Also known as the coefficient of stability. Appropriate for attributes that are relatively stable over time and not affected by repeated measurement.


Alternate Forms Reliability

Appropriate for tests of stable attributes. Considered the most thorough method for estimating reliability.


Internal Consistency Reliability

Not appropriate for speeded tests. Split-half and coefficient alpha are two types.


Split-Half Reliability

The test is split into two halves and scores on the two halves are correlated. Because each half contains only part of the items, the coefficient tends to underestimate the full test's reliability; the Spearman-Brown formula is used to correct it.


Spearman-Brown Prophecy Formula

Estimates what the reliability of the full-length test would be from a split-half coefficient; more generally, estimates the effect of lengthening or shortening a test on its reliability.
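The standard formula, with a hypothetical worked example: n is the factor by which the test is lengthened and r is the current reliability. Correcting a split-half coefficient of .60 for the full-length (twice as long, n = 2) test gives
\[ r_{new} = \frac{nr}{1 + (n-1)r} = \frac{2(.60)}{1 + .60} = .75 \]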


Coefficient Alpha

A special formula for assessing internal consistency reliability that provides an index of average inter-item consistency.
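The formula itself (standard notation: k items, item variances σ_i², total score variance σ_X²):
\[ \alpha = \frac{k}{k-1}\left(1 - \frac{\sum \sigma_i^2}{\sigma_X^2}\right) \]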


Kuder-Richardson 20 (KR-20)

A substitute for coefficient alpha when test items are scored dichotomously (right or wrong).
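KR-20 is coefficient alpha specialized to dichotomous items, where p_i is the proportion passing item i and q_i = 1 - p_i:
\[ KR_{20} = \frac{k}{k-1}\left(1 - \frac{\sum p_i q_i}{\sigma_X^2}\right) \]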


Inter-Rater Reliability

The consistency of scores assigned by different raters. Commonly assessed with the kappa statistic.


Standard Error of Measurement (SEM)

SEM = SD × √(1 - reliability coefficient). Used to construct a confidence interval around an obtained score; e.g., a 68% confidence interval extends one SEM on either side of the obtained score.
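A worked example with hypothetical values: for a test with SD = 15 and a reliability coefficient of .91,
\[ SEM = 15\sqrt{1 - .91} = 15(.30) = 4.5 \]
so the 68% confidence interval for an obtained score of 100 runs from 95.5 to 104.5.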


Content Validity

The extent to which a test adequately samples the content domain it is designed to measure.


Construct Validity

The extent to which a test measures the hypothetical trait (construct) it is intended to measure.


Convergent and Divergent Validity

Methods for assessing construct validity (e.g., the multitrait-multimethod matrix or factor analysis).


Multitrait-Multimethod Matrix

"Mono" means same and "hetero" means different. Large monotrait-heteromethod coefficients (same trait, different methods) indicate convergent validity; small heterotrait-monomethod coefficients (different traits, same method) indicate discriminant (divergent) validity.


Factor Analysis

Identifies the minimum number of common factors needed to account for the intercorrelations among a set of tests. Factor loadings can be squared to obtain the proportion of variance explained. Communality is the proportion of a test's variability that is accounted for by the common factors.
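When the factors are orthogonal, a test's communality is the sum of its squared loadings. With hypothetical loadings of .60 on Factor I and .50 on Factor II:
\[ h^2 = .60^2 + .50^2 = .36 + .25 = .61 \]
i.e., the common factors account for 61% of the test's variance.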


Orthogonal Rotation

A factor analysis rotation resulting in separate, uncorrelated factors.


Oblique Rotation

A factor analysis rotation resulting in correlated factors.


Criterion-Related Validity

When test scores are used to draw conclusions about an examinee's likely standing on another measure (the criterion).


Concurrent Validity

When criterion data are collected before or at the same time as predictor data.


Predictive Validity

When criterion data are collected after predictor data.


Standard Error of Estimate

An index of error when predicting criterion scores from predictor scores: SEE = criterion SD × √(1 - validity coefficient²).
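A worked example with assumed values: a criterion SD of 10 and a validity coefficient of .60 give
\[ SE_{est} = 10\sqrt{1 - .60^2} = 10(.80) = 8 \]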


Incremental Validity

The extent to which a predictor increases decision-making accuracy. Incremental validity = positive hit rate - base rate (see the worked example after the Positive Hit Rate card below).


True Positive

Identified as positive by the predictor; meets the criterion.


False Positive

Identified as positive by the predictor; does not meet the criterion.


True Negative

Not identified by the predictor; does not meet the criterion.


Predictor

Determines whether a subject is classified as positive or negative.


False Negative

Not identified by the predictor; does meet the criterion.


Criterion

Determines whether a subject's classification is true or false (i.e., whether the predictor's decision was correct).


Base Rate

(True positives + false negatives) / total number of examinees.


Positive Hit Rate

True positives / total identified as positive (true positives + false positives).
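A worked example tying these cards together, with a hypothetical sample of 100: 30 true positives, 10 false positives, 20 false negatives, and 40 true negatives.
\[ \text{base rate} = \frac{30 + 20}{100} = .50, \qquad \text{positive hit rate} = \frac{30}{30 + 10} = .75 \]
Incremental validity = .75 - .50 = .25.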


Criterion Contamination

When a criterion rater knows the predictor score.


When cross validating...

...shrinkage may occur (the validity coefficient tends to be smaller in the cross-validation sample).


NormReferenced Interpretation

Norms must be derived from individuals with characteristics similar to the examinee's to be valid. Norms can become obsolete quickly.


Low reliability means low validity, but...

...high reliability does not always mean high validity.


Percentile Ranks

A nonlinear transformation. The distribution of percentile ranks is always flat (rectangular) regardless of the shape of the raw score distribution. A disadvantage is that percentile ranks form an ordinal scale.


CriterionReferenced Interpretation

Interpreting scores against a predetermined standard, using either a percentage-correct score or a criterion score predicted from a regression equation or an expectancy table.
