49 Cards in this Set

(2) are dependent on the quality of the items
reliability and validity
(2) approaches to item analysis
qualitative
quantitative
item difficulty index (p) =
only applicable to _______ tests
ranges from ____ to ____
easier items have a ____ decimal number
harder items have a ____ decimal number
p = # of examinees correctly answering the item / # of examinees
maximum performance
0 to 1
larger
smaller
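A minimal sketch of the item difficulty index, assuming dichotomously scored (0/1) responses; the function and data names are illustrative, not from the cards:

```python
# Item difficulty index: p = # of examinees answering correctly / # of examinees.
def item_difficulty(responses):
    """responses: list of 0/1 item scores, one per examinee."""
    return sum(responses) / len(responses)

# Easier items yield larger p values; harder items yield smaller ones.
easy_item = [1, 1, 1, 1, 1, 1, 1, 0]   # p = 0.875
hard_item = [0, 1, 0, 0, 0, 1, 0, 0]   # p = 0.25
print(item_difficulty(easy_item), item_difficulty(hard_item))
```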
T or F
Items with p values of either 0.0 or 1.0 provide no information about individual differences and are of no use from a measurement perspective
T
Item difficulty is dependent on the ___
sample
For maximizing variability and reliability, the optimal item difficulty is ____
0.50
*Not necessary for ALL items to have a difficulty level of 0.50; it is often desirable to select some items with difficulty levels below and above 0.50, but with a mean of 0.50
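One way to see why 0.50 maximizes variability (a standard result for dichotomously scored items, not spelled out on the card): the variance of such an item is p(1 − p), which peaks at p = 0.50.

```latex
\sigma^2_{\text{item}} = p(1 - p), \qquad \max_{0 \le p \le 1} p(1 - p) = 0.25 \ \text{at} \ p = 0.50
```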
-T or F: Different levels are desirable in different testing situations
-For constructed-response items, ___ is typically the optimal level? For selected-response items?
T
0.50; the optimal level varies due to the influence of guessing
-in criterion-referenced tests, the expectation is that most test takers will...
-common for items to have p values as high as ___
eventually be successful
0.90
Percent Endorsement Statistic =
for ____ tests
dependent on the ____
-percentage of examinees who responded to an item in a given manner
-typical-response tests
-sample
Item Discrimination=
how well an item differentiates among test takers who differ on the construct being measured.
more than ___ different indexes of item discrimination have been developed
50
-Discrimination Index (D)
-two groups are typically defined in terms of ___ performance
-common approach used?
-difference in performance between two groups
-total test
-select the top and bottom 27% of test takers in terms of
their overall performance on the test, and exclude the middle 46%.
calculation of D
-difficulty of the item is computed for each group separately, and these are labeled pT and pB (T for top, B for bottom)
-D = pT − pB
example:
pT = 0.80 means that
pB = 0.30 means that
-80% of the examinees in the top group answered the item correctly
-30% of the examinees in the bottom group answered the item correctly
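For this example, D = 0.80 − 0.30 = 0.50. A rough sketch of the full calculation using the top/bottom 27% convention described above; the function and variable names are illustrative, not from the cards:

```python
# Discrimination index D = pT − pB, with groups defined by total test score.
def discrimination_index(item_scores, total_scores, fraction=0.27):
    """item_scores: 0/1 scores on one item; total_scores: total test scores."""
    n = len(total_scores)
    k = max(1, round(n * fraction))                            # size of top and bottom groups
    order = sorted(range(n), key=lambda i: total_scores[i])    # rank examinees by total score
    bottom, top = order[:k], order[-k:]                        # exclude the middle ~46%
    p_top = sum(item_scores[i] for i in top) / k               # pT
    p_bottom = sum(item_scores[i] for i in bottom) / k         # pB
    return p_top - p_bottom                                    # D
```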
Items with D values over ___ are acceptable
Items with D values below ___ should be reviewed
**these are general rules; there are exceptions
-0.30
*larger is better
-0.30
Most indexes of item discrimination are biased in favor of items with ______ difficulty levels
intermediate
Items that all examinees either pass or fail (i.e., p values of either 0.00 or 1.0) do not provide any information and their D values will always be _____
zero
If half of the examinees correctly answered an item and half failed (i.e., p value of 0.50), then it is possible for the item’s D value to be ___
1.0
*This does not mean that all items with p values of 0.50 will have D values of 1.0; just that the item can conceivably have a D value of 1.0
As a result of the relationship between p and D, items that have excellent discrimination power (i.e., D values of 0.40 and above) will necessarily have p values between ___ and ___
0.20 and 0.80
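A quick sketch of why that bound holds, assuming the overall p is approximated by the average of the two group difficulties (which ignores the excluded middle group):

```latex
p \approx \frac{p_T + p_B}{2}, \quad D = p_T - p_B
\;\Rightarrow\; p_T = p + \frac{D}{2}, \;\; p_B = p - \frac{D}{2}.
\text{Since } p_B \ge 0 \text{ and } p_T \le 1: \quad \frac{D}{2} \le p \le 1 - \frac{D}{2},
\text{ so } D \ge 0.40 \Rightarrow 0.20 \le p \le 0.80.
```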
In testing situations where it is desirable to have either very easy or very difficult items, D values can be expected to be ____than those normally desired.
lower
___ D values often indicate problems, but these guidelines should be applied in a flexible manner.
Low
-The interpretation of discrimination indexes is also complicated on ___ tests
-It is normal for traditional item discrimination indexes to ___-estimate an item’s true measurement characteristics
mastery
under
approach for item discrimination on mastery tests
limitation of this approach
-administer the test to two groups: one group that has received instruction and one that has not.
D = p instruction − p no instruction
-limited due to the difficulty of locating a group that has not received instruction on the relevant material
-another approach for item discrimination on mastery tests
-limitations
-administering the test to the same sample twice, once before instruction and once after instruction.
D = p posttest − p pretest
-Requires that the test be used as both a pretest and a posttest, and may involve carryover effects
-another approach for item discrimination on mastery tests
-Use item difficulty values based on the test takers who reached the mastery cut-off score and those who did not reach mastery
D = pmastery − pnonmastery
Item-Total Correlation Coefficients
Large item-total correlation suggests
-Item discrimination can also be examined by correlating performance on the items (scored as either 0 or 1) with the total test score; calculated using the point-biserial correlation
-that an item is measuring the same construct as the overall test
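A minimal sketch of that item-total correlation, using one standard form of the point-biserial formula (equivalent to the Pearson r between a 0/1 item and the total score); the names are illustrative:

```python
from statistics import mean, pstdev

def point_biserial(item_scores, total_scores):
    """item_scores: 0/1 scores on one item; total_scores: total test scores."""
    p = mean(item_scores)                                   # proportion passing the item
    q = 1 - p
    mean_pass = mean(t for x, t in zip(item_scores, total_scores) if x == 1)
    return (mean_pass - mean(total_scores)) / pstdev(total_scores) * (p / q) ** 0.5

# A large positive value suggests the item measures the same construct as the test overall.
```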
Item Difficulty and Discrimination on Speed Tests:
-Item performance depends largely on the ___
-Measures of item difficulty and discrimination will reflect _______________, rather than the item’s actual difficulty level or ability to discriminate.
speed of performance
-the location of the item in the test
Allows you to examine how many examinees in the top and bottom groups selected each option on a multiple choice item.
distractor analysis
effective distractor attracts more examinees in the ___ group and demonstrates ____ discrimination
bottom
negative
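A small sketch of what such a tabulation might look like; the option data are made up for illustration:

```python
from collections import Counter

# Tally which option each group chose on one multiple-choice item (key = "B").
top_choices    = ["B", "B", "A", "B", "B", "C", "B", "B"]   # high scorers on the total test
bottom_choices = ["A", "C", "D", "B", "A", "C", "D", "A"]   # low scorers on the total test

print("Top group:   ", Counter(top_choices))
print("Bottom group:", Counter(bottom_choices))
# An effective distractor (e.g., "A") draws more examinees from the bottom group
# than from the top group, i.e., it shows negative discrimination.
```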
qualitative item analysis (4 tips)
-Set the test aside and review a few days later
-Have a colleague review the test
-Have examinees provide feedback after taking the test
-Use both quantitative and qualitative approaches
Theory of mental measurement that holds that the responses to items are accounted for by latent traits
Item Response Theory
_____ is an ability or characteristic that is inferred based on theories of behavior, as well as empirical evidence, but cannot be assessed directly.
latent trait
Central to IRT is a complex mathematical model that describes
how examinees at different levels of ability will respond to individual test items.
Item Characteristic Curves (ICC):
graph with _____ reflected on the horizontal axis and the ________ reflected on the vertical axis
ability
probability of a correct response
T or F
Each item has its own specific ICC
T
ICCs incorporate information about(2)
item’s difficulty and discrimination ability
ICC:
point halfway between the lower and upper asymptotes is referred to as the ____
inflection point
inflection point represents what?
difficulty of the item (b parameter)
ICC:
Discrimination (i.e., the a parameter) is reflected by
the slope of the ICC at the inflection point
ICCs with ____ slopes demonstrate better discrimination than those with ___ slopes
steeper
gentler
the "simplest model" also referred to as a one-parameter IRT model
Rasch IRT Model
Rasch IRT Model assumes that items differ in only one parameter, which parameter?
difficulty (b parameter)
Two-Parameter IRT Model assumes that items differ in both (2)
difficulty and discrimination
Which model, the one-parameter or the two-parameter IRT model, better reflects real-life test development applications?
Two-Parameter IRT Model
Three-parameter model assumes that
even if the respondent essentially has no “ability,” there is still a chance he or she may answer the item correctly simply by chance
In both the one- and two-parameter IRT models, the ICCs asymptote toward a probability of ___
this assumes essentially a ____ percent possibility of answering the item correctly by chance
zero
zero
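A minimal sketch of how these models are commonly parameterized, using the standard logistic form (not shown on the cards); a is discrimination, b is difficulty, and c is the lower-asymptote guessing parameter:

```python
import math

# Three-parameter logistic (3PL) item characteristic curve.
# theta = ability, b = difficulty (inflection point), a = discrimination
# (slope at the inflection point), c = guessing parameter (lower asymptote).
# Setting c = 0 gives the two-parameter model; additionally fixing a to a
# common constant gives the one-parameter (Rasch) model.
def icc(theta, a=1.0, b=0.0, c=0.0):
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

print(icc(-3, a=1.5, b=0.0, c=0.0))    # asymptotes toward 0 at low ability
print(icc(-3, a=1.5, b=0.0, c=0.25))   # asymptotes toward 0.25 (e.g., 4-option multiple choice)
```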
Item difficulty (i.e., p) and item discrimination (i.e., D) are based on _____ theory
classical test (meaning both are sample dependent)
In IRT, the parameters of items (e.g., difficulty and discrimination) are sample-_____
free/independent
(4) Special Applications of IRT
• Computer Adaptive Testing (CAT)
• Scores based on IRT (see Chapter 3)
• Reliability (see Chapter 4)
• Detecting biased items