19 Cards in this Set
What are some reasons for Item Analysis?
|
-Provide validity evidence (e.g., quality of scores)
-Identify examinees’ misconceptions or misunderstandings
  -Content analysis of responses
  -Feedback to examinees
  -Feedback to teachers
-Identify flaws in test items
  -Inter-item correlation
  -Item-total correlation
  -Alpha if item eliminated
  -Analysis of distracters
  -Item difficulty and discrimination
-Revision of assessment tasks/selecting best items |
|
What is reliability?
|
Reliability: The amount of measurement error associated with the scores.
The extent to which all the items are consistent with each other:
-Inter-item correlation
-Item-total correlation
-Alpha if item deleted |
|
What is speededness?
|
Speededness: The percentage of students who respond to the last item on the assessment.
If the test is not speeded, at least 90% of the students should be able to finish the test. |
|
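A quick way to check the 90% rule of thumb is to compute the proportion of examinees who reached the last item. A minimal sketch in Python, using a made-up response matrix where `None` marks an item the examinee never reached:

```python
# Hypothetical scored responses: each row is one examinee, each column one
# item; None marks an item left unanswered. The last column is the final
# item on the test.
responses = [
    [1, 0, 1, 1],
    [1, 1, 0, None],
    [0, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 0, 0, None],
    [0, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [0, 1, 1, 1],
    [1, 1, 0, 1],
]

def completion_rate(matrix):
    """Proportion of examinees who responded to the last item."""
    reached = sum(1 for row in matrix if row[-1] is not None)
    return reached / len(matrix)

rate = completion_rate(responses)
print(f"{rate:.0%} reached the last item")
print("speeded" if rate < 0.90 else "not speeded")
```

In this toy data 8 of 10 examinees reached the last item, so the 90% rule would flag the test as speeded.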
What is Analysis of the Distracters?
|
The extent to which each of the item options in your question is working properly.
|
|
What is difficulty?
|
The proportion of students who get the item correct.
|
|
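Since difficulty is just the proportion correct, it is one line of code. A sketch with hypothetical 0/1 item scores:

```python
def item_difficulty(item_scores):
    """p-value: proportion of examinees who answered the item correctly (1 = correct)."""
    return sum(item_scores) / len(item_scores)

# Hypothetical 0/1 scores on one item for ten examinees.
scores = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
print(f"p = {item_difficulty(scores):.2f}")  # 7 of 10 correct -> p = 0.70
```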
What is Discrimination?
|
The extent to which items distinguish between those who have acquired the content and those who haven’t.
|
|
What is reliability analysis and how do we do this?
|
-It is important to assess the reliability of the entire scale.
-It is also important to assess what the reliability of the scale would be if each of the items were deleted.
-We do this in three ways:
  -Inter-item correlation
  -Item-total correlation
  -Alpha if item deleted |
|
What is inter item correlation?
|
-Matrix displaying correlation of each item with every other item.
-Provides information about internal consistency of test.
-Each item should be correlated highly with other items assessing the same construct.
-Items that do not correlate highly with other items assessing the same construct can be dropped without reducing the test’s reliability. |
|
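The matrix can be built with plain Pearson correlations. A self-contained sketch on a small hypothetical 0/1 score matrix (rows = examinees, columns = items):

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical 0/1 data: rows = examinees, columns = items.
data = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
]
items = list(zip(*data))  # transpose into one score vector per item
matrix = [[pearson(items[i], items[j]) for j in range(len(items))]
          for i in range(len(items))]
for row in matrix:
    print([round(r, 2) for r in row])
```

The diagonal is 1 (each item with itself); an off-diagonal entry near 0 flags an item that does not hang together with the rest.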
What is Item Total Correlation?
|
-The correlation between the score on each item and the total score obtained on the entire instrument.
-Interpretation of the item-total correlation illustrates two things:
  -That the item in question is measuring the same construct as the test.
  -That the item is successfully discriminating between those who perform well and those who perform poorly.
-If the item is assessing the same construct as others in the scale, the correlation will be high.
-This is somewhat similar to the Discrimination Index, which we will discuss later. |
|
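Correlating each item with the total score is a direct application of the definition above. A sketch on hypothetical 0/1 data; the comment notes the common "corrected" variant:

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sqrt(sum((a - mx) ** 2 for a in x)) *
                  sqrt(sum((b - my) ** 2 for b in y)))

# Hypothetical 0/1 data: rows = examinees, columns = items.
data = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
]
totals = [sum(row) for row in data]
for i, scores in enumerate(zip(*data)):
    # A "corrected" item-total correlation would instead use the total
    # minus the item's own score, so the item does not inflate its own r.
    r = pearson(list(scores), totals)
    print(f"item {i}: item-total r = {r:.2f}")
```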
Alpha if Item deleted?
|
-The Cronbach’s alpha of the scale if each individual item was deleted from the scale.
-If the item is assessing the same construct as others in the scale, the reliability will not change (or change very little).
-Again, this statistic gives an indication of the internal consistency of the items on the assessment. |
|
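Cronbach’s alpha can be computed from item and total-score variances, and the “alpha if item deleted” statistic just recomputes it on the reduced scale. A sketch on a hypothetical 0/1 matrix (population variances used throughout for consistency):

```python
def variance(xs):
    """Population variance of a score vector."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(data):
    """alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))."""
    items = list(zip(*data))
    k = len(items)
    totals = [sum(row) for row in data]
    return k / (k - 1) * (1 - sum(variance(i) for i in items) / variance(totals))

# Hypothetical 0/1 data: rows = examinees, columns = items.
data = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
]
print(f"alpha, full scale: {cronbach_alpha(data):.3f}")
for drop in range(3):
    reduced = [[s for j, s in enumerate(row) if j != drop] for row in data]
    print(f"alpha if item {drop} deleted: {cronbach_alpha(reduced):.3f}")
```

In this toy data, deleting item 2 raises alpha (it correlates weakly with item 0), which is exactly the signal this statistic is meant to give.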
When determining Item Difficulty it is important to remember?
|
-You want items to discriminate between people.
-Thus, you must choose items of a moderate level of difficulty.
-Item difficulty is most commonly measured by calculating the percentage of test-takers who answer the item correctly.
-Difficulty of the item is ideal if it falls halfway between the mean chance score and a perfect score. |
|
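“Halfway between the mean chance score and a perfect score” can be computed directly for a k-option multiple-choice item, where the chance score is 1/k. A minimal sketch (the function name is my own):

```python
def ideal_difficulty(n_options):
    """Ideal p-value: halfway between the chance score (1/k) and a perfect score (1.0)."""
    chance = 1 / n_options
    return (chance + 1.0) / 2

for k in (2, 4, 5):
    print(f"{k}-option item: ideal p = {ideal_difficulty(k):.3f}")
# true/false (k=2) -> 0.750; 4-option -> 0.625; 5-option -> 0.600
```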
Determining Item Difficulty #2?
|
-Generally, items with p values of 0.5 yield test scores with the most variation.
-Often, items with difficulty levels between 0–0.2 and 0.8–1.0 are discarded.
  -Too hard/easy = low reliability = less validity
-You must also take the chance of “guessing” into account. |
|
Determining Item Difficulty #3?
|
-For norm-referenced tests, we chose those items with moderate levels of difficulty.
-For criterion-referenced tests, it is not appropriate to choose items based on their difficulty.
-Instead, our focus is on adequately sampling the domain.
-If 100% of people get a question correct, who cares? We are interested in how well they do and how well they know the content.
-For this reason, criterion-referenced tests tend to be easier. |
|
Important note about difficulty?
|
Although p-values provide an indication of how difficult test items are, they tell us very little about the item’s usefulness in measuring the test’s content.
|
|
What are the steps to determining Item Discrimination?
|
-For each item, high achievers should answer correctly and low achievers should answer incorrectly.
-Score the tests, compute total scores, and sort from high to low.
-Determine upper and lower groups (e.g., Kelley’s 27% rule).
-Calculate the percentage of test-takers passing each item in both groups.
-Calculate the D-index for each item: the difference between the percentages of test-takers passing the item in the two groups.
  D = U - L, where -1 ≤ D ≤ +1 |
|
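The steps above can be sketched directly; the response matrix is hypothetical, and the 27% split follows Kelley’s rule:

```python
def d_index(data, item, fraction=0.27):
    """D = U - L: difference in the proportion passing the item between the
    upper and lower groups, formed by total score (Kelley's 27% rule)."""
    ranked = sorted(data, key=sum, reverse=True)  # sort examinees high to low
    n = max(1, round(len(ranked) * fraction))     # group size
    upper, lower = ranked[:n], ranked[-n:]
    u = sum(row[item] for row in upper) / n       # proportion passing, upper
    l = sum(row[item] for row in lower) / n       # proportion passing, lower
    return u - l

# Hypothetical 0/1 data: rows = examinees, columns = items.
data = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1],
    [1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
print(f"item 0: D = {d_index(data, 0):.2f}")  # U = 3/3, L = 1/3 -> D = 0.67
```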
What is the D-Index Value Reference?
|
D-index value reference:
D ≥ .40: satisfactory/good
.30 ≤ D ≤ .39: require little or no revision
.20 ≤ D ≤ .29: marginal and need revision
D ≤ .19: eliminate or completely revise |
|
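These bands can be encoded as a simple lookup; the function name is my own, and the cutoffs are the ones on this card:

```python
def classify_d(d):
    """Map a D-index to the action bands from the reference table."""
    if d >= 0.40:
        return "satisfactory/good"
    if d >= 0.30:
        return "little or no revision"
    if d >= 0.20:
        return "marginal; needs revision"
    return "eliminate or completely revise"

for d in (0.55, 0.33, 0.24, 0.05):
    print(f"D = {d:.2f}: {classify_d(d)}")
```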
What is NR?
|
NR (norm-referenced tests)
Within each topical area of the test blueprint, select those items with:
-.16 < p < .84 if the test represents a single ability
-.40 < p < .60 if the test represents different abilities
-p-value should be larger if guessing is a factor
-D > +.30
-D-index is secondary to content; p-value is secondary to D-index |
|
What is CR?
|
CR (criterion-referenced tests)
-Don’t select items on the basis of their p-values.
-Examine the p-value to see if it is signaling a poorly written item.
-D > 0 |
|
What is the Analysis of Distractors?
|
-With MC tests, there is usually one correct answer and a few distracters. A lot can be learned from analyzing the frequency with which test-takers choose distracters.
-Consider that perfect MC questions should have 2 distinctive features:
  -People who have knowledge pick the correct answer.
  -People who don’t know guess among the possible responses.
-Thus, each distracter should be equally popular.
-The correct answer will also be chosen by those who don’t have knowledge:
  # answering correctly = those who know + some random amount of people |
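A distracter analysis starts with a tally of how often each option was chosen. A sketch with hypothetical raw answers to one 4-option item, where “B” is the keyed answer:

```python
from collections import Counter

# Hypothetical raw answers to one 4-option item; "B" is the keyed answer.
choices = ["B", "B", "A", "B", "C", "B", "D", "B", "A", "C",
           "B", "D", "B", "B", "A", "B", "C", "B", "B", "D"]
counts = Counter(choices)
n = len(choices)
for option in "ABCD":
    flag = " <- key" if option == "B" else ""
    print(f"{option}: {counts[option]:2d} ({counts[option] / n:.0%}){flag}")
```

Here the three distracters draw 3 responses each, i.e. they are roughly equally popular; a distracter chosen by almost no one (or one that outdraws the key) would warrant revision.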