23 Cards in this Set

  • Front
  • Back
Item analysis
Procedures used to assess and index an item's:
- Difficulty
- Reliability
- Validity
- Discrimination

- E.g., you have a pool of 100 items and have administered the test to a tryout sample of 100 university students; you are now at the stage where you need to analyse each item.
The Item-Difficulty Index
- Obtained by calculating the proportion of the total number of testtakers who answered the item correctly
- The larger the item-difficulty index, the easier the item is
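A minimal sketch of the calculation in Python, using hypothetical response data:

    # Hypothetical responses to one item from 10 testtakers (1 = correct, 0 = incorrect)
    responses = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0]

    # Item-difficulty index p = proportion of testtakers answering correctly
    p = sum(responses) / len(responses)
    print(p)  # 0.7 -> the larger p is, the easier the item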
Item-Endorsement Index
- Measure of the proportion of testtakers who agreed with an item (e.g., in a personality test)
Average item difficulty
- Calculated by adding all the separate item difficulties and dividing this sum by the number of items
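Continuing the sketch above with hypothetical per-item difficulties:

    # Hypothetical item-difficulty indices for a 5-item test
    item_difficulties = [0.7, 0.5, 0.9, 0.6, 0.8]
    average_difficulty = sum(item_difficulties) / len(item_difficulties)
    print(average_difficulty)  # 3.5 / 5 = 0.7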
Optimum item difficulty
- Must take into account the probability of guessing
- Optimum item difficulty is the midpoint between 1.00 and the chance-success proportion

- e.g., for a 5-choice MC item: (0.20 + 1.00) / 2 = 0.60 (optimum item difficulty)
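Written as a general formula, where k is the number of answer choices (so the chance-success proportion is 1/k):

\[ \text{optimum item difficulty} = \frac{\frac{1}{k} + 1.00}{2} \]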
Item-Reliability Index
- Provides an indication of the internal consistency of the test
- Measured using factor analysis: do all the items "tap into" the same factor?
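Full factor analysis is the method the card names; as a rougher stand-in, corrected item-total correlations give a quick sense of whether items hang together (hypothetical data):

    import numpy as np

    # Hypothetical 0/1 response matrix: rows = testtakers, columns = items
    X = np.array([[1, 1, 0],
                  [1, 1, 1],
                  [0, 0, 0],
                  [1, 0, 1],
                  [1, 1, 1]])

    for j in range(X.shape[1]):
        rest = X.sum(axis=1) - X[:, j]  # total score excluding item j
        r = np.corrcoef(X[:, j], rest)[0, 1]
        print(f"item {j}: corrected item-total r = {r:.2f}")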
Item-Validity Index
- Provides an indication as to whether the item is measuring what it purports to measure
- The higher the item-validity index, the higher the item's criterion-related validity
- Complex formula (refer to TB pg 265)
The Item-Discrimination Index
- Indication of how accurately an item discriminates high scorers from low scorers
- Symbolised by a lower-case 'd' in italics
- Compares performance on an item by testtakers in the upper and lower regions of the total-score distribution
- Optimal boundary lines for the upper and lower regions = 27% of the distribution of scores, given a normal distribution
- Any percentage between 25% and 33% will yield similar results
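A minimal sketch with hypothetical data, taking d as the proportion correct in the upper group minus the proportion correct in the lower group:

    # Hypothetical total scores and one item's responses, paired per testtaker
    scores    = [95, 90, 88, 85, 60, 55, 50, 45, 30, 25]
    responses = [ 1,  1,  1,  0,  1,  0,  1,  0,  0,  0]

    # Upper and lower 27% of the distribution (~3 of 10 testtakers here)
    n = round(0.27 * len(scores))
    ranked = sorted(zip(scores, responses), reverse=True)
    upper = [r for _, r in ranked[:n]]
    lower = [r for _, r in ranked[-n:]]

    # d = p_upper - p_lower; a negative d flags a problem item
    d = sum(upper) / n - sum(lower) / n
    print(d)  # 1.0 here: all of the upper group and none of the lower group got it right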
Analysis of item alternatives
- Not statistical; more of an eyeballing technique
- Looking at response patterns to see the proportion of high scorers and low scorers who got the item correct
- If more low scorers than high scorers got the item correct, this indicates a bad item
- If many high scorers chose the same incorrect alternative (distractor), this would indicate something is wrong with the test item
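A minimal tabulation sketch of the kind of response pattern one would eyeball (hypothetical choices; the keyed answer is 'B'):

    from collections import Counter

    # Hypothetical alternative choices on one item, split by total score
    high_scorers = ['B', 'B', 'B', 'D', 'B', 'A', 'B', 'B', 'C', 'B']
    low_scorers  = ['A', 'D', 'B', 'E', 'A', 'C', 'D', 'B', 'A', 'E']

    print(Counter(high_scorers))  # mostly 'B': high scorers pick the keyed answer
    print(Counter(low_scorers))   # spread across distractors, as expected for a sound item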
Item characteristic curve
A graphic representation of item difficulty and discrimination
- Ability is plotted on the horizontal axis and the probability of a correct response on the vertical axis
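As an illustration, such a curve is often drawn with a logistic function (this parameterization comes from item response theory, flagged for research on the last card; it is a sketch, not the only way to construct an ICC):

    import math

    def icc(theta, a=1.0, b=0.0):
        """Probability of a correct response at ability theta
        (a = discrimination/steepness, b = difficulty/location)."""
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))

    # Probability rises with ability; b shifts the curve, a steepens it
    for theta in [-2, -1, 0, 1, 2]:
        print(theta, round(icc(theta, a=1.5, b=0.5), 2))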
Other considerations for item analysis:
(1) Guessing
- Guesses can be based on a vague idea (i.e., partial knowledge), so guessed responses cannot simply be discarded
- Should omitted items be scored as wrong?
- Some testtakers are luckier than others when it comes to guessing

(2) Item fairness
- An item can be biased (e.g., an item in a psychology test that favours people with experience studying history)
- Item characteristic curves can be used to identify biased items

(3) Speed tests:
- Can yield misleading and uninterpretable results
- Items at the end of the test will appear more difficult simply because fewer testtakers reach them
- If speed is not an important aspect of the variable being assessed, ample time should be given for testtakers to complete the test
Qualitative Item Analysis
- Qualitative methods are techniques of data generation and analysis that rely primarily on verbal rather than mathematical or statistical procedures

(e.g., having testtakers verbalise their thoughts while completing the test and comment on item difficulty or their attitudes towards the test items)
Think aloud test administration
When testtakers commentate while completing the test
Sensitivity review
A study of test items that assesses the fairness and sensitivity of a test.

- Usually done using expert panels
Test revision
- Moulding the test into its final form
- One approach is to characterize each item according to its strengths and weaknesses, e.g., some items are reliable but lack criterion-related validity
- Test developers must balance various strengths and weaknesses across items
Test revision of an existing test
- No hard and fast rules as to when a test should be revised

Should be revised when:
- The materials look dated
- Verbal content of test is dated
- Test norms are no longer adequate
- The reliability and validity of the test can be significantly improved by revision
Cross-validation
Revalidation of a test on a sample of testtakers other than those on whom test performance was originally found to be a valid predictor of some criterion
Validity shrinkage
The inevitable decrease in item validity that occurs with cross-validation
Co-validation a.k.a. co-norming
Validation processes conducted on two or more tests using the same sample
Quality assurance during test revision
- Not all test developers hold a doctoral degree
- Publishers can evaluate potential examiners by administering a quiz, etc.
Anchor protocol
A test protocol scored by a highly authoritative scorer that is designed as a model of scoring and a mechanism for resolving scoring discrepancies.
Scoring drift
A discrepancy between scoring in an anchor protocol and the scoring of another protocol
Item Response Theory (IRT)
*Research this