77 Cards in this Set


issue with guessing


it leads to problems in understanding what one's true test score is, especially on achievement tests

Abbott's formula for blind guessing

Corrected score = R - W / (K - 1), where R = number of correct responses, W = number of wrong responses, and K = number of alternatives per item
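A minimal Python sketch of the correction (function name and example numbers are illustrative, not from the cards):

```python
def corrected_score(right: int, wrong: int, alternatives: int) -> float:
    """Correction for blind guessing: R - W / (K - 1).

    Omitted (blank) items count as neither right nor wrong.
    """
    return right - wrong / (alternatives - 1)

# 60 right and 20 wrong on a 4-alternative test: 60 - 20/3 = 53.33
print(corrected_score(60, 20, 4))
```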

To overcome the influence of blind guessing, one should advise examinees to

attempt every question

Items that are clear in multiple choice formats may be confusing in

short answer formats

According to Ebel, a better way to increase test reliability is

to add more items

The best way to calculate reliability for speeded tests is to

do a split-half reliability with the two halves administered as separately timed sections (an ordinary split-half on a single timed administration inflates the estimate)
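A minimal odd-even split-half sketch with the Spearman-Brown step-up (the 0/1 responses are simulated; for a speeded test the two halves would come from the separately timed sections):

```python
import numpy as np

def split_half_reliability(scores: np.ndarray) -> float:
    """Odd-even split-half reliability with Spearman-Brown correction.

    scores: examinees x items matrix of item scores (0/1 here).
    """
    odd = scores[:, 0::2].sum(axis=1)   # half-test score on odd items
    even = scores[:, 1::2].sum(axis=1)  # half-test score on even items
    r = np.corrcoef(odd, even)[0, 1]    # correlation between the halves
    return 2 * r / (1 + r)              # step up to full-test length

rng = np.random.default_rng(0)
fake = (rng.random((50, 20)) > 0.4).astype(int)  # simulated responses
print(split_half_reliability(fake))
```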

Halo Effect

a rater's tendency to perceive an individual who is high (or low) in one area as also high (or low) in other areas

general-impression model

the tendency of a rater to allow an overall impression of an individual to influence judgments of that person's performance (e.g., a rater may find a reporter "impressive" and thus also rate his/her speech as strong)

Salient Dimension model

When the rating of one quality affects the rating of another, independent quality (e.g., people rated as attractive are also rated as more honest)

Simpson's Paradox

aggregating data can change the meaning of the data; pooling can obscure the conclusions because of a third variable

In terms of minority hiring: minorities applied to two levels of positions, clerical and executive. Overall hiring rates showed that only 11% (110/1010) of the minority group were hired, compared to 14% (85/600) of the majority group.




What is this scenario an example of?

the Simpson Paradox
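A small Python illustration of the reversal. The stratum-level counts below are hypothetical (the card gives only the aggregates), chosen so they sum to 110/1010 and 85/600 while the minority rate is higher within each level:

```python
# Hypothetical (hired, applied) counts per level, consistent with the
# card's aggregates: 110/1010 minority, 85/600 majority.
hires = {
    "minority": {"clerical": (50, 100), "executive": (60, 910)},
    "majority": {"clerical": (80, 200), "executive": (5, 400)},
}

for group, levels in hires.items():
    hired = sum(h for h, _ in levels.values())
    applied = sum(n for _, n in levels.values())
    print(f"{group} overall: {hired}/{applied} = {hired / applied:.1%}")
    for level, (h, n) in levels.items():
        print(f"  {level}: {h}/{n} = {h / n:.1%}")

# Overall, the majority group is hired at the higher rate (14.2% vs
# 10.9%), yet within each level the minority rate is higher: the paradox.
```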

There is a debate about whether our clinical judgment is superior to

mechanical judgment

mechanical judgment

statistical predictions, or predictions based on some type of quantitative index

Marital relationship satisfaction was determined from the ratio of sex to arguments: people tend to rate relationships higher if they have more sex and fewer fights.




This is an example of what kind of mechanical decision making?

crude

Mechanical or quantitative prediction can only work when

people specify which variables to examine in making the prediction

In terms of prediction, people are not as good as mechanical methods at

integrating the data in unbiased ways

Our belief in prediction is reinforced by the

isolated incidents we can access

Factor Analysis

a statistical tool that is used to mathematically determine which items are associated with various latent constructs

Factor analysis requires that one come up with

a pool of candidate items

Steps in factor analysis

1. administer the sample of items to 200-500 subjects


2. input how the sample rated each item


3. run the factor analysis, look at the pattern of where items load, and then name the factors (a minimal sketch of these steps follows)
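A minimal Python sketch of steps 1-3, assuming simulated ratings in place of real subjects (sklearn's FactorAnalysis is one of several tools that could be used; all names and numbers here are illustrative):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Steps 1-2 stand-in: simulated ratings from 300 "subjects" on 10 items
# driven by 2 latent factors (real data would come from administration).
latent = rng.normal(size=(300, 2))
loadings_true = rng.normal(size=(2, 10))
ratings = latent @ loadings_true + rng.normal(scale=0.5, size=(300, 10))

# Step 3: run the factor analysis and inspect where the items load.
fa = FactorAnalysis(n_components=2, random_state=0).fit(ratings)
for i, row in enumerate(fa.components_.T):  # items x factors
    print(f"item {i}: loadings {np.round(row, 2)}")
```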

When doing item development for factor analysis, you need to have ___________ items because they give you greater ability to tap into multiple aspects of the construct

more

Facets

well-defined, homogeneous item clusters that map directly onto the higher-order factors

Dichotomous item response formats cannot be used for factor analysis because

they can cause a serious disturbance in the correlation matrix

When utilizing factor analysis, more response options per item generate a greater amount of

variance

For well defined factors, you can use a sample size of _____________ for factor analysis

100-200

If factors are not well defined you may need a sample size of up to _________ for factor analysis

500

4 Reasons for Conducting Factor Analysis

1. Developing and Identifying Hierarchical Factor Structure


2. Improving Psychometric Properties of a Test


3. Developing Items that Discriminate between Samples


4. Developing more unique items

All tests with sound items should have a strong

internal consistency

Factor analysis can help developers determine which items to remove, revise, or add in order to improve

internal consistency

2 Primary Objections to Short Form Development

1. Rigorous and comprehensive evaluation is crucial, and a short form cannot give the level of information that is required for an appropriate assessment


2. Short forms are often developed without careful and thorough examination of the new form's validity

2 General Problems for Short Forms

1. Assumption that all the reliability and validity of the long form automatically applies to the abbreviated form


2. Assumption that the new, shorter measure requires less validity evidence

7 Problems in Regard to Empirical Evidence for Short Forms

1. Researchers found that if the long form does not have good validity, neither will the short one!


2. Found that by reducing the items, content coverage may be compromised; very few short-form designers performed content domain checks


3. Found significant reduction in reliability coefficients


4. Found that many times researchers do not run another factor analysis on the short form to see if the same factor structure is present


5. Need to administer the short form to an independent sample to determine validity, not the sample the long form was developed on


6. Need to use the short form to classify clinical populations and compare whether it is as accurate as the long form


7. Need to establish whether there are genuine time and money savings with a short form



Item Analysis

general term for a set of methods used to evaluate test items

2 Types of Item Analysis

Item Difficulty vs. Item Discriminability

Item Difficulty

defined by the proportion of people who get a particular item correct

Item difficulty should usually fall between

.3 and .7
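A quick Python sketch of computing difficulty as a proportion and flagging items outside the .3-.7 band (the response matrix is simulated, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated 0/1 response matrix: 100 examinees x 8 items.
responses = (rng.random((100, 8)) > rng.random(8)).astype(int)

p = responses.mean(axis=0)  # item difficulty = proportion correct
for i, diff in enumerate(p):
    flag = "" if 0.3 <= diff <= 0.7 else "  <- outside .3-.7"
    print(f"item {i}: p = {diff:.2f}{flag}")
```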

When developing item difficulty, you need to consider whom

you are testing (like medical students vs. disabled students)

test floor

a sufficient number of easy items

test ceiling

a sufficient number of hard items

Item Discriminability

determines whether the people who have done well on a particular item have also done well on the entire test

Extreme group method for Item Discriminability

compares people who have done very well on a test with those who have done very poorly

discrimination index in extreme group method

the difference between the proportions of people in each group who got the item correct

Item Difficulty Formula

Difficulty = (U + M + L) / N, where U, M, and L are the numbers answering correctly in the upper, middle, and lower scoring groups, and N is the total number of examinees

Item Discrimination Formula

Discrimination index = (U - L) / n, where n is the number of people in each extreme group
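A sketch of the extreme-group computations; the 27% split used for the upper and lower groups is a common convention, assumed here rather than stated on the cards:

```python
import numpy as np

def extreme_group_stats(total_scores, item_correct, frac=0.27):
    """Difficulty (U+M+L)/N and discrimination (U-L)/n for one item."""
    order = np.argsort(total_scores)
    n = max(1, int(len(order) * frac))
    lower, upper = order[:n], order[-n:]
    U = item_correct[upper].sum()          # correct in upper group
    L = item_correct[lower].sum()          # correct in lower group
    M = item_correct.sum() - U - L         # correct in the middle
    difficulty = (U + M + L) / len(order)  # proportion correct overall
    discrimination = (U - L) / n           # extreme-group index
    return difficulty, discrimination

rng = np.random.default_rng(0)
totals = rng.normal(size=200)
item = (totals + rng.normal(size=200) > 0).astype(int)  # tracks ability
print(extreme_group_stats(totals, item))
```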

2 Methods of Item Discriminability

1) Extreme Group Method


2) Point Biserial Method

Point Biserial Method for Item Discriminability

find the correlation between performance on the item and performance on the entire test
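A minimal sketch using SciPy's point-biserial correlation (the item and total-score data are simulated; in practice some analysts first remove the item's own points from the total):

```python
import numpy as np
from scipy.stats import pointbiserialr

rng = np.random.default_rng(0)
item = rng.integers(0, 2, size=100)       # 0/1 scores on one item
total = 5 * item + rng.normal(0, 3, 100)  # simulated total test scores

r_pb, p_value = pointbiserialr(item, total)
print(f"point biserial = {r_pb:.2f}")  # ranges from -1 to +1
```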

Item Response Theory (IRT) is a collection of mathematical and statistical models that do these 3 things:

1. analyze items and scales


2. measure psychological constructs


3. compare individuals on psychological constructs

The basic unit of IRT is

item response function

item response function is a mathematical function describing

the relation between where an individual falls on the continuum of a given construct, such as depression, and the probability that he/she will give a particular response to a scale item designed to measure that construct
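The cards don't name a specific model, but one common concrete form of the item response function is the two-parameter logistic (2PL); a small sketch with illustrative parameter values:

```python
import math

def irf_2pl(theta: float, a: float, b: float) -> float:
    """P(endorse item) under the two-parameter logistic model.

    theta: person's location on the latent trait (e.g., depression)
    a: discrimination (steepness of the curve)
    b: difficulty/severity (location of the curve on the trait)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# The endorsement probability rises as theta passes the item's b = 1.0:
for theta in (-2, 0, 1, 2):
    print(theta, round(irf_2pl(theta, a=1.5, b=1.0), 2))
```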

In IRT, a construct is called a

latent variable

in terms of the item difficulty index (the proportion answering correctly), the higher the number, the

easier the question

Point biserial ranges from

-1 to +1

A positive point biserial tells us that

the item discriminates well, because those who scored higher on the test also got the question correct

The closer a point biserial is to +1, the more _______________________ it has

discrimination power

Discrimination power means that

it does well at discriminating between upper and lower ranges

A negative point biserial generally indicates that people in the higher scoring ranges got the item _________, as compared to those in the lower scoring range.

wrong

A negative point biserial means that there is something wrong with

your question, but we don't know what.

Classical Testing Theory or CTT is limited by

only 2 sources of error: random and systematic

True Score Model from Classical Testing Theory

X (Observed Score) = T (True Score) + E (Error)

Random Error

fluctuations in the measurement based purely on chance

Systematic Error

error that affects a score because of some particular characteristic of the person or the test that has nothing to do with the construct being measured

CTT recognizes only two sources of variance, and cannot adequately estimate

individual sources of error influencing a measurement

Generalizability Theory acknowledges that

multiple factors may affect the error associated with measurement of one’s true score

Generalizability Theory allows researchers to estimate the total variance or error in terms of

individual factors that vary in terms of the assessment, setting, time, items, and raters

Dependability

whether the test taker's score is dependable across a myriad of conditions

Reliability is dependent on

the inferences (generalizations) that the investigator wishes to make with the data from the measurement

2 Types of Error Analyses

1. G-Study


2. D-Study

G-Study (Generalizability Study)

to provide as much information as possible about the sources of variation in the measurement

D-Study

uses G-Study information to evaluate the effectiveness of alternative designs for minimizing error and maximizing reliability

Generalizability coefficient

A reliable measure is one where the observed value closely estimates the expected score over all acceptable observations

dependability coefficient

how dependable are the measures from one judge to the next

Reliability Formula

X = T + E



Reliability = Var(T) / Var(X)




or Var(T) / (Var(T) + Var(E))
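A small simulation of this identity, with arbitrary true-score and error variances (15^2 = 225 and 5^2 = 25, giving reliability 225/250 = .90):

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.normal(100, 15, size=10_000)  # true scores, Var(T) ~ 225
E = rng.normal(0, 5, size=10_000)     # random error, Var(E) ~ 25
X = T + E                             # observed scores

print(T.var() / X.var())  # ~ 225 / (225 + 25) = 0.90
```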

Item difficulty is synonymous with severity, which means the

more severe the person's diagnosis is, the more likely they will be to endorse that item.

Item difficulty is indicated by the curve that is furthest away

from the Y axis

Item discrimination is determined by the steepness of

the slope

Generalizability coefficients range from

.8-1.0: good generalizability


.6-.8: marginal generalizability


<.6: poor generalizability

The biggest advantage of IRT over CTT is that
you can map differential severity patterns for each item. You can look at individual items and their differential scoring patterns to determine levels of severity as well as discriminability, independent of test bias, which can keep a clinician from overpathologizing.
you can map differential severity patterns for each item. You can look at individual items and look at differential scoring patterns to determine levels of severity as well as discriminability, independent of test bias, which can prevent a clinician from overpathologizing.