131 Cards in this Set
- Front
- Back
- 3rd side (hint)
Theory of Measurement Error |
Human error in measurement |
|
|
Classical Test Theory |
Distinguishes the true score from the observed score; the difference between the two is due to measurement error |
|
|
Domain sampling method |
Ensuring enough questions in a test cover the relevant domain without sacrificing reliability |
|
|
Item Response Theory |
Use of computer algorithms to adjust test to taker's ability |
|
|
Test Retest reliability |
Test at two different times and correlate the two results |
|
|
Parallel Forms reliability |
Compare two equivalent forms of a test with the same attribute and correlate the two results |
|
|
Split Half reliability |
Take test results and divide them in half and correlate the two halves |
|
|
Spearman-Brown formula |
Estimates what the correlation between the two halves of a test would be if each half had been the length of the whole test |
|
|
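A minimal sketch of the Spearman-Brown correction, assuming the standard prophecy formula r' = n*r / (1 + (n - 1)*r). The function name `spearman_brown` and the example correlation of 0.60 are made up for illustration.

```python
def spearman_brown(r_half: float, length_factor: float = 2.0) -> float:
    """Project the reliability of a test lengthened by `length_factor`.

    With length_factor=2 this corrects a split-half correlation
    to the full test length.
    """
    return (length_factor * r_half) / (1.0 + (length_factor - 1.0) * r_half)


# Hypothetical split-half correlation of 0.60
corrected = spearman_brown(0.60)
print(round(corrected, 3))  # 0.75
```

The corrected value is higher than the half-test correlation because longer tests tend to be more reliable.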
Internal consistency |
Do all the items in a scale correlate with the overall test score? |
|
|
KR20 formula |
Test for internal consistency when items are dichotomous |
|
|
Cronbach's alpha |
Testing internal consistency when items are not dichotomous |
|
|
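A rough sketch of both internal consistency coefficients, assuming the usual textbook formulas with population variances. The function names and the tiny data set (4 test takers, 3 right/wrong items) are invented for illustration. On dichotomous items the two formulas give the same value.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """scores: one row per test taker, one column per item."""
    k = len(scores[0])
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def kr20(scores):
    """Same idea for 0/1 items: item variance is p * q."""
    n, k = len(scores), len(scores[0])
    p = [sum(row[j] for row in scores) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum_pq / total_var)


# Hypothetical data: 1 = correct, 0 = incorrect
rows = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(cronbach_alpha(rows), 3))  # 0.75
print(round(kr20(rows), 3))            # 0.75
```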
Observer reliability |
Count the number of times a behaviour is observed and correlate the counts with other observers' data |
|
|
Kappa statistic |
The best method to assess level of agreement between observers on a nominal scale |
|
|
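A small sketch of the kappa statistic for two observers, assuming Cohen's formula kappa = (observed agreement - chance agreement) / (1 - chance agreement). The function name and the ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Agreement between two observers on nominal categories, corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions per category
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical codes from two observers watching the same 10 intervals
a = ["on", "on", "on", "on", "on", "off", "off", "off", "off", "off"]
b = ["on", "on", "on", "on", "off", "off", "off", "off", "off", "on"]
print(round(cohen_kappa(a, b), 2))  # 0.6
```

Here the observers agree on 80% of intervals, but half of that agreement would be expected by chance, so kappa is lower than raw agreement.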
Discriminability analysis |
Each scale item should correlate with the overall scale |
|
|
Correction for attenuation |
Estimates what the correlation between two variables would be if measurement error had not weakened (attenuated) it |
|
|
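A minimal sketch of the correction for attenuation, assuming the standard formula: observed correlation divided by the square root of the product of the two reliabilities. The function name and all numbers are made up for illustration.

```python
import math


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimate the correlation between true scores.

    r_xy: observed correlation between the two measures
    r_xx, r_yy: reliability of each measure
    """
    return r_xy / math.sqrt(r_xx * r_yy)


# Hypothetical values: observed r = 0.36, reliabilities 0.81 and 0.64
print(round(correct_for_attenuation(0.36, 0.81, 0.64), 3))  # 0.5
```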
Construct Related Validity |
Assembling evidence about what a test/construct means |
|
|
Content Related Validity |
Does a test cover everything it's meant to measure? |
|
|
Criterion related validity |
How well a test correlates with a criterion |
|
|
Construct underrepresentation |
Not enough material in a test to capture important components of a construct |
|
|
Construct irrelevant variance |
Scores influenced by factors irrelevant to the construct |
|
|
Concurrent validity evidence |
Test and criterion measured at the same time |
|
|
Predictive validity evidence |
How well will someone perform later |
|
|
Validity coefficient |
Relationship between a test and a criterion (rarely higher than 0.6) |
|
|
How is construct validity evidence established |
Series of activities where a researcher defines a construct and a method to measure it |
|
|
Convergent validity |
Do all measures of one construct accurately measure it? |
|
|
Divergent validity |
Is a test unique and measure a construct more easily or accurately than other tests? |
|
|
Dichotomous item format |
Two responses (eg true or false) |
|
|
Polytomous item format |
Multiple choice |
|
|
Distractor |
Incorrect option on a multiple choice item. Ideal number = 3 |
|
|
Likert format |
Considers the degree of agreement with or without a neutral option |
|
|
Category item format |
On a ten point scale, but often context dependent |
|
|
Checklist |
Checking off traits relevant to something or someone such as describing yourself |
|
|
Q-Sort |
Checklist whereby Ps sort statements into piles (9=exact, 1=not at all) |
|
|
Expectancy Effects |
Administrator biases that affect test takers' results |
|
|
Reinforcing Responses |
Responses from an administrator that bias test results |
|
|
Subject variables |
The state of the test taker |
|
|
Stereotype Threat |
Impact on test scores due to anxiety about perpetuating a stereotype |
|
|
Expectancy effect |
Administrator biases affecting test scores/performance |
|
|
Structured interview |
Follows strict questions and answers |
|
|
Unstandardized interview |
Relaxed and not strict with questions and answers |
|
|
Directive interview |
Interviewer takes lead |
|
|
Nondirective interview |
Interviewee takes lead |
|
|
Selective interview |
Meant for employment |
|
|
Diagnostic interview |
Intended to find out thoughts and emotions |
|
|
Social facilitation |
Acting like those around us (if the interviewer is stressed, the interviewee is stressed too) |
|
|
Interpersonal influence |
How much can one influence another |
|
|
Interpersonal attraction |
How much respect people have for each other |
|
|
Verbatim playback |
Repeating what a participant has said to elicit more responses |
|
|
Restatement |
Paraphrasing what a participant has said |
|
|
Summarizing |
Summing up several of a participant's responses |
|
|
Clarification |
Paraphrasing what a participant has said to clarify |
|
|
Evaluation interview |
Statements that point to discrepancies between what a person says and does etc |
|
|
Structured clinical interview |
Reliable set of questions in a particular order |
|
|
Case history interview |
Getting a complete history of an individual |
|
|
Mental status exam |
Intended to diagnose psychosis or brain damage, and looks at how a person is behaving or thinking |
|
|
Active listening |
Intending to give an understanding response |
|
|
Halo effect |
Tendency to judge based on first impressions |
|
|
General standoutishness |
When something stands out and colours all areas of assessment |
|
|
Good estimate of test scores reliability with (structured) interview ratings |
0.4 |
|
|
Level one response |
No relation to interviewer response |
|
|
Level 2 response |
Superficial awareness of interviewer response |
|
|
Level 3 response |
Minimum level response to interviewer that shows awareness |
|
|
Level 4-5 response |
Provides empathy, interviewee adds significantly to the conversation |
|
|
Information processing approach |
Asks how we solve problems and learn from them |
|
|
Cognitive approach |
How do humans adapt to real world demands |
|
|
Psychometric approach |
Focuses on the elemental structure of a test (oldest way of studying intelligence) |
|
|
Age differentiation |
Mental ability of a child in terms of his/her completion of the tasks designed for an average child of a particular age |
|
|
General mental ability |
Measure only the total product of the various separate and distinct elements of intelligence |
|
|
"g" |
General factor of intelligence underlying various abilities; accounts for about half the variance in mental ability scores |
|
|
Gf-Gc intelligence theory |
Says that intelligence is made of two factors: fluid and crystallized intelligence |
|
|
Three levels of intellectual disability from 1905 Binet Simon scale |
Moron, imbecile, and idiot |
|
|
Age scale |
Items are grouped according to age rather than level of difficulty |
|
|
Three abilities stemming from modern Binet scale's g |
Crystallized abilities, fluid-analytic and short term memory |
|
|
What is crystallized abilities divided into in the modern Binet |
Verbal and nonverbal reasoning |
|
|
Basal |
Minimum criterion number of correct responses |
|
|
Ceiling |
Certain number of incorrect responses indicating items are too difficult |
|
|
Four types of evidence supporting validity in 2003 edition of Binet |
Content, construct, empirical item analyses and criterion related |
|
|
How is the full IQ score determined in the 2003 Binet |
Based on all ten tests (half verbal, half nonverbal) |
|
|
How does the WAIS conceptualize intelligence |
Capacity to act purposefully and adapt to the environment |
|
|
Three main differences between Binet and WAIS |
Meant for adults; point scale usage; nonverbal scale usage |
|
|
Four factor scores in WAIS |
Verbal comprehension, perceptual reasoning, working memory, and processing speed |
|
|
Three subscales in verbal comprehension |
Similarities, vocabulary, information |
|
|
Three subscales in perceptual reasoning |
Block design, matrix reasoning, visual puzzles |
|
|
Two subscales in working memory |
Digit span and arithmetic |
|
|
Two subscales in processing speed |
Symbol search and coding |
|
|
Index comparisons |
Allows for observing multiple scores for the WAIS and discrepancies between them and helps for diagnostic purposes |
|
|
Pattern analysis |
Evaluating relatively large differences between subtest scaled scores on the WAIS and drawing conclusions based on mental wellbeing (should be done cautiously) |
|
|
WAIS scaled scores mean and SD |
Mean: 10; SD: 3 |
|
|
Three explanations for IQ Gap |
Biological differences; socioeconomic status; tests are culturally biased |
|
|
Differential validity |
The extent to which a test has different meanings (different validity) for different groups |
|
|
Differential item functioning analysis |
Tries to identify items specifically biased against any group |
|
|
Larry P vs Wilson Riles |
Ruled that the tests are racially biased against Black children and have a discriminatory impact on them |
|
|
Parents in Action on Special Education vs Hannon |
Ruled that the tests are not discriminatory because they don't predict inaccurately |
|
|
Chitling test |
Tried to show that African American kids knew different info than whites (no predictive validity) |
|
|
BITCH |
Asks respondents to identify words relevant to African American culture and finds that Whites score lower (but no validity evidence and small sample size) |
|
|
SOMPA |
Assumes all groups have the same potential; looks at medical, social and pluralistic aspects (poor correlation with achievement) |
|
|
Qualified individualism |
We should select the best qualified people regardless of race |
|
|
Unqualified individualism |
Selecting the best qualified people and taking race into account if it helps find them |
|
|
Quotas |
Explicitly recognize race and select best qualified people based on how representative they are in the population |
|
|
Two purposes of psychological tests |
Research and clinical (applied) |
|
|
Test items |
Stimuli that can be scored and evaluated |
|
|
Overt test response |
Either right or wrong |
|
|
Covert test response |
Meant for projective tests |
|
|
What do aptitude tests measure |
Potential |
|
|
What do achievement tests measure |
What has already been learned |
|
|
What was the first psychological test |
Woodworth (for army recruits) |
|
|
Inferential statistics |
Taking a sample of a population and drawing conclusions from it |
|
|
Type I error |
Reporting an effect when none is there |
|
|
Type II error |
Not reporting an effect when one is there |
|
|
Nominal scale |
Categories of things |
|
|
Ordinal scales |
Ranks with magnitude but no absolute zero or equal intervals |
|
|
Interval scale |
Magnitude and equal intervals but no absolute zero |
|
|
Ratio scale |
Have magnitude, equal intervals and absolute zero |
|
|
Percentile rank |
Percentage of scores in the distribution that fall below a given score |
|
|
Percentile point |
Specific scores in the distribution below which a defined percentage of scores falls |
|
|
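A short sketch contrasting percentile rank (a percentage) with percentile point (a score). It assumes the "percentage of scores strictly below" convention; other conventions handle ties differently. The function names and scores are hypothetical.

```python
def percentile_rank(scores, score):
    """Percentage of scores in the distribution that fall below `score`."""
    below = sum(1 for s in scores if s < score)
    return 100.0 * below / len(scores)


def percentile_point(scores, pct):
    """Lowest observed score whose percentile rank is at least `pct`."""
    for s in sorted(scores):
        if percentile_rank(scores, s) >= pct:
            return s
    return max(scores)


# Hypothetical test scores
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
print(percentile_rank(scores, 85))   # 60.0
print(percentile_point(scores, 60))  # 85
```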
Normed/Criterion referenced tests |
Put up against people your own age or certain criteria, respectively |
|
|
Regression line |
The best possible fitting line through a set of points in a scatterplot |
|
|
Goal of regression |
Predict values of one variable given knowledge of another and reduce residuals |
|
|
Regression coefficient |
How much Y changes given each unit change of X |
|
|
Covariance |
How much two variables vary together |
|
|
Coefficient of alienation |
Measure of non-association, sqrt(1 - r^2): based on the proportion of variance left over once r^2 is accounted for |
|
|
Intercept |
The value of Y when X is zero |
|
|
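A minimal sketch tying the regression cards together, assuming simple least-squares regression of Y on X: the regression coefficient (slope) is the covariance divided by the variance of X, and the intercept is the predicted Y when X is zero. The function name and data are made up.

```python
from statistics import mean


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares regression line."""
    mx, my = mean(x), mean(y)
    n = len(x)
    covariance = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    variance_x = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    slope = covariance / variance_x   # change in Y per unit change in X
    intercept = my - slope * mx       # value of Y when X is zero
    return slope, intercept


# Hypothetical data lying exactly on Y = 2X + 1
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
print(fit_line(x, y))  # (2.0, 1.0)
```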
Pearson's product moment |
Two continuous variables |
|
|
Spearman's rho |
Ranked ordinal variables |
|
|
Biserial |
Continuous and an artificial dichotomous variable |
|
|
Point biserial |
Continuous and a true dichotomous variable |
|
|
Phi coefficient |
Two dichotomous variables, at least one of which must be true |
|
|
Tetrachoric correlation |
Two artificial dichotomous variables |
|
|
Standard error of estimate |
Standard deviation of the residuals |
|
|
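A rough sketch of the standard error of estimate, assuming the common convention of dividing the squared residuals by n - 2. It also computes the coefficient of alienation as sqrt(1 - r^2). The function name and data are hypothetical.

```python
import math
from statistics import mean


def regression_error_stats(x, y):
    """Return (standard error of estimate, coefficient of alienation)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual = observed Y minus predicted Y
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    see = math.sqrt(sum(e ** 2 for e in residuals) / (len(x) - 2))
    r_squared = sxy ** 2 / (sxx * syy)
    alienation = math.sqrt(1 - r_squared)
    return see, alienation


# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
see, alienation = regression_error_stats(x, y)
print(round(see, 3), round(alienation, 3))  # 0.894 0.632
```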
Shrinkage |
Amount of decrease in predictive power when a regression equation is used for a different sample than the one with which it was calculated |
|
|
Restricted range |
Loss of ability to detect a relationship when one or both variables are highly restricted in range |
|