80 Cards in this Set

  • Front
  • Back
Validity
The degree to which a test actually measures what it purports to measure.
What are the three main types of validity evidence?
1. Construct-related validity
2. Criterion-related validity
3. Content-related validity
What prerequisites exist for validity?
MUST have reliability (otherwise you're measuring error)
Define Face Validity. How does it differ from other aspects of validity?
The appearance that a measure has validity.
A superficial judgment from the test-taker
(inferential and NOT statistical)
Define Content Validity. How is it measured?
The degree to which a test or measure adequately represents the universe of items it's designed to measure.
Logical and not statistical.
Measured by expert panel (CVR)
How do construct underrepresentation and construct-irrelevant variance relate to construct validity?
Construct underrepresentation- failure to capture important components (APUSH test without questions about Civil War)
Construct-irrelevant- scores are influenced by irrelevant factors (APUSH test with questions on Glorious Revolution)
What is the content validity ratio and how is it calculated?
CVR- when >50% of panelists (normally) consider an item "essential" it has some content validity.
CVR = (Ne-(n/2))/(n/2)
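The CVR formula above can be checked with a short sketch. It assumes Ne is the number of panelists who rate the item "essential" and n is the total number of panelists; the function name and the panel sizes are hypothetical and chosen only for illustration.

```python
def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
    """Return CVR = (Ne - n/2) / (n/2) for one item."""
    if n_panelists <= 0:
        raise ValueError("n_panelists must be positive")
    if not 0 <= n_essential <= n_panelists:
        raise ValueError("n_essential must be between 0 and n_panelists")
    half = n_panelists / 2
    return (n_essential - half) / half


# Hypothetical panel of 10 experts.
print(content_validity_ratio(8, 10))   # 0.6  (more than half say "essential")
print(content_validity_ratio(5, 10))   # 0.0  (exactly half)
print(content_validity_ratio(2, 10))   # -0.6 (fewer than half)
```

The ratio is positive only when more than half of the panel rates the item essential, which matches the >50% rule on the card.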
What is a criterion? What is criterion-related validity?
Criterion- the standard against which a test is compared. (the criterion for SAT is GPA in college)
Criterion-related validity- how well scores correspond with a particular criterion.
Name and define the three subtypes of criterion-related validity.
1. Predictive- can the test predict a criterion obtained at a later time? (SAT predictor, GPA criterion)
2. Concurrent- degree to which test scores are related to criterion measured at the same time. (Job Samples, Secret Shopper)
3. Postdictive- accuracy with which a test score predicts a previously obtained criterion. (test of antisocial behavior to predict past delinquent acts)
What is the validity coefficient? What is the meaning of a squared validity coefficient?
Relationship between test and criterion. The extent to which the test is valid for making statements about the criterion.
Validity Coefficient Squared- Percentage of variation in criterion we can expect to know in advance because of our knowledge of test scores.
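As a rough illustration of the squared validity coefficient, the sketch below squares a correlation between test and criterion. The value r = 0.40 is a made-up example, not a figure from the text, and the function name is hypothetical.

```python
def variance_explained(validity_coefficient: float) -> float:
    """Square a validity coefficient (a correlation between test and criterion)."""
    if not -1.0 <= validity_coefficient <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return validity_coefficient ** 2


r = 0.40  # hypothetical test-criterion correlation
print(f"{variance_explained(r):.0%} of criterion variation accounted for")
# prints: 16% of criterion variation accounted for
```

Even a seemingly respectable coefficient leaves most of the variation in the criterion unexplained.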
What is incremental validity?
Does the test contribute over and above the other predictors?
What is a construct? What is construct-related validity?
Construct- something built by mental synthesis. Construct validity encompasses all other types of validity.
Construct-related validity- the degree to which a set of measurement operations measures hypothesized constructs.
What are the two types of evidence in construct-related validity?
1. Convergent evidence- demonstration of similarity. Measures of the same construct narrow in on the same thing. Obtained by showing that the test measures the same thing as other tests of the construct and by demonstrating the specific relationships one can expect if the test is doing its job.
2. Discriminant (divergent) evidence- demonstration of uniqueness. The test has low correlations with measures of unrelated constructs. It doesn't represent a construct other than its devised purpose.
What is the relationship between reliability and validity?
See figure on p.154.
Can have a reliable test that is not valid but not a valid test that is not reliable.
Which two types of validity are logical and not statistical? Why?
Face Validity and Content-related Validity.
Which type of validity has been referred to as the "mother of all validities", or "the big daddy" and why?
Construct-related Validity. It encompasses the other forms of validity.
What role does the relationship between examiner and test taker play?
The atmosphere created by the examiner (enhanced rapport condition) greatly influences test scores.
Familiarity can also influence test scores.
What is the relationship between test examiner race and intelligence scores?
People believe it makes a great difference, but it does not.
Why would examiner race effects be smaller on IQ tests than on other psychological tests?
The IQ test is highly standardized, so examiner race does not make a significant difference.
What is the standard for test takers who are fluent in two languages?
Give the test in their "best" language.
What are expectancy effects and who is associated with them?
Rosenthal
Data can be affected by what an experimenter expects to find.
A review of many studies showed that expectancy effects exist in _____ situations.
some, but not all (situations)
What types of situations might require the examiner to deviate from standardized testing procedures?
In extreme cases (kid running around, blind/deaf, disorders...)
What advantages and disadvantages were mentioned in lecture and text regarding computer-administered tests?
Advantages: excellence of standardization, individually tailored sequential administration, test taker not rushed, control of bias...
Disadvantages: hard to detect errors or to know whether validation is adequate, lack of personal interaction...
What subject variables can impact testing?
Test anxiety (worry, emotionality, lack of self-confidence), illness, stress, fatigue...
What are the three major problems in behavioral observation studies?
Reactivity, Drift, Expectancies
What is reactivity? How does performance change when people are not being observed or checked?
Studies have shown that accuracy and interrater agreement decrease when observers believe their work is not being checked.
What is drift? How does it relate to the contrast effect? How can drift be addressed?
Observers tend to drift from the strict rules followed in training and adopt idiosyncratic definitions of behavior.
Contrast effect: the tendency to rate the same behavior differently when observations are repeated in the same context.
Observers should be periodically retrained.
How well do people do in detecting lies?
Worse than chance.
What is the halo effect?
Ascribing positive (or negative) attributes independently of the observed behavior. (Angels- we describe them as good even though there could be bad angels)
What does a good interviewer know how to do?
Be warm, open, concerned, involved, committed and interested. Know how to keep the interaction going.
How are interviews similar to tests?
They gather data, make predictions, ask questions...
Define interpersonal influence and interpersonal attraction. How are they interrelated?
Interpersonal Influence- the degree to which one person can influence another
Interpersonal Attraction- the degree to which people share a feeling of understanding, mutual respect, similarity etc...
What types of statements should be avoided to elicit as much information as possible?
Judgmental Statements- evaluating thoughts, feelings or actions
Evaluative statements- using terms such as good, bad, terrible...
Probing statements- demanding more information than the examinee wants to provide. asking "why" questions.
Hostile responses- directs anger toward the interviewee
False Reassurance- does nothing to help the interviewee and shows the interviewer will not help.
What is the main goal of interviewing?
to gather information
What is a transitional phrase? If it fails, what responses should be used to continue the theme?
A transitional phrase keeps the interaction going with minimal effort ("yes," "I see"...)
If it fails, use verbatim playback, paraphrasing and restatement, summarizing, clarification and response, and empathy and understanding.
When should direct questions be used in an interview?
To get the interviewee back on topic, or to get specific information.
What are the advantages and disadvantages of using structured clinical interviews?
Advantages- everyone gets the same questions in the same order, offers reliability but sacrifices validity
Disadvantages- requires cooperation, relies exclusively on the participant, which makes its assumptions questionable.
What is the purpose of a mental status examination? What areas are typically covered?
Used to evaluate and screen psychosis, cognitive impairments etc...
Areas covered- memory, orientation etc...
Define general standoutishness. How does appearance play a role?
Judging on the basis of one outstanding characteristic. (biases judgment and prevents objective evaluation).
How much higher is interview reliability for structured interviews?
twice as high
What is a major criticism of structured interviews?
It does not provide a broad range of data.
What is social facilitation?
When one tends to act like the models around him/her. (If you project a mood, the interviewee will respond in kind.)
What is the largest source of error in interviews?
Judgment
What were the three independent research traditions identified by Taylor to study human intelligence?
1. Psychometric Approach
2. Information-Processing Approach
3. Cognitive Approach
Through what 3 faculties did Binet believe intelligence expressed itself? What two major concepts guided him?
Faculties: Judgment, Attention, Reasoning
2 Principles: Age Differentiation, General Mental Ability
What is age differentiation?
differentiating older from younger children by the former's greater capacities
What is mental age?
Equivalent age capabilities of a child regardless of his/her chronological age. Obtained through age differentiation.
What is general mental ability?
the total product of the various separate and distinct elements of intelligence. (sum of the parts)
What is positive manifold?
When a set of diverse ability tests are administered to large population samples, the correlations are positive. (if you're good at one, you're good at others)
Binet searched for tasks that could be completed by what percentage of children in a particular age group?
66.67%-75%
What concept did Spearman introduce? What did this concept mean?
(g)- based on positive manifold in that all tests measure g (general intelligence)
What statistical method did Spearman develop to support his notion of g?
Factor Analysis: reduces the set of variables into a smaller number of factors.
According to the gf-gc theory, what are the two basic types of intelligence? How do they differ?
Fluid intelligence- ability to think, reason and acquire new information (aptitude)
Crystallized intelligence- the knowledge and understanding we have acquired. (achievement)
How do you calculate IQ?
(mental age / chronological age) x 100
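A minimal sketch of the ratio IQ calculation, assuming both ages are expressed in the same unit (years here). The function name and the example ages are hypothetical.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ = (mental age / chronological age) x 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return mental_age / chronological_age * 100


# A hypothetical 8-year-old performing like a typical 10-year-old:
print(ratio_iq(10, 8))  # 125.0
# The same child performing like a typical 6-year-old:
print(ratio_iq(6, 8))   # 75.0
```

A child whose mental age equals his or her chronological age always scores 100 under this formula.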
What is a deviation IQ and how was it used in the Stanford-Binet scale?
A deviation IQ is based on the standard score principle (it has a fixed mean and a normal distribution): M=100, SD=15.
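The standard score idea can be sketched as a z-score rescaled to M=100, SD=15. The norm-group mean and standard deviation below are invented for illustration; real tests look scores up in published norm tables rather than applying a formula directly.

```python
def deviation_iq(raw_score: float, norm_mean: float, norm_sd: float,
                 iq_mean: float = 100.0, iq_sd: float = 15.0) -> float:
    """Rescale a raw score to the IQ metric using the age group's mean and SD."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw_score - norm_mean) / norm_sd  # distance from the age-group mean in SD units
    return iq_mean + iq_sd * z


# Hypothetical age group with a raw-score mean of 50 and SD of 10.
print(deviation_iq(60, 50, 10))  # 115.0 (one SD above the mean)
print(deviation_iq(50, 50, 10))  # 100.0 (exactly average)
print(deviation_iq(30, 50, 10))  # 70.0  (two SDs below the mean)
```

Unlike the ratio IQ, this score compares a person only with others of the same age, so it has the same meaning at every age.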
Define and differentiate between basal and ceiling.
Basal- base line (ex. you have to get 6 items in a row correct)
Ceiling- top (ex. you have to get 6 items in a row incorrect)
What factors did Wechsler focus on that those before him had not?
-took into account the non-intellective factors
-designed different tests for children and adults instead of just adapting them.
What were some of the criticisms of the Binet scale by Wechsler?
-didn't like a single score representing someone's "intelligence"
-thought it was too focused on verbal
Why is the inclusion of a point scale a significant improvement? What did a performance scale add?
Point Scale- allows items to be grouped together and allows analysis of the individual's ability in a variety of items.
Performance Scale- nonverbal measure of intelligence, standardized on the same sample, attempts to overcome biases of language, culture and education.
Know and differentiate the major functions measured by each subtest of the WAIS-IV.
Vocabulary, Similarities (in what way are an ant and a tree similar?), information (what is the speed of light?), block design, matrix reasoning, visual puzzles, arithmetic, digit span, letter-number sequencing, symbol search.
What is the age range of the Wechsler scale?
about 2 to about 91
What are the mean, standard deviation, and range for scaled score, standard scores, and index scores?
Scaled Scores: M=10, SD=3, R=1-19
Standard Scores: M=100, SD=15, R=40-150
Index Scores: M=100, SD=15, R=40-150
How are the IQ scores calculated on the Wechsler?
Raw Score--Scaled Score--Sum of Scaled Scores--Compare to Norm--Standard Score
Name the Index Scores and the purposes for each.
Verbal Comprehension Index- measure of acquired knowledge and verbal reasoning (gc)
Perceptual Reasoning Index (called Perceptual Organization Index in the WAIS-III)- measure of fluid intelligence
Working Memory Index- The information that we actively hold in our minds (not stored knowledge)
Processing Speed Index- How quickly your mind works
What is pattern analysis? What are the concerns when using such a method?
When one evaluates the large differences between subtest scaled scores. It doesn't take into account individual variability very well.
What is a hold subtest?
A subtest in which scores hold despite injury or mental illness. (good hold subtest- vocabulary)
Which subtests are most sensitive to cerebral dysfunction? Which are considered "hold" subtests?
Cerebral dysfunctions- similarities, matrix, block design
Hold subtest- vocabulary
How would you differentiate between the WAIS-IV subtests that measure crystallized intelligence from those that measure fluid intelligence?
crystallized is anything that requires previous learning
Where do traditional intelligence tests fail in the study of "normal" abilities?
They do not work well with people affected by sensory, physical, language, or social disabilities.
They do not work well at the extremes.
What are the disadvantages of alternative intelligence tests when compared to the Binet and Wechsler?
They have weaker psychometrics. The scores are not interchangeable with the Binet and Wechsler. Limits the range of functions/abilities.
What are the advantages of alternative intelligence tests when compared to the Binet and Wechsler?
Used for specific populations or purposes. Minimized verbal content. Not as reliant on learning/achievement.
What theme in relation to future intelligence do you notice about infant development tests?
It does not predict later IQ. However, the Bayley does predict later mental retardation.
Compare and contrast surveillance and screening.
Surveillance is surveying the big picture. (survey a bunch of kids to pick out which ones have _____)
Screening- once you get a hit, you screen them and figure out what those problems may be.
Which two infant development tests were discussed in class? What are some disadvantages of each of these tests?
Brazelton and Bayley
Brazelton- no norms, poor test-retest reliability, does not predict later IQ
Bayley- psychometrics break down at lower ages, does not predict later IQ
Know sensitivity, specificity, false positives and true negatives.
Sensitivity- finding those who really are sick (finding true positives, avoiding false negatives)
Specificity- correctly ruling out those who are not sick (finding true negatives, avoiding false positives)
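The usual definitions in terms of screening outcomes are sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The sketch below applies them to made-up counts for a hypothetical developmental screening of 200 children; none of the numbers come from the text.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of children who truly have the problem that the screen flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of children without the problem that the screen correctly clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical results: 50 children have the condition, 150 do not.
tp, fn = 40, 10    # flagged vs. missed among the 50 who have it
tn, fp = 135, 15   # cleared vs. wrongly flagged among the 150 who do not

print(sensitivity(tp, fn))  # 0.8
print(specificity(tn, fp))  # 0.9
```

A false positive is a child flagged by the screen who does not have the problem; a true negative is a child correctly cleared.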
What are acceptable sensitivity and specificity levels for developmental screening tests?
between 70%-80% but not above because they are inversely related.
How is a learning disability currently defined in the school systems? Is this a good method?
Achievement is 2 standard deviations below IQ. NOT a good method because it does not work at lower (and higher) extremes. (A person with an IQ of 75 would need an achievement score of 45 to be classified as having a learning disorder, which is too rare.)
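The arithmetic behind the discrepancy rule can be sketched as follows, assuming both IQ and achievement are on a standard-score scale with SD=15, so that 2 standard deviations equal 30 points. The function names are hypothetical.

```python
SD = 15              # assumed standard deviation of both scales
DISCREPANCY_SDS = 2  # discrepancy required by the rule, in SD units


def achievement_cutoff(iq: float) -> float:
    """Highest achievement score that still meets the 2-SD discrepancy."""
    return iq - DISCREPANCY_SDS * SD


def meets_discrepancy(iq: float, achievement: float) -> bool:
    """True if achievement falls at least 2 SDs below IQ."""
    return achievement <= achievement_cutoff(iq)


print(achievement_cutoff(100))    # 70
print(achievement_cutoff(75))     # 45, an extremely rare score
print(meets_discrepancy(75, 60))  # False, despite achievement well below average
```

The last line shows the problem at the low end: a student with an IQ of 75 and achievement of 60 is struggling but does not qualify under the rule.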
For what was the Woodcock-Johnson-III designed?
To define learning disabilities based on the discrepancy IQ.
Should test scores alone be used to define developmental or learning disabilities?
NO. It should be more individual (response to intervention)