140 Cards in this Set

  • Front
  • Back
The earliest recorded use of procedures resembling psychological testing is:
A) United States, circa 1850 AD
B) Rome, circa 200 BC
C) China, circa 200 BC
D) Incan Empire, circa 1400 AD
China, circa 200 BC
The first intelligence test capable of measuring a general intelligence level was:
A) Binet-Simon
B) Hermann Ebbinghaus
C) Emil Kraepelin
D) Weber-Fechner
Binet-Simon
What is the definition of the term “battery”?
A) a group of items that pertains to a single variable, arranged in order of difficulty or intensity
B) a group of several tests or subtests that are administered at one time to one person
C) a tool designed to elicit information about a person’s motivations, preferences, attitudes, interests, and opinions
D) the numerical system used to rate or to report value on some measured dimension
a group of several tests or subtests that are administered at one time to one person
Who created the first laboratory dedicated to research of a purely psychological nature?
A) James McKeen Cattell
B) Emil Kraepelin
C) Sir Francis Galton
D) Wilhelm Wundt
Wilhelm Wundt
*According to Testing Standards, “the ultimate responsibility for appropriate test use and interpretation” is primarily assigned to:
A) test authors and developers
B) test publishers
C) test score interpreters
D) test user
test user
*The Woodworth Personal Data Sheet, the first personality test, was used to screen World War I recruits who might suffer from:
A) Dyslexia
B) Attention Deficit Disorder
C) Mental illness
D) Fear of heights
Mental illness
The 1905 _________ was a series of 30 tests or tasks, varied in content and difficulty, designed mostly to assess judgment and reasoning ability irrespective of school learning.
A) Ebbinghaus Completion Test
B) Stanford-Binet Intelligence Scale
C) Scholastic Aptitude Test
D) Binet-Simon Scale
Binet-Simon Scale
The primary reason for test misuse lies in the insufficient
A) publication of tests
B) knowledge of test users
C) instruments available to test users
D) use of the test manual by the examiner
knowledge of test users
A psychological assessment differs from a psychological test in that
A) psychological testing is more complex
B) psychological assessment is objective
C) psychological assessment is longer and more unique
D) psychological testing is very expensive
psychological assessment is longer and more unique
A standardized or normative sample is:
A) a sample in which all the participants have at least one similar characteristic
B) a sample taken after a test has been completed to analyze the results of the test
C) a sample taken in order to gauge the performance of others who will later take the test
D) a sample that produces results representing the normal curve
a sample taken in order to gauge the performance of others who will later take the test
Which characteristic made the Minnesota Multiphasic Personality Inventory (MMPI) more successful than previous personality inventories?
A) many of its items had no obvious reference to psychopathological tendencies
B) it included 116 statements to which the respondent answered simply “yes” or “no”
C) it no longer used the empirical criterion keying technique
D) it was more user-friendly
many of its items had no obvious reference to psychopathological tendencies
*Standardization of psychological tests refers to measurement based on ________ and ________.
A) normal curve & repetition of results
B) normal curve & unbiased analysis
C) normative samples & uniform procedure method
D) normative samples & consistent subjects
E) normal curve & absence of open-ended questions
normative samples & uniform procedure method
*A technique, based on correlation, for reducing a large number of variables to a small set of factors is:
A) scaling
B) kurtosis
C) factor analysis
D) sampling
factor analysis
*The Rorschach Test is a type of:
A) personality inventory
B) neuropsychological test
C) thematic apperceptual technique
D) projective technique
projective technique
What came into being following the realization that intelligence is not a unitary concept and that human abilities comprise a broad range of independent factors?
A) scholastic aptitude tests
B) neuropsychological tests
C) multiple aptitude batteries
D) personality inventories
multiple aptitude batteries
Who promoted the field of eugenics, discovered the phenomena of correlation and regression, and pioneered the twin study method?
A) Alfred Binet
B) Kurt Goldstein
C) Robert Yerkes
D) Francis Galton
Francis Galton
*What is the basic definition of a battery?
A) a process of arriving at the sequencing of items
B) a group of several tests administered at one time
C) a group of items that pertains to a single variable
D) a tool designed to elicit information about a person’s motivations, preferences, and other stimuli
a group of several tests administered at one time
*Which is the primary use of psychological tests?
A) decision making
B) psychological research
C) self understanding and personal development
D) making predictions
decision making
*The SAT is based loosely on the historical model of the
A) Stanford-Binet intelligence scale
B) Woodworth Personal Data Sheet
C) Wechsler Intelligence Scale for Children
D) Army Alpha Test
Army Alpha Test
*A test whose scores can range from 0 to 100 and whose distribution has most scores in the 70s to 90s is said to be which of the following:
a. positively skewed
b. linearly distributed
c. normally distributed
d. negatively skewed
negatively skewed
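A small sketch can make the direction of skew concrete. The scores below are hypothetical, and skewness is computed as the third standardized moment using the population standard deviation, which is one common convention and an assumption here.

```python
from statistics import mean, pstdev

def skewness(scores):
    """Third standardized moment: negative when the long tail points to low scores."""
    m = mean(scores)
    sd = pstdev(scores)  # population SD (an assumption; other conventions exist)
    return sum(((x - m) / sd) ** 3 for x in scores) / len(scores)

# Hypothetical scores on a 0-100 test, bunched in the 70s-90s with a low tail.
scores = [50, 70, 80, 85, 90, 90, 95]
print(skewness(scores) < 0)  # True when the distribution is negatively skewed
```

When most test takers score near the top and only a few trail off toward the bottom, the sign comes out negative.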
*The interquartile range of a distribution is:
a. the top 25% of the distribution
b. the bottom 25% of the distribution
c. the bottom 50% of the distribution
d. the middle 50% of the distribution
the middle 50% of the distribution
If a test is written intending to measure achievement of college-level students and the distribution is negatively skewed, what should be done with the test?
a. the test should be made more difficult
b. the test should remain the same
c. the test should be made easier
d. the test should be readministered
the test should be made more difficult
*_________ refers to measures derived from sample data, while measures derived from population data are __________.
a. Parameters, statistics
b. Samples, parameters
c. Statistics, parameters
d. Constants, parameters
Statistics, parameters
*The ratio IQ (MA/CA) was not very effective for adolescents and adults because their intellectual development is far less __________ from year to year.
a. uniform
b. dynamic
c. irregular
d. measurable
uniform
What was the main problem with ratio IQ scores used with the original Stanford-Binet Intelligence Scale?
a. the math was too complicated for psychologists to compute
b. the ratio simply did not work for adolescents and adults
c. the measurement of the ratio IQ scores was considered unethical
d. there were no problems with the ratio IQ scores used in the original Stanford-Binet
the ratio simply did not work for adolescents and adults
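The problem with ratio IQ can be illustrated with a short sketch. It assumes the conventional scaling of mental age (MA) over chronological age (CA) times 100; the ages used are hypothetical.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age divided by chronological age, times 100."""
    return 100.0 * mental_age / chronological_age

# A child performing like a typical 10-year-old at age 8.
print(ratio_iq(10, 8))
# Hypothetical adult whose mental age stays at 16 while chronological age doubles.
print(ratio_iq(16, 16))
print(ratio_iq(16, 32))
```

Because mental age stops rising steadily after adolescence while chronological age keeps increasing, the ratio falls even though the person's ability has not changed.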
Through a study of 400 high school students, the College Board finds that 60% of high school students wish to attain a higher educational degree. This is an example of:
a. descriptive parameter
b. ordinal percentage
c. statistic
d. inferential parameter
statistic
What is the interquartile range of a set of data?
a. the range of all four quarters of the data
b. four times the semi-interquartile data
c. the range of the middle two quarters of data
d. half of the semi-interquartile data
the range of the middle two quarters of data
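As a rough illustration, the interquartile range can be computed as the distance between the first and third quartiles. The sketch uses Python's `statistics.quantiles` with the "inclusive" method; quartile conventions differ between textbooks and software, so that choice is an assumption.

```python
from statistics import quantiles

def interquartile_range(scores):
    """Distance between the first and third quartiles (the middle 50% of scores)."""
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    return q3 - q1

scores = list(range(1, 10))  # hypothetical data: 1, 2, ..., 9
iqr = interquartile_range(scores)
print(iqr, iqr / 2)  # interquartile range and semi-interquartile range
```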
*The distance between a value and the mean of a distribution, expressed in terms of the standard deviation is represented by:
a. Pearson r
b. Median
c. Z-score
d. Correlation coefficient
Z-score
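A minimal sketch of the z-score calculation, assuming the population standard deviation of the reference scores is used (a sample SD would give slightly different values). The scores are hypothetical.

```python
from statistics import mean, pstdev

def z_score(raw, reference_scores):
    """Distance of a raw score from the reference mean, in SD units."""
    return (raw - mean(reference_scores)) / pstdev(reference_scores)

reference = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores: mean 5, SD 2
print(z_score(9, reference))  # two standard deviations above the mean
```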
A percentile score is an example of which type of scale?
a. ratio
b. ordinal
c. interval
d. nominal
ordinal
*Which measure of central tendency is useful when dealing with qualitative or categorical variables?
a. mean
b. median
c. mode
d. range
mode
*A characteristic of the stanine scale is:
a. it is complex
b. it is expensive
c. it lacks precision
d. it is not time efficient
it lacks precision
Alternate forms, anchor tests, fixed reference groups and simultaneous norming are all types of:
a. nonlinear transformations
b. equating procedures
c. item response testing
d. performance assessment
equating procedures
Local norms are characterized by:
a. groups formed in terms of age, sex, ethnicity, or any other variable that may significantly impact test scores
b. reference groups drawn from members of a specific, more narrowly defined population or institution
c. groups of people who simply happen to be available at the time the test is being constructed
d. subgroups formed after a test has been standardized and published, to expand the test’s applicability
reference groups drawn from members of a specific, more narrowly defined population or institution
If a fifth grader scores at the eighth grade level in arithmetic, it means that:
a. the student’s score is significantly above the average for fifth graders in arithmetic
b. the student has mastered eighth grade arithmetic
c. the same as if a first grader scored at the fourth grade level, or if a ninth grader scored at the twelfth grade level
d. the grade level standards are too low
the student’s score is significantly above the average for fifth graders in arithmetic
Which of the following demonstrates the difference between percentiles and percentage scores?
a. percentiles reflect an individual’s number of correct responses, while percentage scores reflect the individual’s rank
b. the frame of reference for percentiles is the content of the entire test, while the frame of reference for percentage scores is other people
c. percentiles reflect an individual’s rank in reference to other people, while percentage scores reflect an individual’s performance in reference to the entire test
d. percentiles represent qualitative data, while percentage scores represent quantitative data
percentiles reflect an individual’s rank in reference to other people, while percentage scores reflect an individual’s performance in reference to the entire test
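The contrast can be sketched in a few lines. Here percentile rank is taken as the percentage of the reference group scoring below the given score, which is one of several definitions in use and an assumption of this example; all numbers are hypothetical.

```python
def percentage_correct(num_correct, num_items):
    """Performance relative to the test content."""
    return 100.0 * num_correct / num_items

def percentile_rank(score, reference_scores):
    """Standing relative to other people: percent of the group scoring below."""
    below = sum(1 for s in reference_scores if s < score)
    return 100.0 * below / len(reference_scores)

# Hypothetical 50-item test and a five-person reference group.
print(percentage_correct(40, 50))
print(percentile_rank(40, [10, 20, 30, 40, 50]))
```

The same raw score of 40 yields different numbers because the two scores answer different questions: how much of the test was answered correctly versus how the person ranks among others.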
*Which statement clearly distinguishes between three terms that are often used interchangeably?
a. “reference group” identifies a more specific group of test subjects than a “standardization sample” or a “normative sample”
b. “standardization sample” is the first group to receive the test, whereas the “normative sample” is any group from which norms are gathered
c. “reference group” may be a “standardization sample” but cannot be a “normative sample”
d. “reference group” may be a “normative sample” but cannot be a “standardization sample”
“standardization sample” is the first group to receive the test, whereas the “normative sample” is any group from which norms are gathered
The changing of the reference group standards for the College Board’s SAT score scale is called:
a. anchor testing
b. variating
c. equivocating
d. re-centering
re-centering
*The higher level of performance typically seen in the normative groups of newer versions of general intelligence tests compared to their older counterparts is known as the:
a. deviation from the mean
b. Flynn effect
c. Standard deviation
d. Correlational effect
Flynn effect
*The foremost requirement of the normative sample is:
a. to be large enough to ensure stability of variables
b. to be recent
c. to have the demographic makeup of the nation’s population
d. to be representative of the individuals to be tested
to be representative of the individuals to be tested
Behavioral sequences:
a. can be converted into nominal scales
b. cannot be used normatively
c. depend on an orderly progression from one state to another
d. can only be based on chronological age
depend on an orderly progression from one state to another
*Deviation IQs were first introduced in 1939 for use in the:
a. Otis-Lennon School Ability Test
b. SAT
c. Wechsler-Bellevue
d. Kaufman Adolescent and Adult Intelligence Test
Wechsler-Bellevue
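A deviation IQ is a standard score rather than a ratio. The sketch below assumes the Wechsler convention of a mean of 100 and a standard deviation of 15; other tests have used different standard deviations.

```python
def deviation_iq(z, mean_iq=100.0, sd_iq=15.0):
    """Convert a z-score (relative to same-age peers) to a deviation IQ."""
    return mean_iq + sd_iq * z

print(deviation_iq(0.0))   # exactly average for the age group
print(deviation_iq(1.0))   # one SD above the age-group mean
print(deviation_iq(-2.0))  # two SDs below the age-group mean
```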
A ________ expresses the distance between a raw score and the mean of the reference group in terms of the standard deviation of the reference group.
a. t-score
b. z-score
c. percentile score
d. grade-equivalent score
z-score
*If a person scores lower than any of the people in the normative sample, the problem is one of:
a. insufficient test ceiling
b. the test was too easy for the individual
c. overly large normative sample
d. insufficient test floor
insufficient test floor
*Which is used when a score distribution approximates but does not quite match the normal distribution?
a. linear transformation
b. nonlinear transformation
c. normalized standard scores
d. stanines
normalized standard scores
If a test taker reaches the test ceiling on a test, then:
a. the test taker is labeled a genius
b. the test taker must retake the test
c. the test is insufficient
d. the test was wrongly scored
the test is insufficient
*The Gesell Developmental Schedules and the Infant-Toddler Developmental Assessment have this in common:
a. they were both developed by Arnold Gesell
b. both were tested and edited at Yale
c. they both use ordinal scaling
d. they derived from Piaget’s stages of development
they both use ordinal scaling
*Reliability of scores _________ as the error component __________.
a. decreases; remains constant
b. remains constant; increases
c. increases; decreases
d. decreases; decreases
increases; decreases
What two things does reliability in measurement imply?
a. consistency and precision
b. consistency and relatedness
c. precision and relatedness
d. consistency and validity
consistency and precision
*Evidence of score reliability is __________ validity.
a. unrelated to
b. sufficient for
c. necessary and sufficient for
d. necessary but not sufficient for
necessary but not sufficient for
*Traits are considered ________ characteristics, while states are referred to as ________.
a. stable; enduring
b. temporary; static
c. stable; temporary
d. temporary; shortlived
stable; temporary
The Spearman-Brown formula is related to the idea that:
a. a larger number of observations yields more reliable results
b. reliable results do not rely on the number of observations
c. a smaller set of observations is quicker to make
d. true scores are derived from long tests
a larger number of observations yields more reliable results
*The KR-20 or alpha coefficients are good indicators of _________ in a test.
a. homogeneity
b. spiral-omnibus formats
c. heterogeneity
d. alternate forms
homogeneity
True scores are:
a. equivalent to the test taker’s observed score
b. the observed score subtracted from the raw score
c. hypothetical scores that would result from error-free measurement
d. normative sample scores of the given distribution
hypothetical scores that would result from error-free measurement
The test-retest reliability coefficient tells us:
a. extent to which scores will fluctuate as a result of time sampling error
b. extent to which scores will fluctuate as a result of scorer reliability
c. reliability of the interitem inconsistency and content heterogeneity combined
d. reliability of the content heterogeneity by itself
extent to which scores will fluctuate as a result of time sampling error
Low reliability estimates suggest that:
a. the test is too short
b. the test is too long
c. not enough data from the normative sample was analyzed
d. the test is not very trustworthy
the test is not very trustworthy
*Theoretically, if an individual took the same test an infinite number of times, his/her mean score would represent his/her:
a. true score
b. reliability coefficient
c. error component
d. observed score
true score
Reliability and error component are:
a. not related at all
b. positively related
c. inversely related
d. negatively related
inversely related
_________ is used in determining the consistency of mental tests, that is, the repeatability of their results. It evaluates sources of error and the sizes of those errors.
a. Measurement error
b. The reliability coefficient
c. The true score
d. Heterogeneity
The reliability coefficient
The phrase “all other things being equal” should serve to:
a. alert the reader to the possibility that several other things do need to be considered besides the specific concept in question
b. show that all the aspects of a certain concept are in fact equal
c. alert the reader that there may be items in a particular concept that are not equal
d. show the reader that there is no need to consider other aspects related to a concept because they are all the same
alert the reader to the possibility that several other things do need to be considered besides the specific concept in question
Three typical sources of measurement error that affect reliability include:
a. time sampling error, inconsistency, alternate form
b. homogeneity, time sampling error, off balancing
c. content sampling error, performance error, group diversity
d. interscorer difficulties, time sampling error, content sampling error
interscorer difficulties, time sampling error, content sampling error
If all the test score variance were true variance, score reliability would be:
a. 1.00
b. -1.00
c. 100
d. -100
1.00
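In classical test theory, reliability can be expressed as the proportion of observed score variance that is true variance. The sketch below uses hypothetical variance values and assumes true and error variance simply add.

```python
def reliability(true_variance, error_variance):
    """Proportion of total (true + error) variance that is true variance."""
    return true_variance / (true_variance + error_variance)

print(reliability(10.0, 0.0))  # no error variance at all
print(reliability(8.0, 2.0))   # some error variance lowers the coefficient
```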
*What is one of the most frequently used formulas to calculate interitem consistency?
a. Cronbach’s Alpha
b. Pearson R formula
c. Spearman-Brown formula
d. Standard Error of Measurement formula
Cronbach’s Alpha
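A minimal sketch of coefficient alpha, using the standard formula k/(k-1) times (1 minus the sum of item variances over the variance of total scores). The data layout (one row of item scores per test taker) and the scores themselves are assumptions for illustration.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per test taker."""
    k = len(rows[0])                                    # number of items
    item_vars = [variance(col) for col in zip(*rows)]   # variance of each item
    total_var = variance([sum(row) for row in rows])    # variance of total scores
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: three items that rank the test takers identically.
rows = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(cronbach_alpha(rows))
```

Items that behave identically across test takers push alpha toward its maximum, which is why a high value is read as a sign of homogeneity.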
If an alternate form of a test is given shortly after taking the original form, there is likely to be:
a. heightened reliability
b. test-retest reliability
c. significant practice effects
d. the Flynn effect
significant practice effects
*The standard error of measurement (SEM) of Test A is 3 and the SEM of Test B is 5. If you wanted to compare these scores, the standard error of the difference would be:
a. 34
b. √34
c. more than 34
d. √8
e. 15
√34
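The arithmetic behind this answer can be checked with a short sketch, assuming the usual formula in which the standard error of the difference is the square root of the sum of the two squared SEMs.

```python
import math

def se_difference(sem_a, sem_b):
    """Standard error of the difference between two scores."""
    return math.sqrt(sem_a ** 2 + sem_b ** 2)

# SEMs from the card above: Test A = 3, Test B = 5, so 9 + 25 = 34 under the root.
print(se_difference(3, 5))
```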
Item 1: 2 x 8 = ____
Item 2: 5 x 6 = ____
Item 3: 4 x 10 = ____

This problem set can be described as:
a. a low coefficient alpha
b. a low interitem consistency
c. very heterogeneous
d. very homogeneous
very homogeneous
The most appropriate measure used to estimate error for tests scored with a degree of subjectivity would be:
a. scorer reliability
b. test-retest reliability
c. alternate form reliability
d. delayed alternate form reliability
scorer reliability
Which of the following allows for the evaluation of the interaction effects from different types of error sources?
a. heterogeneity theory
b. internal conflict theory
c. standard error theory
d. generalizability theory
generalizability theory
Which of the following is true of a reliability coefficient?
a. the higher the coefficient the better
b. test users must have a coefficient of .85 or higher
c. test users must have a coefficient of .65 or higher
d. there is a minimum threshold for a reliability coefficient to be considered adequate for all purposes
the higher the coefficient the better
Which of the following methods for estimating score reliability is prone to practice effects?
a. split half technique
b. Cronbach’s alpha
c. Alternate form reliability
d. Interval methods
Alternate form reliability
When a test is purposefully designed to include items that are diverse in terms of one or more dimensions, KR-20 and coefficient alpha will __________.
a. underestimate content sampling error
b. overestimate content sampling error
c. round content sampling error to the nearest tenth
d. most accurately calculate content sampling error
overestimate content sampling error
In test score data obtained from a large sample under standardized conditions, measurement error:
a. is eliminated and is no longer an issue
b. is assumed to be distributed at random
c. is more likely to influence scores in a positive direction than a negative one
d. is the same across samples, regardless of their composition and testing circumstances
is assumed to be distributed at random
*If the class’s scores on a reading comprehension test varied due to individual familiarity of some passages, the most useful procedure for estimating this error would be:
a. scorer reliability
b. test-retest reliability
c. alternate form reliability
d. true score reliability
alternate form reliability
Which of the following best demonstrates the benefits of delayed alternate form reliability?
a. it eliminates the confounding variable of practice effects that are problematic with coefficient alpha
b. it yields the same results, regardless of the duration between the two test administrations
c. it estimates the effect that lengthening or shortening a test will have on the obtained reliability coefficient
d. it provides a good method for estimating time and content sampling error with a single coefficient
it provides a good method for estimating time and content sampling error with a single coefficient
Which of the following is a true statement about the evaluation of reliability data?
a. small differences in the magnitude of coefficients of different tests are greatly significant
b. reliability estimates above 0.50 suggest that the scores derived from a test are trustworthy
c. estimates of error may or may not generalize to groups of test takers other than the original sample
d. if a test is comprised of subtests, reliability estimates for the total test are the same as those of each subtest
estimates of error may or may not generalize to groups of test takers other than the original sample
________ is a statistic that represents the standard deviation of the hypothetical distribution if a subject were to take a test an infinite number of times.
a. standard error of the mean
b. standard error of measurement
c. standard error of the variance
d. standard error of the difference between two scores
standard error of measurement
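The standard error of measurement is commonly estimated from the test's standard deviation and reliability coefficient as SD times the square root of (1 minus reliability). The sketch below applies that formula to hypothetical values.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1.0 - reliability)

# Hypothetical test with SD = 15 and a reliability coefficient of .91.
print(standard_error_of_measurement(15, 0.91))
```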
Score reliability is considered to be a necessary, but not sufficient, condition for:
a. validity
b. accuracy
c. recency
d. significance
validity
A major disadvantage of G theory is:
a. it is more comprehensive and thus less accurate
b. it is overly used by the psychological testing population today
c. it requires multiple observations from the same group
d. it does not allow for the evaluation of interaction effects
it requires multiple observations from the same group
_________ are hypotheticals and do not really exist.
a. True scores
b. Observed scores
c. Error score components
d. Raw scores
True scores
The IRT model emphasizes the ____________.
a. use for small scale testing
b. use of the whole test for reliability of error measurement
c. less precise responses by test takers
d. use of the individual test items for reliability and error measurement
use of the individual test items for reliability and error measurement
Which of the following formulas is based on the idea that larger samples yield more reliable results, and is applied to r_hh (the correlation between the two halves) to obtain a reliability estimate for the full test in the split-half method?
a. Cronbach’s alpha
b. Kuder-Richardson formula
c. Spearman-Brown formula
d. Standard error of measurement formula
Spearman-Brown formula
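A minimal sketch of the Spearman-Brown formula in its general form, where n is the factor by which the test is lengthened; n = 2 gives the split-half correction. The half-test correlation used is hypothetical.

```python
def spearman_brown(r, n=2):
    """Estimated reliability of a test n times as long as one with reliability r."""
    return (n * r) / (1 + (n - 1) * r)

r_hh = 0.6  # hypothetical correlation between the two halves
print(spearman_brown(r_hh))       # estimate for the full-length test
print(spearman_brown(r_hh, n=4))  # lengthening further raises the estimate again
```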
Time sampling error is the most likely to occur when measuring:
a. verbal ability
b. personality traits
c. personality states
d. psychological constructs related to ability
personality states
Which is true about Generalizability Theory?
a. it is often applied to developing new instruments
b. it requires multiple observations of the same group
c. it is not often used because it does not evaluate interaction effects of different error sources
d. it is used to measure test reliability by comparing several tests
it requires multiple observations of the same group
If the scoring of a test involves subjective judgment:
a. an estimate of time sampling error is essential
b. the availability of alternate forms of the test is necessary
c. test selection decisions must be made on a case by case basis
d. scorer reliability must be taken into account
scorer reliability must be taken into account
*With item response theory methods, reliability and measurement error are approached from the point of view of:
a. information function of individual test items
b. the test as a whole
c. the trait assessed by the test
d. the standard error of measurement formula that creates a confidence interval
information function of individual test items
*Delayed alternate form reliability coefficients can be used to evaluate ______ and ______ reliability.
a. interitem; content
b. content; time
c. time; test-retest
d. interscorer; interitem
content; time
To avoid administering the same test twice or developing alternate forms, ________ reliability can be used to test content consistency.
a. delayed alternate form
b. interrater
c. split half
d. test-retest
split half
Statistically significant differences may not necessarily be:
a. reliable
b. representative of true scores
c. psychologically significant
d. valid
psychologically significant
*When comparing a computer adaptive test (CAT) with a traditional test using item response theory:
a. the CAT is less reliable than a traditional test
b. the CAT can be shorter than the traditional test while still remaining reliable
c. a traditional test is more reliable and therefore more valid
d. a traditional test is more reliable because it eliminates the error involved with technology
the CAT can be shorter than the traditional test while still remaining reliable
What is the major complaint about the WISC-III as a revision of the WISC-R?
a. The changes were mostly cosmetic and did not reorganize the test theoretically.
b. The changes departed too dramatically from the WISC-R making the WISC-III nearly unrecognizable.
c. The changes caused the reliability of the test to decrease.
d. There were actually no complaints about the WISC-III.
The changes were mostly cosmetic and did not reorganize the test theoretically.
Which of the following was not used to assess the reliability of the WIAT-II?
a. inter-rater reliability
b. test-retest reliability
c. parallel forms reliability
d. internal consistency reliability
parallel forms reliability
*A small standard error of measurement for the mathematics subtests of the WIAT-II implies:
a. smaller confidence intervals and lower reliability
b. smaller confidence intervals and higher reliability
c. larger confidence intervals and lower reliability
d. larger confidence intervals and higher reliability
smaller confidence intervals and higher reliability
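The link between the standard error of measurement and confidence intervals can be sketched as follows. The example assumes a 95% interval built as the observed score plus or minus 1.96 SEMs; the observed score and SEM values are hypothetical.

```python
def confidence_interval(observed, sem, z=1.96):
    """Band around an observed score; z = 1.96 assumes a 95% interval."""
    margin = z * sem
    return (observed - margin, observed + margin)

# The same observed score with a small and a large SEM.
narrow = confidence_interval(100, 3)
wide = confidence_interval(100, 6)
print(narrow, wide)
```

Halving the SEM halves the width of the band, which is why a smaller SEM goes together with higher score reliability.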
Luria’s main focus was to:
a. distinguish between the three blocks
b. show the functions that can be divided into the three blocks
c. show the integration and interdependence of the three blocks
d. create a one-to-one mapping of the brain
show the integration and interdependence of the three blocks
How does the Luria model test children from different ethnicities?
a. Uses three blocks to map the brain
b. Excludes tests of acquired knowledge
c. Has many subtests
d. Includes tests of acquired knowledge
Excludes tests of acquired knowledge
*Which of the following KABC-II scales would be used to test a four year old who is deaf?
a. Knowledge
b. Planning
c. Learning
d. CHC
Learning
*What two types of tests can’t have their reliability calculated by split-half procedures?
a. spelling of sounds and punctuation
b. punctuation and compatibility
c. speeded tests and multiple point scored items
d. language and math tests
speeded tests and multiple point scored items
Which measure on the WJ-III Tests of Achievement requires the examinee, within a three-minute period, to read and comprehend simple sentences and then decide if the answer is true or false?
a. Reading Comprehension
b. Letter-Word Identification
c. Reading Fluency
d. Passage Comprehension
Reading Fluency
The WIAT-II test Math Reasoning evaluates the ability to
a. use nonverbal reasoning skills to solve abstract visual problems
b. solve single and multi-step math word problems
c. solve problems involving basic operations
d. complete a worksheet-like set of math problems of increasing difficulty
solve single and multi-step math word problems
*A distinctive feature of the Stanford-Binet 5th edition is the addition of
a. age-graded norms
b. non-verbal routing test
c. deviation IQ
d. composite scores for each subtest
non-verbal routing test
The first edition/model of the Stanford-Binet to introduce the new form L-M was the

a. SB 3rd Edition
b. SB 4th Edition
c. SB 5th Edition
d. Revision of Terman’s Scale in 1937
SB 3rd Edition
The SB5 is the first intellectual battery to

a. use a deviation IQ
b. use a routing method
c. cover 5 cognitive factors
d. use the point-scale format
cover 5 cognitive factors
*David Wechsler’s main focus in creating the Wechsler-Bellevue Intelligence Scale in 1939 was to

a. provide an extensive psychological test for adults entering the military during WW II.
b. design an instrument that would evaluate a single characteristic.
c. go beyond global IQ scores to interpret more specific aspects of an individual’s cognitive capabilities through the analysis of subtest scaled scores.
d. provide an alternative battery for testing children ages 2-18.
go beyond global IQ scores to interpret more specific aspects of an individual’s cognitive capabilities through the analysis of subtest scaled scores.
Of the following, which is not one of the broad cognitive areas that tests and clusters of the Woodcock-Johnson-III Cognitive Tests are grouped into?
a. cognitive efficiency
b. thinking ability
c. written expression
d. verbal ability
written expression
Auditory Processing in the Woodcock-Johnson-III Cognitive Tests refers to:
a. measures of processing speed of auditory stimuli
b. the ability to discriminate between similar sounding words
c. the ability to analyze, synthesize, and discriminate auditory stimuli
d. the ability to break a given whole word down into its phonemes
the ability to analyze, synthesize, and discriminate auditory stimuli
Which of the following is not included in measures of auditory processing in the Woodcock-Johnson-III?
a. the ability to process distorted speech sounds
b. the time it takes to translate individual phonemes into whole words
c. phonetic coding
d. sound blending
the time it takes to translate individual phonemes into whole words
Which of the following would be used to determine the reliability of speeded tests and tests with multiple-point scoring systems?
a. split-half procedures
b. Spearman-Brown formula
c. Rasch analysis
d. Standard error of measurement formula
Rasch analysis
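For contrast with the card above, here is a minimal sketch of the Spearman-Brown formula, which estimates full-length reliability from the correlation between two test halves. The half-test correlation used is a hypothetical value, and the `factor` parameter is an illustrative generalization to other length changes.

```python
def spearman_brown(r_half: float, factor: float = 2.0) -> float:
    """Predicted reliability when test length is multiplied by `factor`.

    With factor = 2 this steps a split-half correlation up to the
    reliability of the full-length test.
    """
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)

# Hypothetical correlation between odd-item and even-item half scores
r_half = 0.60
print(f"split-half r = {r_half:.2f}, corrected = {spearman_brown(r_half):.2f}")
```

The correction rests on the two halves being comparable, which is why split-half estimates are usually avoided for speeded tests.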
Mean score & SD for T
m=50
sd=10
Mean score & SD for z
m=0
sd=1
Mean score & SD for CEEB
m=500
sd=100
Mean score & SD for IQ
m=100
sd=15
Mean score & SD for subtest scaled scores
m=10
sd=3
How to convert to a z score
(score - mean) divided by SD

z = (X - Xbar) / SD
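The conversions on the cards above can be sketched in a few lines: a raw score becomes a z score, and a z score maps onto any other scale as mean + z * SD. The dictionary keys are illustrative names, and "subtest" assumes that "sub" on the card means subtest scaled scores.

```python
# (mean, SD) for each standard-score scale listed on the cards
SCALES = {
    "z": (0, 1),
    "T": (50, 10),
    "CEEB": (500, 100),
    "IQ": (100, 15),
    "subtest": (10, 3),  # assumption: "sub" = subtest scaled score
}

def to_z(x: float, mean: float, sd: float) -> float:
    """z = (X - Xbar) / SD"""
    return (x - mean) / sd

def from_z(z: float, scale: str) -> float:
    """Re-express a z score on the named scale."""
    mean, sd = SCALES[scale]
    return mean + z * sd

z = to_z(115, 100, 15)  # an IQ of 115 is one SD above the mean
for name in SCALES:
    print(f"{name}: {from_z(z, name)}")
```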
Explain percentiles
look up
Explain confidence intervals
look up
Explain SEM
look up
Explain percentages under a curve
look up
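As a rough aid for the percentile and area-under-the-curve cards, the sketch below computes both from the standard normal distribution. It assumes scores are normally distributed; the function names are illustrative.

```python
import math

def normal_cdf(z: float) -> float:
    """Proportion of a standard normal distribution falling at or below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def percentile_rank(z: float) -> float:
    """Percentage of scores at or below the given z score."""
    return 100.0 * normal_cdf(z)

def percent_within(k: float) -> float:
    """Percentage of scores within k standard deviations of the mean."""
    return 100.0 * (normal_cdf(k) - normal_cdf(-k))

for k in (1, 2, 3):
    print(f"within ±{k} SD: {percent_within(k):.1f}%")  # 68.3%, 95.4%, 99.7%

print(f"percentile rank at z = +1: {percentile_rank(1.0):.0f}")  # 84
```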
*Any errors that occur when measuring a discrete variable are due to _____.

a. measurement error
b. bias on the part of the administrator of test
c. an incorrect sample size
d. inaccurate counting
inaccurate counting
*Which of the following is true of correlation?

a. sign of the coefficient indicates the degree of relationship between 2 variables
b. high correlation implies a causal relationship between 2 variables
c. high correlation allows us to make inferences about variables' shared variance
d. the closer a correlation is to zero, the higher the degree of relationship between two variables
high correlation allows us to make inferences about variables' shared variance
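Shared variance is usually expressed as the squared correlation (the coefficient of determination). A minimal sketch with hypothetical coefficients:

```python
def shared_variance(r: float) -> float:
    """Proportion of variance two variables share: r squared."""
    return r * r

# Hypothetical coefficients; the sign gives direction, the magnitude gives strength
for r in (0.30, 0.50, -0.80):
    print(f"r = {r:+.2f}  shared variance = {shared_variance(r):.0%}")
```

A correlation of -0.80 shares as much variance as +0.80, since the sign only indicates the direction of the relationship.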
*A raw score by itself:

a. can convey meaning
b. does not convey any meaning
c. can be used to make inferences about a construct
d. can give a percentile score
does not convey any meaning
*If an 8th grader scores at the 6th grade level on a reading achievement test, what does this mean? The student:

a. scored lower on the test than everyone else in his/her 8th grade class
b. scored within the same range as most of the 8th graders in his/her school
c. got a score that matches the average performance of the 6th graders in the standardization sample
d. got a score that matches the average performance of the 6th grade population that was tested that year
got a score that matches the average performance of the 6th graders in the standardization sample
*Michael's score on a test is 60. The standard error of measurement of the test is three points. From this information, one may conclude that chances are about

a. 1 in 2 that his true score is included by the range of scores from 54 to 66
b. 99 out of 100 that his true score is included by the range of scores from 54 to 66
c. 50 out of 100 that his true score is either 59, 60, or 61
d. 50-50 that the error is less than 5 points
i dunno...guess
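One way to reason about this item is to build bands of whole SEMs around the observed score. The sketch below assumes measurement error is normally distributed; under that assumption the range 54 to 66 is 60 plus or minus two SEMs, which covers roughly 95 in 100.

```python
from statistics import NormalDist

def band(score: float, sem: float, k: float):
    """Score range spanning k SEMs on either side of the observed score."""
    return (score - k * sem, score + k * sem)

def coverage(k: float) -> float:
    """Probability that normally distributed error stays within k SEMs."""
    standard_normal = NormalDist()
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

score, sem = 60, 3
for k in (1, 2, 3):
    low, high = band(score, sem, k)
    print(f"±{k} SEM: {low} to {high}  ({coverage(k):.1%})")
```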
*In a normal distribution, a score is one standard deviation above the mean. What is its approximate percentile rank?

a. 50
b. 75
c. 84
d. 95
e. 99
84
*What percentage of z-scores fall between -3.0 and 3.0 standard deviations?

a. 50%
b. 75%
c. 95%
d. 99%
99%
*The mean of a test is 39. You get a 44 and find it is equivalent to a T score of 65. What is the standard deviation of the test?

a. 2
b. 4
c. 6
d. 10
e. 15
4
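The arithmetic behind this item can be sketched by turning the T score into a z score and then solving z = (X - mean) / SD for SD. With the numbers as given the result is about 3.3, for which 4 is the nearest listed option; a raw score of 45 would give exactly 4.

```python
def z_from_t(t: float) -> float:
    """T scores have mean 50 and SD 10."""
    return (t - 50) / 10

def sd_from_scores(raw: float, mean: float, t: float) -> float:
    """Solve z = (raw - mean) / SD for SD, using the z implied by the T score."""
    return (raw - mean) / z_from_t(t)

print(z_from_t(65))                # 1.5
print(sd_from_scores(44, 39, 65))  # about 3.33
print(sd_from_scores(45, 39, 65))  # 4.0
```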
*Restriction of range results in _____ correlation coefficients.

a. no effect on
b. higher
c. lower
d. higher weights for
lower
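The effect of range restriction can be illustrated with a small simulation. Everything here is a hypothetical setup: simulated normal data with a true correlation of 0.7, then the same data limited to cases above a cutoff on x.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate(n=5000, rho=0.7, cutoff=1.0, seed=0):
    """Return (full-range r, restricted-range r) for simulated data."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        # y correlates with x at about rho in the full population
        y = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
    full = pearson_r(xs, ys)
    # Keep only cases above the cutoff on x (for example, admitted applicants)
    kept = [(x, y) for x, y in zip(xs, ys) if x > cutoff]
    restricted = pearson_r([x for x, _ in kept], [y for _, y in kept])
    return full, restricted

full, restricted = simulate()
print(f"full range r = {full:.2f}, restricted range r = {restricted:.2f}")
```

Selecting only high scorers shrinks the variance of x, so the observed correlation drops even though the underlying relationship is unchanged.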
*On the Wechsler intelligence tests, at each age level, approximately 68% of those tested will have IQ scores between

a. 90 and 110
b. 85 and 115
c. 70 and 130
d. 55 and 145
85 and 115