150 Cards in this Set
- Front
- Back
Accommodation
|
adaptation (to a purpose); adjustment, adaptation for test taker
|
|
Achievement test
|
type of ability test that measures what one has learned. (e.g. survey battery tests, diagnostic test, readiness tests)
|
|
Age equivalent
|
a measure of a person's ability, skill, or knowledge, expressed in terms of the age at which the average person attains that level of performance
|
|
Aptitude
|
A combination of characteristics, whether native or acquired, that are indicative of an individual’s ability to learn or to develop proficiency in some particular area if appropriate education or training is provided
|
|
Aptitude test
|
a test that measures what one is capable of learning (e.g. intelligence tests, cognitive ability tests)
|
|
Age Norms
|
The distribution of test scores by age of test takers
|
|
Anecdotal Data
|
generally includes behaviors of an individual that are consistent or inconsistent; may assist in the assessment process
|
|
Bias
|
systematic error introduced into sampling or testing by selecting or encouraging one outcome or answer over others
|
|
Ceiling
|
the upper limit of a test; the highest score or level of performance the instrument can measure
|
|
Classical test theory
|
the assumption that every test instrument has measurement error and that the true score falls between the observed score plus or minus the measurement error
|
|
Composite score
|
a single score that combines the scores from several subtests; for example, the average of four test scores, rounded to the nearest whole number
|
|
Confidence interval
|
a group of continuous or discrete adjacent values that is used to estimate a statistical parameter (as a mean or variance) and that tends to include the true value of the parameter a predetermined proportion of the time if the process of finding the group of values is repeated a number of times
|
|
Construct
|
an abstract, hypothetical trait or attribute (e.g., intelligence or anxiety) that a test is designed to measure
|
|
Correlation
|
The degree of relationship between two sets of scores
|
|
Correlation coefficient
|
a numerical index, ranging from -1.00 to +1.00, of the strength and direction of the relationship between two sets of scores
|
|
Criterion-referenced assessment (competency-based assessment)
|
tests designed to provide information about the specific knowledge or skills possessed by a student; a method of scoring in which test scores are compared to a predetermined value or a set of criteria
|
|
Criterion-Referenced (Content-Referenced) Test
|
describe tests that are designed to provide information about the specific knowledge or skills possessed by a student. Such tests usually cover relatively small units of content and are closely related to instruction. Their scores have meaning in terms of what the student knows or can do, rather than in (or in addition to) their relation to the scores made by some norm group
|
|
Cronbach’s coefficient Alpha
|
method of measuring internal consistency by calculating test reliability using all possible split half combinations
|
|
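A minimal sketch of how coefficient alpha is commonly computed in practice. Rather than enumerating every split-half combination, the usual shortcut works from the item variances and the variance of the total score. The function name and the sample data layout (one row per test taker, one column per item) are assumptions made for illustration.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Coefficient alpha for a table of scores.

    item_scores: list of rows, one row per test taker,
    each row a list with one score per item.
    """
    k = len(item_scores[0])  # number of items
    # Variance of each item (column) across test takers.
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of the total (summed) score across test takers.
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: three test takers, three items that agree perfectly.
alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
```

When every item ranks the test takers identically, as in the hypothetical data, alpha reaches its maximum of 1; less consistent items pull it toward 0.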
Culture-fair test
|
a test designed to minimize cultural bias, so that performance depends as little as possible on knowledge or conventions specific to one culture
|
|
Cultural Bias
|
when people of a culture make assumptions about conventions, then mistake these assumptions for laws of logic or nature. (In intelligence testing: test items that rest on the assumptions of one culture and so disadvantage test takers from other cultures)
|
|
Cut score
|
used to determine the minimum performance level needed to pass a competency test; a specified point on a score scale, such that scores at or above that point are interpreted or acted upon differently from scores below that point
|
|
Demographics
|
the statistical data of a population, esp. those showing average age, income, education, etc.
|
|
Deviation IQ (DIQ)
|
An age-based index of general mental ability. It is based on the difference between a person’s score and the average score for persons of the same chronological age. Standard score with a mean of 100 and a SD of 15
|
|
Deviation Score
|
The score for an individual minus the mean score for the group; i.e., the amount a person deviates from the mean
|
|
Decile
|
any one of nine numbers that divide a frequency distribution into 10 classes such that each contains the same number of individuals
|
|
Dependent variable
|
a mathematical variable whose value is determined by that of one or more other variables in a function; what is measured
|
|
Derived score
|
score obtained by comparing an individual’s score to the norm group by converting his score to a percentile or standard score. E.g. IQ scores, stanine scores, sten scores, T scores, and z scores (applying a mathematical computation to raw score)
|
|
Disaggregate
|
to separate into component parts
|
|
Diagnostic test
|
test that assesses problem areas of learning. Often used to assess learning disabilities. (a type of achievement test)
|
|
Discrimination parameter
|
in item response theory, the slope of the item characteristic curve; it indicates how sharply an item distinguishes between test takers of lower and higher ability
|
|
Distribution
|
the position, arrangement, or frequency of occurrence over an area or throughout a space or unit of time; the natural geographic range of an organism
|
|
Equal interval scale
|
A scale of measurement in which differences between values can be quantified in absolute terms but the zero point is fixed arbitrarily; e.g. Fahrenheit or Celsius temperature scales and calendar dates
|
|
Equivalent Forms
|
Any of two or more forms of a test that are closely parallel with respect to content and the number and difficulty of the items included. Also called parallel or alternate forms
|
|
Environmental Assessment
|
a naturalistic and systems approach to assessment in which practitioners collect information about clients from their home, work, or school environments
|
|
Error of Measurement
|
The amount by which the score actually received (an observed score) differs from a hypothetical true score.
|
|
Factor/ Factor analysis
|
statistically examining the relationship between subscales and the larger construct. Used to measure construct validity
|
|
Floor
|
a minimum limit
|
|
Formative assessment
|
a self-reflective process that intends to promote student attainment; the bidirectional process between teacher and student to enhance, recognize and respond to the learning; provides feedback during process for the purpose of improving instruction
|
|
Frequency
|
number of times a given score (or a set of scores in an interval grouping) occurs in a distribution
|
|
Frequency distribution
|
a tabulation of the values (scores) that one or more variables take in a sample; a method of understanding scores by ordering them from high to low and listing the corresponding frequency of each score across from it
|
|
Functional Behavior Assessment
|
a problem-solving process for addressing student problem behavior; looks beyond the behavior itself
|
|
Flynn Effect
|
the rise of average IQ test scores over the generations, an effect seen in most parts of the world, although at greatly varying rates
|
|
GAF (Global Assessment of Functioning)
|
a numeric scale (0 through 100) used by mental health clinicians and doctors to rate the social, occupational and psychological functioning of adults
|
|
Grade Level Equivalent (GLE)
|
The school grade level for a given population for which a given score is the median score in that population
|
|
Grade Equivalent (G.E.)
|
a type of standard score calculated by comparing an individual’s score to the average score of others at the same grade level
|
|
Heritability
|
the extent to which genetic individual differences contribute to individual differences in observed behavior; the proportion of phenotypic variance attributable to genetic variance
|
|
Item Analysis
|
The process of examining students’ responses to test items to judge the quality of each item. The difficulty and discrimination indices are frequently used in this process
|
|
Independent variable
|
those that are manipulated
|
|
Informed consent
|
a legal condition whereby a person can be said to have given consent based upon a clear appreciation and understanding of the facts, implications and future consequences of an action
|
|
Inter-rater reliability
|
the consistency with which two or more judges rate the work or performance of test takers
|
|
Interrater
|
two or more judges rate the work or performance of test takers
|
|
Inventory
|
list of traits, preferences, attitudes, interests, or abilities used to evaluate personal characteristics or skills
|
|
Internal Consistency
|
a method of determining reliability of an instrument by looking within the test itself, or not going “outside of the test” to determine a reliability estimate as is done with test-retest or parallel forms of reliability. (e.g. split-half)
|
|
Kuder-Richardson reliability
|
a statistical analysis of each test item against all of the other test items
|
|
Kurtosis
|
degree of peakedness of a distribution
|
|
Likert Scale
|
a graphic-type rating scale that has a statement followed by words that reflect a continuum that range from favorable to unfavorable regarding the quality being measured. A number line may or may not be used with the words
|
|
Longitudinal
|
repeated observation or examination of a set of subjects over time with respect to one or more study variables
|
|
Metacognition
|
awareness or analysis of one's own learning or thinking processes
|
|
Multiaxial classification system
|
A procedure used in DSM-IV-TR for diagnosing patients on five axes
|
|
Mastery Level
|
The cutoff score on a criterion-referenced or mastery test. People who score at or above the cutoff score are considered to have mastered the material
|
|
Mean
|
average of a set of scores
|
|
Median
|
The middle score in a distribution or set of ranked scores; the point (score) that divides a group into two equal parts; the 50th percentile. Half the scores are below the median, and half are above it
|
|
Mode
|
The score or value that occurs most frequently in a distribution
|
|
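The three measures of central tendency defined above (mean, median, mode) can be computed with Python's standard `statistics` module. This is only an illustrative sketch; the helper name and the score list are hypothetical.

```python
from statistics import mean, median, mode


def central_tendency(scores):
    """Return the mean, median, and mode of a list of scores."""
    return {
        "mean": mean(scores),      # arithmetic average
        "median": median(scores),  # middle score of the ranked set
        "mode": mode(scores),      # most frequently occurring score
    }


# Hypothetical set of five test scores.
summary = central_tendency([70, 80, 80, 90, 100])
```

For these scores the mean is 84, while the median and the mode are both 80.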
N
|
The symbol commonly used to represent the number of cases in a group
|
|
Normal Curve Equivalents (NCEs)
|
Normalized standard scores on a scale of 99 equal units (1 to 99), with a mean of 50 and a standard deviation of 21.06
|
|
Normal Distribution
|
A distribution of scores or other measures that in graphic form has a distinctive bell-shaped appearance. Measures are distributed symmetrically about the mean
|
|
Norms
|
The distribution of test scores of some specified group called the norm group
|
|
Norm-Referenced Test
|
Any test in which the score acquires additional meaning by comparing it to the scores of people in an identified norm group. A test can be both norm- and criterion-referenced
|
|
Outlier
|
a statistical observation that is markedly different in value from the others of the sample
|
|
Population
|
a body of persons or individuals having a quality or characteristic in common
|
|
Probability
|
the ratio of the number of outcomes in an exhaustive set of equally likely outcomes that produce a given event to the total number of possible outcomes; the chance that a given event will occur
|
|
p-Value
|
The proportion of people in an identified norm group who answer a test item correctly; a.k.a. Difficulty Index
|
|
Percentile
|
A point on the norms distribution below which a certain percentage of the scores fall
|
|
Percentile Rank
|
The percentage of scores falling below a certain point on a score distribution. (Percentile and percentile rank are sometimes used interchangeably)
|
|
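A short sketch of the percentile rank definition above: the percentage of scores in a distribution that fall below a given score. The function name and the sample norm group are assumptions for illustration; some textbooks also count half of the tied scores, which this sketch does not do.

```python
def percentile_rank(score, distribution):
    """Percentage of scores in `distribution` that fall below `score`."""
    below = sum(1 for s in distribution if s < score)
    return 100.0 * below / len(distribution)


# Hypothetical norm group of five scores.
norm_group = [50, 60, 70, 80, 90]
rank = percentile_rank(80, norm_group)
```

Three of the five scores fall below 80, so the percentile rank is 60.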
Pre-test
|
A preliminary test administered to determine a student's baseline knowledge or preparedness for an educational experience or course of study
|
|
Post-test
|
A test given after a lesson or a period of instruction to determine what was learned
|
|
Qualitative assessment
|
based on system knowledge, experience, and judgment; it is usually a verbal report
|
|
Quantitative assessment
|
employs mathematical models, theories and/or hypotheses pertaining to natural phenomena
|
|
Quartile
|
One of three points that divide the scores in a distribution into four groups of equal size. The first quartile, or 25th percentile, separates the lowest fourth of the group; the middle quartile, the 50th percentile or median, divides the second fourth of the cases from the third; and the third quartile, the 75th percentile, separates the top quarter
|
|
Range
|
a sequence, series, or scale between limits; the limits of a series: the distance or extent between possible extremes; the difference between the least and greatest values
|
|
Raw Score
|
an untreated score; an observed score on a test (the number correct); an individual's actual score before being adjusted for relative position in the test group (raw scores tell us little or nothing)
|
|
Regression
|
tendency of a posttest score (or a predicted score) to be closer to the mean of its distribution than the pretest score (or the predictor score) is to the mean of its own distribution; also, a functional relationship between two or more correlated variables that is often empirically determined from data and is used especially to predict values of one variable when given values of the others
|
|
Regression Line
|
y=mx+b a line that is drawn through a scatterplot of two variables. It is chosen because it comes as close to the points as possible
|
|
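The card describes the regression line y = mx + b as the line that comes as close to the points as possible. A minimal sketch follows, assuming the usual least-squares criterion (minimizing squared vertical distances), which the card does not name explicitly. The function name and data points are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x  # the fitted line passes through (mean_x, mean_y)
    return m, b


# Hypothetical points that lie exactly on y = 2x + 1.
m, b = fit_line([1, 2, 3], [3, 5, 7])
predicted = m * 4 + b  # predicted y for x = 4
```

Because the hypothetical points fall exactly on a line, the fit recovers a slope of 2 and an intercept of 1.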
Reliability
|
the extent to which test scores are consistent, dependable, and repeatable; the degree to which the test scores are dependable or relatively free from random errors of measurement
|
|
Random
|
often used in statistics to signify well-defined statistical properties, such as a lack of bias or correlation
|
|
Reliability Coefficients
|
Estimated by correlation between scores on two equivalent forms of a test, by the correlation between scores on two administrations of the same test, or through procedures known as internal-consistency estimates
|
|
Split Half
|
(or odd-even): this method of internal consistency reliability splits the test in half and correlates the scores of one half of the test with the other half. Hence, it requires only one form and one administration of the test
|
|
Split-half reliability
|
the correlation between odd numbered items on a test with even numbered items
|
|
Sample
|
a finite part of a statistical population whose properties are studied to gain information about the whole; a representative part or a single item from a larger whole or group
|
|
Scaled scores
|
a mathematical transformation of a raw score; scaled scores are useful when comparing test results over time
|
|
Scales of measurement
|
ways of defining the attribute of numbers and how they can be manipulated. (nominal, ordinal, interval, ratio)
|
|
Skew
|
the degree of asymmetry of a distribution; scores pile up toward one end rather than falling symmetrically along the normal curve
|
|
Standard score
|
derived by converting a raw score to a new score that has a new mean and new SD. Make test results easier to read
|
|
Standardized sample
|
a large sample of test takers who represent the population for which the test is intended. A.k.a. norm/ing group
|
|
Summative assessment
|
the assessment of the learning and summarizes the development of learners at a particular time. The test aims to summarize learning up to that point
|
|
School Ability Index (SAI)
|
(from the Otis-Lennon School Ability Test) normalized standard score with a mean of 100 and SD of 16
|
|
Standard Deviation (S.D.)
|
A measure of the variability, or dispersion, of a distribution of scores. The more the scores cluster around the mean, the smaller the SD
|
|
Standard Error of Measurement (SEM)
|
An estimate of where an individual’s “true” score actually lies due to the measurement error of the test; provides an estimate or range where a person’s score would fall if taking a test over and over again
|
|
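A sketch of how the standard error of measurement is commonly estimated. The formula SEM = SD × √(1 − reliability) is the standard textbook one but is not stated on the card, so treat it as an assumption here; the function names and the example values are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM from the test's standard deviation and reliability coefficient."""
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed, sd, reliability, z=1.0):
    """Range of observed score plus or minus z SEMs.

    z=1.0 gives roughly a 68% band and z=1.96 roughly a 95% band,
    assuming normally distributed measurement error.
    """
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem


# Hypothetical test: SD of 15, reliability of .91, observed score of 100.
sem = standard_error_of_measurement(15, 0.91)
low, high = score_band(100, 15, 0.91)
```

With these values the SEM is about 4.5, so the one-SEM band runs from about 95.5 to 104.5.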
Standardized Test
|
a test in which the same or similar tasks or questions are given under the same conditions to all test takers (with the exception of those with disabilities) and scored the same way
|
|
Stanines
|
9-point normalized standard score scale with a mean of 5 and a standard deviation of 2. Only the integers 1 to 9 occur
|
|
Test-retest reliability
|
a statistical method used to examine how reliable a test is: A test is performed twice
|
|
Test-Retest
|
giving the test twice to the same group of people, and then correlating the scores of the first test with those of the second test to determine the reliability of the instrument
|
|
T-Score
|
A standard score with a mean of 50 and a standard deviation of 10
|
|
Triarchic Theory
|
Robert Sternberg - research of human intelligence. Sternberg’s definition of human intelligence is “(a) mental activity directed toward purposive adaptation to, selection and shaping of, real-world environments relevant to one’s life” which means that intelligence is how well an individual deals with environmental changes throughout their lifespan. Sternberg’s theory comprises three parts: componential, experiential, and practical
|
|
True Score
|
A score entirely free of error; a hypothetical value that can never be obtained by testing, since a test score always involves some measurement error. A person’s "true" score may be thought of as the average of an infinite number of measurements from the same or exactly equivalent tests, assuming no practice effect or change in the examinee during the testing
|
|
Validity
|
extent to which a test does the job for which it is intended. What the instrument measures and how well it does it
|
|
Face validity
|
a test looks like it is measuring what it is supposed to measure. Not a true form of validity
|
|
Content validity
|
extent to which the content of the test represents a balanced and adequate sampling of the outcomes about which inferences are to be made; the degree to which evidence shows that items and questions represent the proper domain
|
|
Construct validity
|
The extent to which a test measures some relatively (hypothetical) abstract trait or construct
|
|
Criterion-related validity
|
The extent to which scores on the test are in agreement with (concurrent validity) or predict (predictive validity) some criterion measure
|
|
Predictive validity
|
the accuracy with which a test is indicative of performance on a future criterion measure
|
|
Concurrent validity
|
demonstrated where a test correlates well with a measure that has previously been validated; the extent to which an instrument correlates with an outcome criterion in the present
|
|
Variability
|
The spread or dispersion of test scores, most often expressed as a standard deviation
|
|
Variance
|
The square of the standard deviation
|
|
Weighting
|
The process of assigning different weights to different scores in making some final decision
|
|
Z-Score
|
A type of standard score with a mean of zero and a standard deviation of one
|
|
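Several standard scores defined in this set differ only in their mean and standard deviation, so one conversion covers them all. The sketch below turns a raw score into a z-score and then re-expresses it as a T-score (mean 50, SD 10), a deviation IQ (mean 100, SD 15), and an NCE (mean 50, SD 21.06). The function names and the raw-score example are hypothetical, and the linear conversion matches normalized scores only when the raw scores are roughly normally distributed.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the group mean."""
    return (raw - mean) / sd


def rescale(z, new_mean, new_sd):
    """Express a z-score on a scale with the given mean and SD."""
    return new_mean + new_sd * z


# Hypothetical raw score of 60 in a norm group with mean 50 and SD 5.
z = z_score(raw=60, mean=50, sd=5)
t_score = rescale(z, 50, 10)        # T-score scale
deviation_iq = rescale(z, 100, 15)  # deviation IQ scale
nce = rescale(z, 50, 21.06)         # normal curve equivalent scale
```

A raw score two standard deviations above the mean gives z = 2, a T-score of 70, and a deviation IQ of 130.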
Charles Spearman
|
Believed in a two factor approach to intelligence that included a general factor (g) and a specific factor (s), both of which he considered important in understanding intelligence
|
|
L. L. Thurstone
|
Developed a multifactor approach or model of intelligence that included seven primary factors: verbal meaning, number ability, word fluency, perceptual speed, spatial ability, reasoning, and memory
|
|
Howard Gardner
|
vehemently opposed current constructs of intelligence measurements and developed his own theory of multiple intelligences asserting that there are 8 or 9 intelligences: verbal-linguistic, mathematical-logical, musical, visual-spatial, bodily-kinesthetic, interpersonal, intrapersonal, naturalist, and existential intelligence
|
|
Robert J. Sternberg
|
Research on human intelligence. His Triarchic Theory was among the first to go against the psychometric approach to intelligence and take a more cognitive approach. Sternberg’s definition of human intelligence is “(a) mental activity directed toward purposive adaptation to, selection and shaping of, real-world environments relevant to one’s life” which means that intelligence is how well an individual deals with environmental changes throughout their lifespan. Sternberg’s theory comprises three parts: componential, experiential, and practical
|
|
Alfred Binet
|
Commissioned by the Ministry of Public Education in Paris in 1904 to develop an intelligence test to assist in the integration of the “subnormal” children into the schools. His work led to the development of the 1st modern-day intelligence test
|
|
James Cattell
|
One of the earliest psychologists to use statistical concepts to understand people. His main emphasis became testing mental function, and he is known for coining the term “mental test”
|
|
David Wechsler
|
a leading American psychologist. He developed well-known intelligence scales, such as the Wechsler Adult Intelligence Scale (WAIS) and the Wechsler Intelligence Scale for Children (WISC)
|
|
Henry Goddard
|
prominent American psychologist and eugenicist in the early 20th century. He was the first to translate the Binet intelligence test into English, in 1908, and distributed an estimated 22,000 copies of the translated test across the United States; he also introduced the term "moron" into the field. He was the leading advocate for the use of intelligence testing in societal institutions including hospitals, schools, the legal system and the military
|
|
Hermann Rorschach
|
a student of Carl Jung, he created the Rorschach Inkblot test by splattering ink onto sheets of paper and folding them in half. He believed the interpretation of an individual’s reactions to these forms could tell volumes about the individual’s unconscious mind
|
|
Henry Murray
|
Developed the Thematic Apperception Test (TAT), which asks a subject to view a number of standard pictures and create a story to explain the situation as he or she understands it. This test is based on his needs-press theory
|
|
Barnum Effect
|
(also called the Forer effect or personal validation fallacy; named after P. T. Barnum's observation that "we've got something for everyone") the observation that individuals will give high accuracy ratings to descriptions of their personality that supposedly are tailored specifically for them, but are in fact vague and general enough to apply to a wide range of people. The effect can provide a partial explanation for the widespread acceptance of some pseudosciences such as astrology and fortune telling, as well as many types of personality tests
|
|
Marshmallow Test: (Deferred gratification or delayed gratification)
|
the ability to wait in order to obtain something that one wants. This ability is usually considered to be a personality trait which is important for life success. Daniel Goleman has suggested that it is an important component of emotional intelligence. People who lack this trait are said to need instant gratification and may suffer from poor impulse control
|
|
Central Limit Theorem
|
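the principle that the distribution of sample means approaches a normal curve as sample size increases, whatever the shape of the population distribution; a sample size of about 30 or more is a common rule of thumb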
|
|
Item Response Theory
|
examining items individually for their ability to discriminate
|
|
RTI (Response to Intervention)
|
a school-based, research-supported approach that emphasizes early intervention and data-based decision making
|
|
Goal Attainment Scaling
|
a tool used when specific goals are jointly set by the client and the counselor with a time for expected achievement/outcome
|
|
Statistic
|
a quantity calculated from a sample
|
|
Idiographic
|
Unique characteristics to a person
|
|
Nomothetic
|
has to do with universal characteristics (things we would all have in common). Measured by standardized tests
|
|
Ipsative
|
comparing a person with him/herself and how they change over time
|
|
Bimodal
|
having two modes; a distribution in which two scores occur most frequently
|
|
Spearman-Brown Formula
|
The Spearman-Brown prediction formula (also known as the Spearman-Brown prophecy formula) relates psychometric reliability to test length and is used to predict the reliability of a test after changing the test length (used with split-half)
|
|
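A minimal sketch of the Spearman-Brown formula: predicted reliability = n·r / (1 + (n − 1)·r), where r is the current reliability and n is the factor by which the test length changes. With split-half reliability, n = 2 steps the half-test correlation up to the full test length. The function name and the example correlation are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after multiplying test length by length_factor."""
    r, n = reliability, length_factor
    return (n * r) / (1 + (n - 1) * r)


# Hypothetical split-half correlation of .60 between the two test halves.
full_test_reliability = spearman_brown(0.60, 2)
```

A half-test correlation of .60 corresponds to an estimated full-test reliability of .75.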
MAST
|
The Michigan Alcohol Screening Test
|
|
Parameter
|
a value usually unknown which represents a certain population
|
|
Discrepancy Model
|
determines need for support based on the differences between achievement and ability (school based)
|
|
Interquartile Range
|
Difference between the first quartile (25th percentile) and the third quartile (75th percentile) of an ordered range of data. It contains middle 50 percent of the distribution and is unaffected by extreme values
|
|
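A sketch of the interquartile range using `statistics.quantiles` from the Python standard library, which returns the three quartile cut points when `n=4`. Textbooks and software use slightly different conventions for locating quartiles, so results on small data sets can differ; the function name and the data are hypothetical.

```python
from statistics import quantiles


def interquartile_range(scores):
    """Third quartile minus first quartile (Python's default 'exclusive' method)."""
    q1, _median, q3 = quantiles(scores, n=4)
    return q3 - q1


# Hypothetical scores, then the same scores with one extreme value.
iqr_plain = interquartile_range([1, 2, 3, 4, 5, 6, 7])
iqr_outlier = interquartile_range([1, 2, 3, 4, 5, 6, 700])
```

Both calls give the same result, which illustrates why the interquartile range is described as unaffected by extreme values.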
Coefficient of Determination
|
the square of the correlation coefficient (Pearson's r), written r²; it indicates the proportion of variance the two variables share, i.e., the common factors that account for their relationship
|
|
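A sketch that computes Pearson's r from deviation scores and squares it to obtain the coefficient of determination. The function names and the paired scores are hypothetical; the code assumes both lists have the same length and that neither is constant.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two sets of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviation scores: each score minus the mean of its group.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    return sxy / math.sqrt(sxx * syy)


def coefficient_of_determination(xs, ys):
    """Proportion of shared variance: r squared."""
    return pearson_r(xs, ys) ** 2


# Hypothetical paired scores for four test takers.
r = pearson_r([1, 2, 3, 4], [1, 3, 2, 4])
r_squared = coefficient_of_determination([1, 2, 3, 4], [1, 3, 2, 4])
```

Here r is .80, so r² is .64: about 64 percent of the variance in one set of scores is shared with the other.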
Sten Score
|
normalized standard scores similar to stanines but stens range from 1 to 10, have a mean of 5.5 and have a standard deviation of 2. Sten scores of 5-6 are considered average, 1 to 3 falls in the low range and stens of 8 to 10 fall in the high range
|
|
Scatterplot
|
Graph showing two or more sets of test scores
|
|
Cattell and Fluid Intelligence
|
fluid intelligence is the culture-free portion of intelligence that is inborn and unaffected by new learning
|
|
Crystallized Intelligence
|
crystallized intelligence is acquired as we learn and is affected by our experiences, schooling, culture, and motivation
|
|
Vernon’s Hierarchical Model
|
one of the greatest and most widely adopted models of intelligence. Vernon believed that subcomponents of intelligence could be added in a hierarchical manner to obtain a cumulative (g) factor score. The model comprises four levels, with factors from each lower level contributing to the next level on the hierarchy (chart on page 143)
|
|
Guilford’s Multifactor/Multidimensional Model
|
model of intelligence (represented as a cube, with 180 factors) involving three kinds of cognitive ability: operations, or the general intellectual processes we use in understanding; contents, or what we use to perform our thinking process; and products, or how we apply our operations to our contents. Different mental abilities require different combinations of operations, contents, and products (chart on page 143)
|