Accommodation

an adaptation or adjustment to a test or its administration, made to meet the needs of a test taker


Achievement test

type of ability test that measures what one has learned (e.g. survey battery tests, diagnostic tests, readiness tests)


Age equivalent

a measure of a person's ability, skill, or knowledge, expressed in terms of the age at which the average person attains that level of performance


Aptitude

A combination of characteristics, whether native or acquired, that are indicative of an individual’s ability to learn or to develop proficiency in some particular area if appropriate education or training is provided


Aptitude test

a test that measures what one is capable of learning (e.g. intelligence tests, cognitive ability tests)


Age Norms

The distribution of test scores by age of test takers


Anecdotal Data

generally includes behaviors of an individual that are consistent or inconsistent; may assist in the assessment process


Bias

systematic error introduced into sampling or testing by selecting or encouraging one outcome or answer over others


Ceiling

an upper, usually prescribed, limit; in testing, the highest score or level of performance that a test can measure


Classical test theory

the assumption that every observed test score contains measurement error (observed score = true score + error), so that the true score is estimated to fall within the observed score plus or minus the measurement error


Composite score

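a single score formed by combining two or more scores, for example by summing or averaging subtest scores (e.g. the ACT composite is the average of the four test scores, rounded to the nearest whole number)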


Confidence interval

a group of continuous or discrete adjacent values that is used to estimate a statistical parameter (as a mean or variance) and that tends to include the true value of the parameter a predetermined proportion of the time if the process of finding the group of values is repeated a number of times


Construct

a hypothetical, abstract trait or concept (e.g. intelligence, anxiety) that a test is designed to measure


Correlation

The degree of relationship between two sets of scores


Correlation coefficient

a number between -1 and +1 that expresses the strength and direction of the relationship between two sets of scores


Criterion-referenced assessment (competency-based assessment)

tests designed to provide information about the specific knowledge or skills possessed by a student; a method of scoring in which test scores are compared to a predetermined value or a set of criteria


Criterion-Referenced (Content-Referenced) Test

describes tests that are designed to provide information about the specific knowledge or skills possessed by a student. Such tests usually cover relatively small units of content and are closely related to instruction. Their scores have meaning in terms of what the student knows or can do, rather than in (or in addition to) their relation to the scores made by some norm group


Cronbach’s coefficient Alpha

method of measuring internal consistency by calculating test reliability using all possible split-half combinations (the coefficient corresponds to the average of all possible split-half estimates)
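
A minimal sketch of how the coefficient is usually computed in practice, using the variance form of the formula rather than enumerating every split: alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the sample data are illustrative assumptions, not part of the card.

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a score table.

    item_scores: list of rows, one row per test taker, one column per item.
    Assumes at least two items and total scores that are not all identical.
    """
    k = len(item_scores[0])  # number of items
    # Variance of each item (column) across test takers.
    item_variances = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of each test taker's total score.
    total_variance = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical data: three test takers, two items.
alpha = cronbach_alpha([[2, 3], [4, 5], [6, 4]])
```

Values closer to 1 indicate that the items vary together, which is what internal consistency means here.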


Culture-fair test

a test designed to minimize the influence of cultural background, language, and culture-specific knowledge on performance, so that scores are as free of cultural bias as possible


Cultural Bias

when people of a culture make assumptions about conventions, then mistake these assumptions for laws of logic or nature. (In intelligence testing: test items that rest on the assumptions of the mainstream culture can disadvantage test takers from non-mainstream cultures)


Cut score

used to determine the minimum performance level needed to pass a competency test; a specified point on a score scale, such that scores at or above that point are interpreted or acted upon differently from scores below that point


Demographics

the statistical data of a population, esp. those showing average age, income, education, etc.


Deviation IQ (DIQ)

An age-based index of general mental ability. It is based on the difference between a person’s score and the average score for persons of the same chronological age. Standard score with a mean of 100 and an SD of 15


Deviation Score

The score for an individual minus the mean score for the group; i.e., the amount a person deviates from the mean


Decile

any one of nine numbers that divide a frequency distribution into 10 classes such that each contains the same number of individuals


Dependent variable

a mathematical variable whose value is determined by that of one or more other variables in a function; what is measured


Derived score

score obtained by comparing an individual’s score to the norm group by converting his score to a percentile or standard score. E.g. IQ scores, stanine scores, sten scores, T scores, and z scores (applying a mathematical computation to raw score)


Disaggregate

to separate into component parts


Diagnostic test

test that assesses problem areas of learning. Often used to assess learning disabilities. (a type of achievement test)


Discrimination parameter

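in item response theory, the slope of the item characteristic curve, i.e. how sharply an item distinguishes between test takers of lower and higher ability; in diffusion-based response models, the distance between the two absorbing boundaries and therefore the amount of information that has to be collected before a response to an item can be given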


Distribution

the position, arrangement, or frequency of occurrence of values; in testing, the way scores are spread across the score scale


Equal interval scale

A scale of measurement in which differences between values can be quantified in absolute terms but the zero point is fixed arbitrarily; e.g. Fahrenheit or Celsius temperature scales and calendar dates


Equivalent Forms

Any of two or more forms of a test that are closely parallel with respect to content and the number and difficulty of the items included. Also called parallel or alternate forms


Environmental Assessment

a naturalistic and systems approach to assessment in which practitioners collect information about clients from their home, work, or school environments


Error of Measurement

The amount by which the score actually received (an observed score) differs from a hypothetical true score.


Factor / Factor analysis

statistically examining the relationship between subscales and the larger construct. Used to provide evidence of construct validity


Floor

a minimum limit; in testing, the lowest score or level of performance that a test can measure


Formative assessment

a self-reflective process that intends to promote student attainment; the bidirectional process between teacher and student to enhance, recognize and respond to the learning; provides feedback during the process for the purpose of improving instruction


Frequency

number of times a given score (or a set of scores in an interval grouping) occurs in a distribution


Frequency distribution

a tabulation of the values (scores) that one or more variables take in a sample; a method of understanding scores by ordering them from high to low and listing the corresponding frequency of each score across from it
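
The tabulation described above can be sketched in a few lines. The function name and scores are hypothetical; each distinct score is listed from high to low with its frequency beside it.

```python
from collections import Counter

def frequency_distribution(scores):
    """Return (score, frequency) pairs ordered from the highest score to the lowest."""
    counts = Counter(scores)  # frequency of each distinct score
    return sorted(counts.items(), reverse=True)

# Hypothetical set of test scores.
table = frequency_distribution([90, 85, 90, 70, 85, 90])
for score, frequency in table:
    print(score, frequency)
```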


Functional Behavior Assessment

a problem-solving process for addressing student problem behavior; looks beyond the behavior itself


Flynn Effect

the rise of average IQ test scores over the generations, an effect seen in most parts of the world, although at greatly varying rates


GAF (Global Assessment of Functioning)

a numeric scale (0 through 100) used by mental health clinicians and doctors to rate the social, occupational and psychological functioning of adults


Grade Level Equivalent (GLE)

The school grade level for a given population for which a given score is the median score in that population


Grade Equivalent (G.E.)

a type of standard score calculated by comparing an individual’s score to the average score of others at the same grade level


Heritability

the extent to which genetic individual differences contribute to individual differences in observed behavior; the proportion of phenotypic variance attributable to genetic variance


Item Analysis

The process of examining students’ responses to test items to judge the quality of each item. The difficulty and discrimination indices are frequently used in this process
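
A rough sketch of the two indices for right/wrong items. The difficulty index is the proportion answering correctly. For discrimination, this example assumes the simple upper-lower method with the group split into top and bottom halves by total score; other splits (such as top and bottom 27%) are also used. All names and data are illustrative.

```python
def item_difficulty(responses, item):
    """Proportion of test takers who answered `item` correctly (1 = correct, 0 = wrong)."""
    return sum(row[item] for row in responses) / len(responses)

def item_discrimination(responses, item):
    """Upper-lower index: difficulty in the top half minus difficulty in the bottom half.

    Test takers are ranked by total score; with an odd group size the middle
    person is left out of both halves.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[-half:]
    return item_difficulty(upper, item) - item_difficulty(lower, item)

# Hypothetical responses: four test takers (rows) by three items (columns).
responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
difficulty = item_difficulty(responses, 1)
discrimination = item_discrimination(responses, 1)
```

A positive discrimination value means that higher scorers were more likely to get the item right.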


Independent variable

the variable that is manipulated by the researcher in order to observe its effect on the dependent variable


Informed consent

a legal condition whereby a person can be said to have given consent based upon a clear appreciation and understanding of the facts, implications and future consequences of an action


Inter-rater reliability

the consistency with which two or more judges rate the work or performance of test takers


Inter-rater

involving two or more judges who rate the work or performance of test takers


Inventory

list of traits, preferences, attitudes, interests, or abilities used to evaluate personal characteristics or skills


Internal Consistency

a method of determining the reliability of an instrument by looking within the test itself, rather than going “outside of the test” as is done with test-retest or parallel-forms reliability (e.g. split-half)


Kuder-Richardson reliability

an internal-consistency estimate, used with right/wrong (dichotomously scored) items, based on a statistical analysis of each test item against all of the other test items


Kurtosis

degree of peakedness of a distribution


Likert Scale

a graphic-type rating scale that has a statement followed by words that reflect a continuum that ranges from favorable to unfavorable regarding the quality being measured. A number line may or may not be used with the words


Longitudinal

repeated observation or examination of a set of subjects over time with respect to one or more study variables


Metacognition

awareness or analysis of one's own learning or thinking processes


Multiaxial classification system

A procedure used in DSM-IV-TR for diagnosing patients on five axes


Mastery Level

The cutoff score on a criterion-referenced or mastery test. People who score at or above the cutoff score are considered to have mastered the material


Mean

arithmetic average of a set of scores: the sum of the scores divided by the number of scores


Median

The middle score in a distribution or set of ranked scores; the point (score) that divides a group into two equal parts; the 50th percentile. Half the scores are below the median, and half are above it


Mode

The score or value that occurs most frequently in a distribution
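
The three measures of central tendency defined above (mean, median, mode) can be checked with Python's standard `statistics` module. The scores are hypothetical, and `multimode` is used so that a distribution with two modes returns both.

```python
from statistics import mean, median, multimode

def central_tendency(scores):
    """Return the mean, the median, and the list of modes for a set of scores."""
    return {
        "mean": mean(scores),        # sum of scores / number of scores
        "median": median(scores),    # middle score of the ranked set
        "modes": multimode(scores),  # most frequent score(s)
    }

# Hypothetical scores.
summary = central_tendency([70, 85, 85, 90, 100])
two_modes = central_tendency([1, 1, 2, 2, 3])["modes"]  # two modes
```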


N

The symbol commonly used to represent the number of cases in a group


Normal Curve Equivalents (NCEs)

Normalized standard scores on a scale of 99 equal units (1 to 99), with a mean of 50 and a standard deviation of 21.06


Normal Distribution

A distribution of scores or other measures that in graphic form has a distinctive bell-shaped appearance. Measures are distributed symmetrically about the mean


Norms

The distribution of test scores of some specified group called the norm group


Norm-Referenced Test

Any test in which the score acquires additional meaning by comparing it to the scores of people in an identified norm group. A test can be both norm- and criterion-referenced


Outlier

a statistical observation that is markedly different in value from the others of the sample


Population

a body of persons or individuals having a quality or characteristic in common


Probability

the ratio of the number of outcomes in an exhaustive set of equally likely outcomes that produce a given event to the total number of possible outcomes; the chance that a given event will occur


p-Value

The proportion of people in an identified norm group who answer a test item correctly; a.k.a. Difficulty Index


Percentile

A point on the norms distribution below which a certain percentage of the scores fall


Percentile Rank

The percentage of scores falling below a certain point on a score distribution. (Percentile and percentile rank are sometimes used interchangeably)
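
A small sketch of the "percentage of scores falling below" definition. Some texts also count half of the scores tied with the given score; this version assumes the strict "below" rule stated on the card. The names and data are illustrative.

```python
def percentile_rank(scores, score):
    """Percentage of scores in the distribution that fall below `score`."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

# Hypothetical norm-group scores.
norm_group = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
rank = percentile_rank(norm_group, 85)  # six of the ten scores are below 85
```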


Pretest

A preliminary test administered to determine a student's baseline knowledge or preparedness for an educational experience or course of study


Posttest

A test given after a lesson or a period of instruction to determine what was learned


Qualitative assessment

based on system knowledge, experience, and judgment; it is usually a verbal report


Quantitative assessment

employs mathematical models, theories and/or hypotheses pertaining to natural phenomena; results are expressed in numbers


Quartile

One of three points that divide the scores in a distribution into four groups of equal size. The first quartile, or 25th percentile, separates the lowest fourth of the group; the middle quartile, the 50th percentile or median, divides the second fourth of the cases from the third; and the third quartile, the 75th percentile, separates the top quarter


Range

a sequence, series, or scale between limits; the limits of a series: the distance or extent between possible extremes; the difference between the least and greatest values


Raw Score

an untreated score; an observed score on a test (the number correct); an individual's actual score before being adjusted for relative position in the test group (raw scores tell us little or nothing)


Regression

tendency of a posttest score (or a predicted score) to be closer to the mean of its distribution than the pretest score (or predictor score) is to the mean of its distribution; a functional relationship between two or more correlated variables that is often empirically determined from data and is used especially to predict values of one variable when given values of the others


Regression Line

a line, y = mx + b, that is drawn through a scatterplot of two variables. It is chosen because it comes as close to the points as possible (typically by minimizing the squared vertical distances from the points to the line)
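
One common way to pick the line that "comes as close to the points as possible" is ordinary least squares. The sketch below assumes that criterion; the function name and data are hypothetical.

```python
def regression_line(xs, ys):
    """Least-squares slope m and intercept b for the line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x  # the line passes through (mean_x, mean_y)
    return m, b

# Hypothetical paired scores that lie exactly on y = 2x + 1.
m, b = regression_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = m * 5 + b  # predicted y for x = 5
```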


Reliability

the extent to which test scores are consistent, dependable, and repeatable; the degree to which the test scores are dependable or relatively free from random errors of measurement


Random

often used in statistics to signify well-defined statistical properties, such as a lack of bias or correlation


Reliability Coefficients

Estimated by the correlation between scores on two equivalent forms of a test, by the correlation between scores on two administrations of the same test, or through procedures known as internal-consistency estimates


Split Half

(or odd-even): this method of internal consistency reliability splits the test in half and correlates the scores of one half of the test with the other half. Hence, it requires only one form and one administration of the test


Split-half reliability

the correlation between odd-numbered items on a test with even-numbered items


Sample

a finite part of a statistical population whose properties are studied to gain information about the whole; a representative part or a single item from a larger whole or group


Scaled scores

a mathematical transformation of a raw score; scaled scores are useful when comparing test results over time


Scales of measurement

ways of defining the attribute of numbers and how they can be manipulated. (nominal, ordinal, interval, ratio)


Skew

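the degree of asymmetry of a distribution; skewed scores do not fall along the normal curve but pile up toward one end, with a tail toward the other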


Standard score

derived by converting a raw score to a new score that has a new mean and new SD. Standard scores make test results easier to interpret


Standardized sample

a large sample of test takers who represent the population for which the test is intended. A.k.a. norm group or norming group


Summative assessment

assessment of learning that summarizes the development of learners at a particular time; the test aims to summarize learning up to that point


School Ability Index (SAI)

(from the Otis-Lennon School Ability Test) normalized standard score with a mean of 100 and SD of 16


Standard Deviation (S.D.)

A measure of the variability, or dispersion, of a distribution of scores. The more the scores cluster around the mean, the smaller the SD


Standard Error of Measurement (SEM)

An estimate of where an individual’s “true” score actually lies due to the measurement error of the test; provides an estimate or range where a person’s score would fall if taking a test over and over again
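
A minimal sketch using the standard formula SEM = SD × √(1 − reliability) and the band observed score ± z × SEM. Reading a band of ±1 SEM as roughly a 68% range assumes normally distributed errors. The function names and the numbers are illustrative.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM from the test's standard deviation and reliability coefficient."""
    return sd * sqrt(1 - reliability)

def score_band(observed, sd, reliability, z=1.0):
    """Range observed +/- z * SEM in which the true score is estimated to lie."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem

# Hypothetical test: SD of 15, reliability of .91, observed score of 100.
sem = standard_error_of_measurement(15, 0.91)
low, high = score_band(100, 15, 0.91)
```

The more reliable the test, the smaller the SEM and the narrower the band.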


Standardized Test

a test in which the same or similar tasks or questions are given under the same conditions to all test takers (with the exception of those with disabilities) and scored the same way


Stanines

9-point normalized standard score scale with a mean of 5 and a standard deviation of 2. Only the integers 1 to 9 occur
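
Because stanines are normalized, they are usually assigned from percentile ranks: the nine bands hold 4, 7, 12, 17, 20, 17, 12, 7 and 4 percent of scores. The sketch below assumes those standard bands; how a score exactly on a boundary is handled varies by publisher and is an assumption here.

```python
from bisect import bisect_right

# Cumulative percentages that close stanines 1 through 8.
STANINE_CUTOFFS = [4, 11, 23, 40, 60, 77, 89, 96]

def stanine_from_percentile_rank(percentile_rank):
    """Map a percentile rank (0 to 100) to a stanine (1 to 9)."""
    return bisect_right(STANINE_CUTOFFS, percentile_rank) + 1

# A percentile rank of 50 falls in the middle band.
middle = stanine_from_percentile_rank(50)
```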


Test-retest reliability

a statistical method used to examine how reliable a test is: the test is given twice to the same group and the two sets of scores are compared


Test-Retest

giving the test twice to the same group of people, and then correlating the scores of the first test with those of the second test to determine the reliability of the instrument


T-Score

A standard score with a mean of 50 and a standard deviation of 10


Triarchic Theory

Robert Sternberg’s theory of human intelligence. Sternberg’s definition of human intelligence is “(a) mental activity directed toward purposive adaptation to, selection and shaping of, real-world environments relevant to one’s life” which means that intelligence is how well an individual deals with environmental changes throughout their lifespan. Sternberg’s theory comprises three parts: componential, experiential, and practical


True Score

A score entirely free of error; a hypothetical value that can never be obtained by testing, since a test score always involves some measurement error. A person’s "true" score may be thought of as the average of an infinite number of measurements from the same or exactly equivalent tests, assuming no practice effect or change in the examinee during the testing


Validity

extent to which a test does the job for which it is intended. What the instrument measures and how well it does it


Face validity

a test looks like it is measuring what it is supposed to measure. Not a true form of validity


Content validity

extent to which the content of the test represents a balanced and adequate sampling of the outcomes about which inferences are to be made; the degree to which evidence shows that items and questions represent the proper domain


Construct validity

The extent to which a test measures some relatively (hypothetical) abstract trait or construct


Criterion-related validity

The extent to which scores on the test are in agreement with (concurrent validity) or predict (predictive validity) some criterion measure


Predictive validity

the accuracy with which a test is indicative of performance on a future criterion measure


Concurrent validity

demonstrated where a test correlates well with a measure that has previously been validated; the extent to which an instrument correlates with an outcome criterion in the present


Variability

The spread or dispersion of test scores, most often expressed as a standard deviation


Variance

The square of the standard deviation
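
A short sketch tying together deviation scores, variance and standard deviation. It assumes the population formulas (dividing by N); sample formulas divide by N − 1 instead. The names and scores are hypothetical.

```python
from math import sqrt

def deviation_scores(scores):
    """Each score minus the group mean."""
    mean = sum(scores) / len(scores)
    return [s - mean for s in scores]

def variance(scores):
    """Average squared deviation from the mean (population form)."""
    deviations = deviation_scores(scores)
    return sum(d * d for d in deviations) / len(scores)

def standard_deviation(scores):
    """Square root of the variance."""
    return sqrt(variance(scores))

# Hypothetical scores with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
spread = standard_deviation(scores)
```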


Weighting

The process of assigning different weights to different scores in making some final decision


Z-Score

A type of standard score with a mean of zero and a standard deviation of one
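
A sketch of how a raw score becomes a z-score and how the z-score can then be re-expressed on other standard-score scales from this set (T-score: mean 50, SD 10; deviation IQ: mean 100, SD 15). This is the simple linear conversion; normalized scales such as stanines and NCEs also reshape the distribution, which is not shown. Function names are illustrative.

```python
def z_score(raw, group_mean, group_sd):
    """Number of standard deviations a raw score lies above or below the group mean."""
    return (raw - group_mean) / group_sd

def from_z(z, new_mean, new_sd):
    """Re-express a z-score on a scale with the given mean and SD."""
    return new_mean + new_sd * z

# Hypothetical raw score of 65 in a group with mean 50 and SD 10.
z = z_score(65, group_mean=50, group_sd=10)
t_score = from_z(z, 50, 10)        # T-score scale
deviation_iq = from_z(z, 100, 15)  # deviation IQ scale
```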


Charles Spearman

Believed in a two-factor approach to intelligence that included a general factor (g) and a specific factor (s), both of which he considered important in understanding intelligence


L. L. Thurstone

Developed a multifactor approach or model of intelligence that included seven primary factors: verbal meaning, number ability, word fluency, perceptual speed, spatial ability, reasoning, and memory


Howard Gardner

vehemently opposed current constructs of intelligence measurements and developed his own theory of multiple intelligences asserting that there are 8 or 9 intelligences: verbal-linguistic, mathematical-logical, musical, visual-spatial, bodily-kinesthetic, interpersonal, intrapersonal, naturalist, and existential intelligence


Robert J. Sternberg

Research on human intelligence. His Triarchic Theory was among the first to go against the psychometric approach to intelligence and take a more cognitive approach. Sternberg’s definition of human intelligence is “(a) mental activity directed toward purposive adaptation to, selection and shaping of, real-world environments relevant to one’s life” which means that intelligence is how well an individual deals with environmental changes throughout their lifespan. Sternberg’s theory comprises three parts: componential, experiential, and practical


Alfred Binet

Commissioned by the Ministry of Public Education in Paris in 1904 to develop an intelligence test to assist in the integration of “subnormal” children into the schools. His work led to the development of the first modern-day intelligence test


James Cattell

One of the earliest psychologists to use statistical concepts to understand people. His main emphasis became testing mental function, and he is known for coining the term “mental test”


David Wechsler

a leading American psychologist. He developed well-known intelligence scales, such as the Wechsler Adult Intelligence Scale (WAIS) and the Wechsler Intelligence Scale for Children (WISC)


Henry Goddard

prominent American psychologist and eugenicist in the early 20th century. First to translate the Binet intelligence test into English, in 1908, distributing an estimated 22,000 copies of the translated test across the United States; he also introduced the term "moron" into the field. He was the leading advocate for the use of intelligence testing in societal institutions including hospitals, schools, the legal system and the military


Hermann Rorschach

a student of Carl Jung, he created the Rorschach Inkblot test by splattering ink onto sheets of paper and folding them in half. He believed the interpretation of an individual’s reactions to these forms could tell volumes about the individual’s unconscious mind


Henry Murray

Developed the Thematic Apperception Test (TAT), which asks a subject to view a number of standard pictures and create a story to explain the situation as he or she understands it. This test is based on his needs-press theory


Barnum Effect

Also called the Forer effect or personal validation fallacy; named after P. T. Barnum's observation that "we've got something for everyone". The observation that individuals will give high accuracy ratings to descriptions of their personality that supposedly are tailored specifically for them, but are in fact vague and general enough to apply to a wide range of people. The Forer effect can, assuming their actual falsity, provide a partial explanation for the widespread acceptance of some pseudosciences such as astrology and fortune telling, as well as many types of personality tests


Marshmallow Test (deferred gratification or delayed gratification)

Deferred (delayed) gratification is the ability to wait in order to obtain something that one wants. This ability is usually considered to be a personality trait which is important for life success. Daniel Goleman has suggested that it is an important component of emotional intelligence. People who lack this trait are said to need instant gratification and may suffer from poor impulse control


Bimodal

a distribution with two modes, i.e. two scores that occur most frequently


Central Limit Theorem

mathematical theorem stating that the distribution of sample means approaches a normal curve as sample size increases; a sample size over 30 is the usual rule of thumb


Item Response Theory

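examining items individually for their difficulty and their ability to discriminate, by modeling the probability of a correct response as a function of the test taker's ability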


RTI (Response to Intervention)

school-based, research-supported approach that emphasizes early intervention and data-based decision making


Goal Attainment Scaling

a tool used when specific goals are jointly set by the client and the counselor with a time for expected achievement/outcome


Statistic

a quantity calculated from a sample


Idiographic

has to do with characteristics unique to a person


Nomothetic

has to do with universal characteristics (things we would all have in common). Measured by standardized tests


Ipsative

comparing a person with him/herself and how they change over time


Spearman-Brown Formula

The Spearman-Brown prediction formula (also known as the Spearman-Brown prophecy formula) is a formula relating psychometric reliability to test length; it is used to predict the reliability of a test after changing the test length (used with split-half)
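
The formula itself is short: predicted reliability = k·r / (1 + (k − 1)·r), where r is the current reliability and k is the factor by which test length changes. With split-half reliability, k = 2 steps the half-test correlation up to full length. The function and parameter names below are illustrative.

```python
def spearman_brown(reliability, length_factor=2.0):
    """Predicted reliability after multiplying test length by `length_factor`.

    length_factor=2 corrects a split-half correlation to full test length;
    values below 1 predict the effect of shortening a test.
    """
    k = length_factor
    return k * reliability / (1 + (k - 1) * reliability)

# A half-test correlation of .60 corresponds to a full-length estimate of .75.
full_length = spearman_brown(0.60)
# Halving a test with reliability .80 lowers the estimate to about .67.
shortened = spearman_brown(0.80, length_factor=0.5)
```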


MAST

The Michigan Alcohol Screening Test


Parameter

a value, usually unknown, that describes a characteristic of a population


Discrepancy Model

determines need for support based on the differences between achievement and ability (school-based)


Interquartile Range

Difference between the first quartile (25th percentile) and the third quartile (75th percentile) of an ordered range of data. It contains the middle 50 percent of the distribution and is unaffected by extreme values
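
A brief sketch using `statistics.quantiles` from the Python standard library. Several conventions exist for locating quartiles; the "inclusive" method is assumed here. The second data set replaces the top score with an extreme value to show that the result does not move. The data are hypothetical.

```python
from statistics import quantiles

def interquartile_range(scores):
    """Third quartile minus first quartile (inclusive quartile method)."""
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    return q3 - q1

# Hypothetical scores, then the same scores with one extreme value.
typical = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 900]
iqr_typical = interquartile_range(typical)
iqr_outlier = interquartile_range(with_outlier)  # same as iqr_typical
```

The range (highest minus lowest), by contrast, jumps from 8 to 899 for the same two data sets.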


Coefficient of Determination

the proportion of variance shared by two variables, i.e. the common factors that account for their relationship; obtained by squaring the correlation coefficient, Pearson’s r (written r²)
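
A minimal sketch that computes Pearson's r from paired scores and then squares it. The function names and data are hypothetical, and the code assumes neither variable is constant.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two sets of paired scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def coefficient_of_determination(xs, ys):
    """Square of Pearson's r: the proportion of shared variance."""
    return pearson_r(xs, ys) ** 2

# Hypothetical paired scores: r = .50, so 25 percent of the variance is shared.
r = pearson_r([1, 2, 3], [1, 3, 2])
r_squared = coefficient_of_determination([1, 2, 3], [1, 3, 2])
```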


Sten Score

normalized standard scores similar to stanines, but stens range from 1 to 10, have a mean of 5.5 and have a standard deviation of 2. Sten scores of 5 to 6 are considered average, stens of 1 to 3 fall in the low range and stens of 8 to 10 fall in the high range


Scatterplot

Graph that plots pairs of scores on two variables as points, showing the relationship between the two sets of test scores


Cattell and Fluid Intelligence

fluid intelligence is the culture-free portion of intelligence that is inborn and unaffected by new learning


Crystallized Intelligence

crystallized intelligence is acquired as we learn and is affected by our experiences, schooling, culture, and motivation


Vernon’s Hierarchical Model

one of the greatest and most widely adopted models of intelligence. Vernon believed that subcomponents of intelligence could be added in a hierarchical manner to obtain a cumulative (g) factor score. The model comprises four levels, with factors from each lower level contributing to the next level on the hierarchy (chart on page 143)


Guilford’s Multifactor/Multidimensional Model

model of intelligence (represented as a cube, with 180 factors) involving three kinds of cognitive ability: operations, or the general intellectual processes we use in understanding; content, or what we use to perform our thinking process; and products, or how we apply our operations to our content. Different mental abilities will require different combinations of processes, contents, and products (chart on page 143)
