66 Cards in this Set
- Front
- Back
What are the three characteristics of good tests?
|
Validity
Reliability
Accuracy |
|
Define validity
|
Does the test measure what it's supposed to measure?
|
|
Define reliability
|
Do the tests yield the same or similar scores?
|
|
Define accuracy
|
Does the test fairly closely approximate an individual's true level of ability, skill, or aptitude?
|
|
What can be useful for assessing content validity?
|
A test blueprint
|
|
What 2 questions does content validity answer?
|
Does this test measure what it is supposed to measure?
Does this test measure the full domain? |
|
Describe Concurrent Validity
|
A new test and an established test are administered to a group at the same time
The scores are correlated
The new test could be shorter or cheaper to administer |
|
What is an example of a concurrent validity test?
|
Aptitude tests
|
|
Describe Predictive Validity
|
These are tests for future behavior
You administer the test and then measure success after a period of time has elapsed |
|
What is an example of a test that measures predictive validity?
|
The SAT
|
|
Predictive validity tests are not only academic. They can also relate to what?
|
Career choices
|
|
Describe Construct Validity
|
Present if a test's evidence corresponds well with a theory of what you would expect
|
|
Give an example of construct validity
|
The results of testing for intelligence correlate with the definition of intelligence
|
|
Which of the four types of validity is a regular classroom teacher least likely to worry about?
|
Construct validity
|
|
What is a good keyword to recognize a test that is measuring construct validity?
|
EXPERIMENTAL!
|
|
What is Principle 2 of validity?
|
Group variability affects the size of the validity coefficient. Higher validity coefficients are derived from heterogeneous groups than from homogeneous groups.
|
|
Describe Test-Retest Reliability
|
One test (the same test) is given twice and the correlation between the two sets of scores is calculated
|
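The correlation on this card can be sketched in Python. This is a minimal illustration, assuming the Pearson correlation is the coefficient intended (the card only says "correlation"). The `pearson_r` helper and the five students' scores are hypothetical, made up for the example.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for the same five students on two
# administrations of the same test.
first_administration = [70, 82, 90, 65, 77]
second_administration = [72, 80, 93, 68, 75]

test_retest_r = pearson_r(first_administration, second_administration)
print(round(test_retest_r, 2))
```

A coefficient close to 1 means students kept roughly the same rank order across the two administrations.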
|
What is the problem with test-retest reliability?
|
There is usually some memory or experience involved the second time the test is taken
|
|
Describe Alternate Forms Reliability
|
Two equivalent forms of a test are given to the same students under the same conditions, and the scores on the two forms are compared to estimate reliability
|
|
What is the problem with alternate forms reliability?
|
It is hard to make 1 good test, let alone 2
|
|
Describe Internal Consistency Reliability
|
If one test is to measure a single basic concept, items should be correlated, and the test would be considered internally consistent
|
|
What are ways to determine internal consistency reliability?
|
Split-halves
Odd-Even |
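An odd-even split can be sketched as follows. This is only an illustration under stated assumptions: items are scored 1 (right) or 0 (wrong), the two half-scores are compared with a Pearson correlation, and the function names and the five students' answers are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def odd_even_reliability(item_scores):
    """Correlate each student's score on odd items with the score on even items.

    item_scores holds one list per student, with items in test order.
    """
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return pearson_r(odd_half, even_half)


# Hypothetical right/wrong answers: 5 students, 6 items.
answers = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

half_r = odd_even_reliability(answers)
print(round(half_r, 2))
```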
|
What does Kuder-Richardson deal with?
|
Internal consistency reliability
|
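The card names Kuder-Richardson without giving a formula. One widely used version is KR-20, sketched below for right/wrong items. Choosing KR-20 (rather than KR-21), using the population variance of total scores, and the sample data are all assumptions made for this example.

```python
from statistics import pvariance


def kr20(item_scores):
    """KR-20 for items scored 1 (right) or 0 (wrong).

    item_scores holds one list of item scores per student.
    """
    n_students = len(item_scores)
    k = len(item_scores[0])  # number of items on the test
    if k < 2:
        raise ValueError("KR-20 needs at least two items")
    totals = [sum(row) for row in item_scores]
    total_var = pvariance(totals)  # population variance of total scores
    if total_var == 0:
        raise ValueError("total scores have no variance")
    # For each item: p = proportion answering correctly, q = 1 - p.
    pq_sum = 0.0
    for i in range(k):
        p = sum(row[i] for row in item_scores) / n_students
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / total_var)


# Hypothetical right/wrong answers: 5 students, 6 items.
answers = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

kr20_value = kr20(answers)
print(round(kr20_value, 2))
```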
|
What is Principle 1 of reliability?
|
Group variability affects the size of the reliability coefficient. Heterogeneous groups yield higher reliability coefficients than homogeneous groups.
|
|
What is Principle 3 of reliability?
|
Higher number of items means higher reliability.
|
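The effect of test length on reliability is often estimated with the Spearman-Brown prophecy formula. The card does not name this formula, so treat the sketch below as one standard way to illustrate the principle. It assumes the added items are comparable in quality to the existing ones, and the starting reliability of 0.60 is a made-up value.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length.

    length_factor is new length divided by old length (2 means doubled).
    Assumes the added items are comparable to the original items.
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


original = 0.60  # hypothetical reliability of the current test
doubled = spearman_brown(original, 2)
tripled = spearman_brown(original, 3)
print(round(doubled, 2), round(tripled, 2))
```

Each increase in length raises the predicted reliability, but by a smaller amount each time.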
|
What are the four sources of errors on tests?
|
Test Takers
Test Itself
Test Administration
Test Scoring |
|
How do test takers contribute to test error?
|
They may be fatigued or ill
|
|
How does the test itself contribute to test error?
|
The test might not be content valid; there could be trick questions; the reading level could be too high
|
|
How does test administration contribute to test error?
|
There could be an error in the time allotted; it could be an uncomfortable physical environment; there could be an error in the instructions
|
|
How does the test scoring contribute to test error?
|
There could be wrong answer keys or problems with filling in bubble sheets
|
|
Is there a perfect test?
Is a test ever perfectly valid or reliable? |
NO
NO |
|
What is the obtained score?
|
The score you get back
|
|
Are the true score and error score real or theoretical?
|
They are theoretical
|
|
How do you get the obtained score in relation to the true score and error score?
|
Obtained score = true score ± error score
|
|
Define the standard error of measurement
|
The standard error of measurement is the standard deviation of the error scores of a test
|
|
How are reliability and standard error related?
|
The higher the reliability, the lower the standard error
The lower the reliability, the higher the standard error |
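The relationship between reliability and the standard error of measurement (SEM) is commonly written as SEM = SD × √(1 − reliability), where SD is the standard deviation of the test scores. The cards do not give this formula, so the sketch below is an illustration under that assumption. The function names, the SD of 10, and the two reliability values are hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM estimated from the score standard deviation and reliability."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)


def score_band(obtained, sem):
    """Range of the obtained score plus or minus one SEM."""
    return (obtained - sem, obtained + sem)


# Same score spread (SD = 10), two hypothetical reliabilities.
sem_high_reliability = standard_error_of_measurement(10, 0.91)  # about 3
sem_low_reliability = standard_error_of_measurement(10, 0.75)   # 5

print(score_band(80, sem_low_reliability))
```

The more reliable test gives the narrower band around an obtained score, which matches the card above.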
|
Describe grade equivalent test scores
|
Most widely used in reporting results
Simple to interpret
Based on actually obtained scores only for students from one grade level below to one grade level above the grade being tested
Any score more than 1 grade level above or below is estimated |
|
Describe Age Equivalent Scores
|
Very similar to grade equivalent scores
Scores determined by each age group tested |
|
What are the problems with grade and age equivalent scores?
|
Equal differences in scores do not necessarily reflect equal differences in achievement
Meaningless unless a subject is taught across all grades
Often misinterpreted as standards rather than norms |
|
Describe percentile ranks
|
Compares your score to the scores of others who took the test (local, state, or national norm group)
|
|
What are percentile ranks often confused with?
|
Percentage correct
|
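The difference between a percentile rank and percentage correct can be shown with a short sketch. It assumes one common convention, the percentage of norm-group scores strictly below the given score; published tests may use slightly different rules, and many report only ranks from 1 to 99. The norm group and the 40-item test are hypothetical.

```python
def percentile_rank(score, norm_group_scores):
    """Percentage of norm-group scores that fall below the given score."""
    below = sum(1 for s in norm_group_scores if s < score)
    return 100 * below / len(norm_group_scores)


def percent_correct(raw_score, number_of_items):
    """Percentage of test items answered correctly."""
    return 100 * raw_score / number_of_items


# Hypothetical raw scores of a 10-student norm group on a 40-item test.
norm_group = [20, 22, 25, 28, 30, 31, 33, 34, 36, 38]

# A student with a raw score of 34 answered 85% of the items correctly
# but scored higher than 70% of the norm group.
print(percent_correct(34, 40), percentile_rank(34, norm_group))
```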
|
How many percentiles are there?
|
99
|
|
What is the problem with percentile ranks?
|
Equal differences between percentile ranks do not necessarily indicate equal differences in achievement
|
|
What do standard scores allow for?
|
Comparison across subject areas
|
|
A difference of how many stanines indicates a real difference in achievement?
|
2 or more
|
|
How many portions is the normal curve divided into to determine stanines and how wide is each portion?
|
9 portions
each is 1/2 standard deviation wide |
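The two stanine cards above can be sketched in code. The sketch assumes scores have already been converted to z-scores (standard deviations from the mean), with stanine 5 centered on the mean. Stanines 2 through 8 are each 1/2 standard deviation wide, and stanines 1 and 9 take everything beyond the outer cut points. The function names are hypothetical.

```python
from bisect import bisect_right

# Cut points in standard deviation units between stanines 1|2, 2|3, ..., 8|9.
STANINE_CUTS = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]


def stanine(z_score):
    """Convert a z-score to a stanine from 1 to 9."""
    return bisect_right(STANINE_CUTS, z_score) + 1


def is_real_difference(stanine_a, stanine_b):
    """Apply the rule of thumb: 2 or more stanines apart is a real difference."""
    return abs(stanine_a - stanine_b) >= 2


reading = stanine(0.0)  # at the mean
math = stanine(1.0)     # one standard deviation above the mean
print(reading, math, is_real_difference(reading, math))
```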
|
What is a profile narrative report?
|
Data on the left, paragraphs on the right with the student's name
|
|
What is a press-on label report?
|
It lists scores only, printed on a sticker that can be attached to other documents. An ACT score label is an example.
|
|
What is a mastery report?
|
It shows the mastery of concepts in each subtest
|
|
When did the use of standardized tests begin?
|
Early 20th Century
|
|
What were standardized tests in the beginning? Batteries or single-subject?
|
Single-subject
|
|
Which is the most frequently used type of standardized test?
|
Achievement test batteries
|
|
What are the three advantages of an achievement test battery versus a single-subject test?
|
1.) Each subtest is coordinated with every other subtest
2.) They are less expensive and less time consuming to administer
3.) Each subtest is based on the same norm group |
|
When are single subject achievement tests and diagnostic achievement tests more appropriate?
|
When they are being used to identify specific areas of weakness for struggling students
|
|
What two things can a standardized aptitude test be used to measure?
|
Academic potential
Career choices |
|
What contribution did Alfred Binet make to aptitude tests?
Is his test individually or group administered? |
In 1905, this Frenchman created the Binet Achievement Test for French children to see whether or not they could go to school. Only students who did well on the test were able to attend school.
Individually administered |
|
What contribution did J.M. Cattell make to aptitude tests?
|
He introduced "mental tests"
|
|
What contribution did Lewis Terman make to aptitude tests?
Is his test individually or group administered? |
In 1916 he took Binet's test and adapted it for use with American children. He created the Stanford-Binet test.
Individually administered |
|
What contribution did A.S. Otis make to aptitude tests?
Is his test individually or group administered? |
He made the Otis-Lennon Mental Ability Test.
Group administered |
|
What contribution did David Wechsler make to aptitude testing?
Is his test individually or group administered? |
He created the WISC. It is shorter than the Stanford-Binet.
It is individually administered |
|
What was Binet's definition of intelligence?
|
It is a very general trait
|
|
What was Thorndike's definition of intelligence?
|
It involves specific traits. The more tasks you give, the more input you will receive about the person's intelligence.
|
|
What was Guilford's definition of intelligence?
|
It is 3-dimensional. The dimensions are product, content, and operations. There are 120 mental abilities that make up the intelligence.
|
|
What was Davis' definition of intelligence?
|
It is 2 dimensional. Made up of 3 types of tests (verbal, quantitative, spatial)
|
|
How do you determine an IQ? What is the formula?
|
(Mental Age / Chronological Age) × 100
|
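The ratio formula on this card can be worked through in a few lines. The function name and the example ages are hypothetical; both ages must be expressed in the same unit (years here).

```python
def ratio_iq(mental_age, chronological_age):
    """IQ as mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return mental_age / chronological_age * 100


# A hypothetical 8-year-old who performs like a typical 10-year-old.
above_average = ratio_iq(10, 8)
# A child whose mental age equals the chronological age scores 100.
average = ratio_iq(8, 8)
print(above_average, average)
```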
|
What is the newest terminology for IQ test?
|
Cognitive Skills Quotient
|