28 Cards in this Set

  • Front
  • Back
Reliability coefficients range from
Reliability coefficients range from

0 to 1.0
What does classical test theory boil down to?
CLASSICAL TEST THEORY:

Reliability means
1) a test yields repeatable, consistent results
2) a test is reliable to the degree that your score reflects the true score on the test rather than error
What does a reliability coefficient of .90 indicate?
90% of the observed variability is due to true score differences and the remaining 10% is due to measurement error.
Which do you square to interpret?

a) the reliability coefficient
b) the correlation coefficient?
square the correlation coefficient to interpret it
i.e., to determine the proportion of variability that’s shared between 2 measures
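
As a rough illustration of the point above, a minimal Python sketch that correlates two made-up sets of scores and squares r to get the proportion of shared variance (all numbers are invented for illustration):

```python
# Hedged sketch: correlate two hypothetical measures, then square r
# to interpret it as the proportion of variance the measures share.
# The score values below are invented, not from any real data set.
import numpy as np

measure_a = np.array([10, 12, 15, 18, 20, 22, 25])
measure_b = np.array([11, 13, 14, 19, 21, 20, 26])

r = np.corrcoef(measure_a, measure_b)[0, 1]   # correlation coefficient
shared_variance = r ** 2                      # proportion of shared variability
print(f"r = {r:.2f}, shared variance = {shared_variance:.0%}")
```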
Test-retest reliability isn’t appropriate for…
attributes that change over time, e.g., mood.
How do you derive an alternate forms reliability coefficient?
ALTERNATE FORMS RELIABILITY COEFFICIENT

administer 2 alternate forms of a test to the same group of examinees (Form A at time 1, then Form B at time 2), then obtain the correlation between the 2 sets of scores. So, everyone completes Form A and Form B.

reduces practice effects
What does Internal Consistency Reliability measure?
measures the correlations among individual items in a test
What are the 3 different methods of determining the coefficient of internal consistency?
1) Split-half
2) Cronbach’s coefficient alpha (for items scored on multiple points, e.g., Likert scales)
3) Kuder-Richardson formula 20 (for dichotomously scored right/wrong items)
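
For the second method above, a minimal sketch of Cronbach’s coefficient alpha computed directly from its definition on a small invented data set (rows are examinees, columns are Likert-type items):

```python
# Hedged sketch of Cronbach's alpha: alpha = (k / (k - 1)) *
# (1 - sum of item variances / variance of total scores).
# The ratings below are made up purely for illustration.
import numpy as np

scores = np.array([
    [4, 5, 4, 3],
    [3, 3, 4, 2],
    [5, 5, 5, 4],
    [2, 3, 2, 2],
    [4, 4, 5, 3],
])

k = scores.shape[1]                          # number of items
item_vars = scores.var(axis=0, ddof=1)       # variance of each item
total_var = scores.sum(axis=1).var(ddof=1)   # variance of examinees' total scores
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```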
What’s the kappa coefficient for?
a measure of inter-rater reliability
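
A minimal sketch of Cohen’s kappa computed from its definition (observed agreement corrected for chance agreement); the two raters’ judgments below are invented:

```python
# Hedged sketch of Cohen's kappa: (p_observed - p_chance) / (1 - p_chance).
# Ratings are invented for illustration only.
from collections import Counter

rater_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
rater_2 = ["yes", "no",  "no", "no", "yes", "no", "yes", "yes"]

n = len(rater_1)
p_observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n

# chance agreement expected from each rater's marginal proportions
counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
p_chance = sum((counts_1[c] / n) * (counts_2[c] / n)
               for c in set(rater_1) | set(rater_2))

kappa = (p_observed - p_chance) / (1 - p_chance)
print(f"kappa = {kappa:.2f}")   # 1.0 = perfect agreement, 0 = chance-level
```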
The standard error of measurement is different than a reliability coefficient in that…
the standard error of measurement
is used to determine the
CONFIDENCE INTERVAL
for an INDIVIDUAL test score.
Whereas
the reliability coefficient represents how much error a whole TEST contains
How do you calculate a 68, 95, and 99% CI given a standard error of the measurement of 4.0, and a score of 110 on an IQ test?
68%CI = Score +/- 1x(Stan Err of Measurement)
95%CI = Score +/- 2x(Stan Err of Measurement)
99%CI = Score +/- 2.5x(Stan Err of Measurement)

68%CI = 110 +/- 4 = 106-114
95%CI = 110 +/- 8 = 102-118
99%CI = 110 +/- 10 = 100-120
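
The same arithmetic as the card above, as a short sketch (the score of 110 and SEM of 4.0 come from the card; the 1, 2, and 2.5 multipliers are the card’s approximations of 1.0, 1.96, and 2.58):

```python
# Hedged sketch: confidence intervals around an individual test score,
# using the approximate multipliers from the flashcard above.
observed_score = 110
sem = 4.0

for level, multiplier in [(68, 1), (95, 2), (99, 2.5)]:
    lower = observed_score - multiplier * sem
    upper = observed_score + multiplier * sem
    print(f"{level}% CI: {lower:g} to {upper:g}")
```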
How does a decrease in variability of scores impact reliability coefficient of a test?
it DECREASES reliability
Floor effects result from too many
a) easy items
b) difficult items
FLOOR EFFECT =
too many difficult questions
A ceiling effect results from too many
a) easy items
b) difficult items
CEILING EFFECT =
too many easy questions
How is content validity different than construct validity?
CONTENT validity asks
”how much does this test adequately and representatively sample the content area?”
and it’s based on careful judgment & selection of items covering all content domains, and/or good convergent or criterion-related validity

Construct validity asks
”how much does this test measure a theoretical construct or trait?”
and it’s assessed over time as data accumulate and tests of convergent/divergent validity and/or factor analyses are conducted.
Criterion-related validity is…
the correlation between the predictor (e.g., the SATs) and the criterion (what it’s supposed to predict, e.g., college GPA)
What’s the difference b/w concurrent validity and predictive validity?
concurrent = predictor and criterion scores are collected concurrently.

predictive = predictor scores are collected first and criterion data are collected later.
The multitrait-multimethod matrix is one way of assessing a test’s….
convergent and divergent validity

that is, how much a given test correlates with another test that measures the same construct
and
how much it doesn’t correlate with a test designed to measure another construct
What’s a factor analysis for?
it measures the degree to which a set of tests are all measuring the same underlying factor, or construct.
In factor analysis, what is a factor loading?
it’s a particular test’s correlation with a particular factor that’s been found in the factor analysis.
In factor analysis, what’s the purpose of rotating factors?
What are the two types of rotations?
When is it done?
It facilitates the interpretation of the analysis.
orthogonal rotation (resulting in uncorrelated factors), and oblique (resulting in correlated factors)
factors are rotated as the final step in a factor analysis.
What’s the difference between the standard error of the estimate vs the standard error of the measurement?
the standard error of the ESTIMATE tells us how much error is in the estimated/predicted CRITERION score (e.g., if using SATs to predict GPA, this tells us how off our prediction of GPA might be)

standard error of the MEASUREMENT tells us how much error is in the TEST score itself (e.g., where an examinee’s TRUE test score is likely to fall on the SATs)
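
A minimal sketch of the ESTIMATE side of that contrast, using the standard formula SE(est) = SD of the criterion × √(1 − r²); the SAT/GPA numbers are invented for illustration:

```python
# Hedged sketch: standard error of the ESTIMATE, i.e., how far off a
# predicted criterion score (GPA) is likely to be. Values are made up.
import math

sd_gpa = 0.60        # SD of the criterion (college GPA), illustrative
r_sat_gpa = 0.50     # predictor-criterion correlation, illustrative

see = sd_gpa * math.sqrt(1 - r_sat_gpa ** 2)
print(f"standard error of the estimate = {see:.2f} GPA points")
```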
How do you calculate a standard error of the measurement?
Standard Error of the Measurement =

SD × √(1 − reliability coefficient)
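
And the MEASUREMENT side, plugged straight into the formula above (the SD of 15 and reliability of .89 are illustrative values, roughly IQ-scale-like):

```python
# Hedged sketch: standard error of measurement = SD * sqrt(1 - reliability).
# The SD and reliability values below are illustrative assumptions.
import math

sd = 15.0
reliability = 0.89

sem = sd * math.sqrt(1 - reliability)
print(f"SEM = {sem:.2f}")   # about 5 score points
```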
When the criterion-related validity of a test is moderated by a variable, the test is said to have…
differential validity
What’s shrinkage?
SHRINKAGE

the reduction that occurs in a criterion-related validity coefficient upon cross-validation (i.e., when a test is developed and validated with an initial sample, and then tested again using only the retained items within a second sample)
What is an EIGENVALUE?
EIGENVALUE

1) applies to factor analysis,

2) is used to describe the FACTORS, not the particular tests

3) it is the amount of variance across all tests that is accounted for by the factor. In other words, an eigenvalue tells us how significant or big each factor is.
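
As a rough numerical illustration, a sketch that extracts eigenvalues from a made-up correlation matrix for four tests (a principal-components-style extraction; every value below is invented):

```python
# Hedged sketch: eigenvalues of a correlation matrix among four tests.
# Each eigenvalue is the amount of variance across all tests accounted
# for by one factor; dividing by their sum gives the proportion.
import numpy as np

corr = np.array([
    [1.00, 0.70, 0.10, 0.05],
    [0.70, 1.00, 0.05, 0.10],
    [0.10, 0.05, 1.00, 0.65],
    [0.05, 0.10, 0.65, 1.00],
])

eigenvalues = np.linalg.eigvalsh(corr)[::-1]   # sorted largest first
print(eigenvalues)                             # two "big" factors, two small
print(eigenvalues / eigenvalues.sum())         # proportion of total variance each explains
```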
If for a particular item, the p value is .80, what does this mean?
80% of test takers get the question RIGHT
How hard should test items be to maximize their value in discriminating between high and low scoring test takers?
moderately difficult, p=.50
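
A final sketch tying the last two cards together: item difficulty (p, the proportion answering correctly) and a simple discrimination index (p in the top-scoring group minus p in the bottom-scoring group); the right/wrong data are invented:

```python
# Hedged sketch: item difficulty and a simple discrimination index
# for one item, using invented 1 = right / 0 = wrong responses.
top_group    = [1, 1, 1, 0, 1, 1, 1, 0, 1, 1]   # examinees with high total scores
bottom_group = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]   # examinees with low total scores

n_total = len(top_group) + len(bottom_group)
p_overall = sum(top_group + bottom_group) / n_total
discrimination = sum(top_group) / len(top_group) - sum(bottom_group) / len(bottom_group)

print(f"p = {p_overall:.2f}, discrimination index = {discrimination:.2f}")
```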