34 Cards in this Set
- Front
- Back
Describe
Measurement Validity |
-about a specific aspect/measure (whereas internal & external validity are about an entire study)
-are you measuring what you mean to be measuring? |
|
Describe
Reliability |
consistency of a series of measurements;
-IF our outcome measure doesn't provide reliable data, then we can't accurately assess our results |
|
T/F You can have a measure that is valid but not reliable
|
FALSE; but you can have reliability w/o validity
|
|
What is an observed score?
|
any score that we obtain from any individual on a particular instrument
The X in X = T ± E (true score ± error) |
|
Reliability gets talked about in terms of ______ and validity gets talked about in terms of ________
|
Reliability - numbers
Validity - adjectives |
|
What is the Standard Error of Measurement?
|
SEM = s√(1 - r)
s = standard deviation; r = reliability coefficient |
|
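A minimal Python sketch of this formula; the SD and reliability values below are hypothetical:

```python
# Minimal sketch: standard error of measurement, SEM = s * sqrt(1 - r).
import math

def sem(sd: float, reliability: float) -> float:
    """SEM from the standard deviation and the reliability coefficient."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical values: SD = 15, reliability r = .91
print(sem(15, 0.91))  # 4.5 -> expected error around any observed score
```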
Establishing Reliability:
Types of tests |
Test-Retest
Parallel Forms
Internal Consistency
Interrater (Interobserver) |
|
Describe
Test-Retest Reliability |
Coefficient of Stability; retest after a long period
If the reliability coefficient is high, then a test has good test-retest reliability -caveat: carry-over effects |
|
What is the
Correlation Coefficient? |
Ranges from -1 to 1; usually used to evaluate reliability
|
|
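A minimal sketch of this idea applied to test-retest data, with hypothetical scores from two administrations:

```python
# Minimal sketch: test-retest reliability as a Pearson correlation
# between two administrations of the same test (hypothetical scores).
from statistics import correlation  # Python 3.10+

time1 = [12, 15, 9, 20, 18, 14]   # first administration
time2 = [13, 14, 10, 19, 17, 15]  # same examinees after the retest interval

r = correlation(time1, time2)  # coefficient of stability, ranges -1 to 1
print(round(r, 3))             # close to 1 => good test-retest reliability
```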
Describe
Parallel Forms Reliability |
-developed to try to reduce carryover effects of retesting
-a parallel form is used (coefficient of equivalence) |
|
Describe
Internal Consistency Reliability |
examines whether an instrument is consistent in measuring a single concept or construct
Types:
Split-Half
Kuder-Richardson 20
Cronbach's Alpha |
|
Describe
Split-Half Method |
Used to establish internal consistency
-uses the Spearman-Brown formula -involves correlating 2 halves of the same test (responses for 1st half vs. 2nd half; or randomly sample parts of the test) |
|
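A minimal sketch of the split-half method, assuming the two-half form of the Spearman-Brown formula, r_sb = 2r / (1 + r); the half-test totals are hypothetical:

```python
# Minimal sketch: split-half reliability with the Spearman-Brown correction,
# r_sb = 2r / (1 + r), where r correlates the two half-test totals.
from statistics import correlation  # Python 3.10+

first_half = [8, 11, 6, 14, 12, 9]    # hypothetical half-test totals
second_half = [7, 12, 6, 13, 11, 10]  # for the same six examinees

r_halves = correlation(first_half, second_half)
r_sb = (2 * r_halves) / (1 + r_halves)  # corrects for the halved test length
print(round(r_sb, 3))
```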
Describe
Kuder-Richardson |
method used to establish internal consistency
-used to determine how all of the items are related to each other; only for tests w/ 2 options/dichotomous |
|
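A minimal sketch, assuming the standard KR-20 form and hypothetical 0/1 responses:

```python
# Minimal sketch: Kuder-Richardson 20 for dichotomous (0/1) items,
# KR-20 = (k / (k - 1)) * (1 - sum(p*q) / variance of total scores).
from statistics import pvariance

data = [          # hypothetical responses: rows = examinees, columns = items
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
k = len(data[0])
p = [sum(col) / len(data) for col in zip(*data)]   # proportion correct per item
pq = sum(pi * (1 - pi) for pi in p)                # summed item variances
var_total = pvariance([sum(row) for row in data])  # variance of total scores
print(round((k / (k - 1)) * (1 - pq / var_total), 3))
```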
Describe
Cronbach's Alpha |
method used to establish internal consistency reliability
-used when test items have multiple response options, such as a Likert scale -most commonly used index of reliability in educational & psychological research |
|
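A minimal sketch, assuming the usual alpha formula and hypothetical Likert data:

```python
# Minimal sketch: Cronbach's alpha for multi-option (e.g., Likert) items,
# alpha = (k / (k - 1)) * (1 - sum(item variances) / variance of totals).
from statistics import pvariance

data = [          # hypothetical 5-point Likert responses, rows = respondents
    [4, 5, 4, 3],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 3, 3, 2],
    [4, 4, 5, 4],
]
k = len(data[0])
item_vars = sum(pvariance(col) for col in zip(*data))
total_var = pvariance([sum(row) for row in data])
print(round((k / (k - 1)) * (1 - item_vars / total_var), 3))
```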
If something has a reliability of -.75 what do you say?
How about 1.2? |
Throw them both out; a reliability coefficient should be between 0 and 1.
|
|
Can you adequately assess the reliability of a measure if only Cronbach's alpha is provided but nothing else?
|
Nope; we need to know the number of dimensions or another index of reliability
|
|
Describe
Interrater (Interobserver) Reliability |
When observation is the method of collecting data, this is used to establish reliability
Types:
Percentage Agreement Methods
Intraclass Correlation Coefficients
Kappa |
|
Describe
Percentage Agreement Methods |
Involves having 2 or more raters, prior to the study, observe a sample of behaviors similar to what would be observed in the study
-problem: observers must agree a behavior was elicited a number of times, but that doesn't mean they are right -use a point-by-point method to establish reliability |
|
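A minimal sketch of point-by-point percentage agreement on hypothetical codes:

```python
# Minimal sketch: point-by-point percentage agreement between two raters.
# Hypothetical interval codes: 1 = behavior observed, 0 = not observed.
rater_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
rater_b = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

agreements = sum(a == b for a, b in zip(rater_a, rater_b))
print(f"{100 * agreements / len(rater_a):.0f}% agreement")
# high agreement does not mean the raters are right
```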
Describe
Intraclass Correlation Coefficients |
Data needs to be interval (like degree of cooperation)
type of interobserver reliability |
|
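A minimal sketch on hypothetical ratings; the one-way ICC(1,1) form is assumed here for illustration, and other ICC forms exist:

```python
# Minimal sketch: a one-way intraclass correlation (the ICC(1,1) form is
# assumed here): ICC = (MSB - MSW) / (MSB + (k - 1) * MSW).
ratings = [   # hypothetical interval ratings: rows = subjects, cols = raters
    [5, 6, 5],
    [2, 3, 2],
    [7, 7, 6],
    [4, 4, 5],
]
n, k = len(ratings), len(ratings[0])
grand = sum(sum(row) for row in ratings) / (n * k)
means = [sum(row) / k for row in ratings]

msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)  # between subjects
msw = sum((x - m) ** 2 for row, m in zip(ratings, means) for x in row) / (n * (k - 1))
print(round((msb - msw) / (msb + (k - 1) * msw), 3))
```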
Describe
Kappa |
type of interobserver reliability
-used when data is nominal -data is often dichotomous |
|
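A minimal sketch of Cohen's kappa, reusing the hypothetical codes from the percentage-agreement example:

```python
# Minimal sketch: Cohen's kappa for two raters and nominal (here dichotomous)
# codes; corrects agreement for chance: kappa = (p_o - p_e) / (1 - p_e).
rater_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
rater_b = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]
n = len(rater_a)

p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n  # observed agreement
p_e = sum(                                               # chance agreement
    (rater_a.count(c) / n) * (rater_b.count(c) / n)
    for c in set(rater_a) | set(rater_b)
)
print(round((p_o - p_e) / (1 - p_e), 3))
```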
The bigger my reliability (r), the smaller my error; the less reliable my measure, the greater my error.
--> what do we do to estimate the range of expected scores? |
Make a confidence interval
|
|
What is the equation for
Confidence Interval |
X ± (SEM)(Z)
Z = z-score; SEM = SD√(1 - reliability coefficient); X = observed score |
|
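A minimal sketch combining the SEM and this interval, with hypothetical values:

```python
# Minimal sketch: confidence interval around an observed score,
# X +/- Z * SEM, with SEM = SD * sqrt(1 - r).
import math

def score_ci(x, sd, reliability, z=1.96):
    """Range of expected scores around observed score x (95% CI by default)."""
    sem = sd * math.sqrt(1 - reliability)
    return x - z * sem, x + z * sem

# Hypothetical values: observed score 100, SD 15, reliability .91
print(score_ci(100, 15, 0.91))  # ~ (91.2, 108.8)
```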
Define
Generalizability Theory |
extension of Classical Test Theory
-lets the investigator estimate the different components of measurement error |
|
Define
Item Response Theory |
Has the researcher separate test characteristics from participant characteristics
-provides info about reliability as a function of ability instead of averaging over all ability levels |
|
What are the 5 types of evidence that support the validity of a test/measure?
|
Content
Response Processes
Internal Structure
Relations to other variables
Consequences of testing |
|
Evidence based on the
Content of the Measure |
-ecological validity (is the task the same as/similar to the ability being measured)
-does an instrument accurately represent the major aspects of a concept |
|
Evidence based on the
Response Processes |
-extent to which the types of participant responses match the intended construct
-includes examination of responses of observers, raters, or judges to determine appropriateness of the criteria |
|
Evidence based on
Internal Structure |
-analyses such as factor analysis or differential item functioning are useful
|
|
Describe
Factor Analysis |
looks at the factors being measured; can be used to provide evidence based on internal structure when a construct is complex
|
|
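A minimal sketch on simulated data; scikit-learn's FactorAnalysis is an assumed tool choice here, not the only option:

```python
# Minimal sketch: exploratory factor analysis on simulated item scores;
# scikit-learn's FactorAnalysis is one possible tool among several.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 1))  # one underlying construct
items = latent @ np.ones((1, 4)) + rng.normal(scale=0.5, size=(100, 4))

fa = FactorAnalysis(n_components=1).fit(items)
print(fa.components_)  # similar loadings across items => one coherent factor
```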
Evidence based on
Relations to other Variables |
includes the categories of criterion-related validity & construct validity
-test-criterion relationships
-predictive-criterion evidence
-concurrent-criterion evidence
-convergent & discriminant evidence
-validity generalization |
|
Define
Constructs |
hypothetical concepts that can't be observed directly (depression, etc)
|
|
Can one study demonstrate validity for a set of measures for several different populations & different purposes?
|
Nope. Need several studies.
|
|
Evidence based on
Consequences of Testing |
includes both positive & negative, anticipated & unanticipated consequences of measurement
|
|
Define
Convergent & Discriminant Evidence |
convergent - evidence that something measures what it is intended to
discriminant - tests whether constructs that shouldn't be related are in fact measured as unrelated |