
46 Cards in this Set

  • Front
  • Back
Dependable, stable, consistent research is __________.
reliable
Correct, truthful research that measures what it is supposed to measure is ______________.
valid
Issues of _________ and __________ are at the heart of research and clinical practice.
reliability and validity
If something is valid, is it reliable?
yes
If something is reliable, is it valid?
not necessarily
What are the 2 ways to think about reliability?
-reliability of observations/measurements of a dependent measure in research, and of measures from clinical data
-reliability of standardized tests
true score + error score =
observed score
what is the TRUE SCORE?
the accurate performance of an individual, which we may never know; the subject's recorded response may or may not reflect this true performance
What is the error score?
the part of the observed score not accounted for by the TRUE score; it is made up of method error + trait error
What does method error include? (fyi - researcher has control over these)
-the way the score was obtained (tools, tests, measurement device --- if an audiometer was not calibrated, counting seconds without timer...)
-procedures used - standard vs. nonstandard
-definitions used to measure behavior/performance (must be operational)
What does trait error include?
-characteristics of the person at the time of testing (providing a questionnaire beforehand can reduce this problem)
What 2 ways are there to establish reliability?
-intra-judge
-inter-judge
Which should be higher: inter- or intra-judge reliability? AND Why?
intra-judge, because standards, backgrounds, and experience differ between individuals
What is intra-judge reliability? and how long should a researcher wait before doing this test again?
-how consistent or accurate am I when I measure it another time?
-consistency with self

-wait 3-4 weeks to do test again
What is inter-judge reliability?
-select someone else to make same measurement on same data (how well does someone else agree with your data?)
What is the percentage of agreement or unit-by-unit agreement minimums for intra- AND inter-judge reliability?
-intra-- 90% minimum
-inter-- 80% minimum
What are characteristics of inter-judge reliability? (8)
-reliability depends on judges and their ability to rate characteristics
-may need to train judges
-some time should elapse between ratings
-reliability is conducted on a small portion of data (10, 15, 20%)
-several ways measurement reliability can be improved
-find judges who know what to measure
-reliability will improve if you rate one piece at a time
-give judges unlimited time to go over measurements (can go over multiple times)
True or false: If a computer is doing the judgments, reliability tests are not needed.
True
What are the 5 ways to document measurement reliability?
1. overall percentage of agreement
2. correlational
3. unit-by-unit agreement index
4. kappa statistic
5. difference scores
What is overall percentage of agreement?
-general agreement among judges
-not as accurate, more of a disguise
-very easy to calculate
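A minimal sketch of overall percentage of agreement, computed here from each judge's total count (the data are hypothetical). Comparing totals is what makes this method "more of a disguise": the judges could have flagged entirely different items and still appear to agree.

```python
# Overall percentage of agreement from each judge's TOTAL count
# (hypothetical data; judges may have flagged different items).
def overall_agreement(total_a, total_b):
    return 100 * min(total_a, total_b) / max(total_a, total_b)

# Judge A counted 48 disfluencies in a sample; Judge B counted 50.
print(overall_agreement(48, 50))  # 96.0
```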
What is correlational reliability?
-What is the relationship between 2 variables?
-how are they changing relative to one another?
-most deceiving way to reflect reliability
-simple to calculate
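A Pearson correlation sketch on hypothetical scores shows why correlation is the "most deceiving" way to reflect reliability: a judge who consistently scores two points higher than another still produces a perfect r of 1.0.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two sets of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x)
                      * sum((b - my) ** 2 for b in y))

judge_a = [10, 14, 12, 18]
judge_b = [12, 16, 14, 20]          # consistently 2 points higher
print(pearson_r(judge_a, judge_b))  # 1.0 -- perfect correlation, yet the
                                    # judges never assign the same score
```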
What is unit-by-unit agreement?
-agreements ÷ (agreements + disagreements) × 100
-looks at each specific point of reference
-more accurate; being off by even a little on a unit counts as a disagreement, so insufficient agreement is not hidden
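The unit-by-unit formula, sketched with hypothetical point-by-point ratings. Note that both judges can report the same total and still disagree on individual units, which is exactly what overall percentage of agreement hides.

```python
def unit_by_unit_agreement(judge_a, judge_b):
    """agreements / (agreements + disagreements) * 100"""
    agreements = sum(a == b for a, b in zip(judge_a, judge_b))
    disagreements = len(judge_a) - agreements
    return 100 * agreements / (agreements + disagreements)

# Both judges marked 3 of 5 units as errors, but not the SAME units:
a = [1, 1, 0, 0, 1]
b = [1, 0, 1, 0, 1]
print(unit_by_unit_agreement(a, b))  # 60.0 (totals alone would suggest 100%)
```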
What is kappa statistic?
-expresses agreement in terms of the chance that people agreed or disagreed that something was or was not present
-uncommon in SLP
-takes a lot of time to calculate
-most precise measurement
-tells when something is present and when something is not present
-takes guessing into account
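A sketch of Cohen's kappa for two judges making a binary present/absent judgment (the ratings are hypothetical). The expected-agreement term is what "takes guessing into account": agreement that could occur by chance is subtracted out.

```python
def cohens_kappa(judge_a, judge_b):
    """Cohen's kappa for binary (1 = present, 0 = absent) ratings."""
    n = len(judge_a)
    p_observed = sum(a == b for a, b in zip(judge_a, judge_b)) / n
    # agreement expected by chance, from each judge's marginal rates
    pa, pb = sum(judge_a) / n, sum(judge_b) / n
    p_chance = pa * pb + (1 - pa) * (1 - pb)
    return (p_observed - p_chance) / (1 - p_chance)

a = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
b = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
print(round(cohens_kappa(a, b), 3))  # 0.583
```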
What is the difference score method?
-calculates difference in the 2 measures across data and reports mean and standard deviation difference between the measures
-good for when we allow a little leeway in judgment of fine measurements
-good for time and distance measures
-takes time to calculate
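A sketch of the difference-score method on hypothetical duration measurements in seconds; the mean and standard deviation of the judge-to-judge differences are what get reported.

```python
from statistics import mean, stdev

def difference_scores(measure_a, measure_b):
    """Mean and SD of the differences between two judges' measures."""
    diffs = [a - b for a, b in zip(measure_a, measure_b)]
    return mean(diffs), stdev(diffs)

# Hypothetical vowel durations (seconds) measured by two judges:
judge_a = [1.20, 0.95, 1.40, 1.10]
judge_b = [1.25, 0.90, 1.35, 1.15]
m, sd = difference_scores(judge_a, judge_b)
print(round(m, 3), round(sd, 3))  # small mean difference and small spread
                                  # suggest the leeway allowed is acceptable
```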
In terms of reliability of tests, what are the 4 issues?
-test-retest
-parallel or alternate form
-split half
-standard error of measurement
What is test-retest reliability?
Do you get same scores when tested at a later time?
What is parallel or alternate form reliability?
-Does another version of the same test (e.g., the test with different pictures on the 2nd version) yield the same scores?
What is split half reliability?
-Is the 1st half of the test the same difficulty as the 2nd half (unless it is meant to get harder)?
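One common way to compute split-half reliability (an illustrative assumption here, since the cards don't give the procedure) is to correlate odd-item scores with even-item scores across subjects, then apply the Spearman-Brown correction, because each half is only half as long as the full test.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two sets of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x)
                      * sum((b - my) ** 2 for b in y))

def split_half_reliability(item_scores):
    """Odd/even split with Spearman-Brown correction."""
    odd = [sum(s[0::2]) for s in item_scores]   # items 1, 3, 5, ...
    even = [sum(s[1::2]) for s in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)            # Spearman-Brown prophecy

# One row per subject, one 0/1 column per test item (hypothetical):
scores = [
    [1, 1, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 1],
]
print(round(split_half_reliability(scores), 3))  # 0.625
```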
What is standard error of measurement reliability?
How well can we predict where the person is on the normal curve based on their score?
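The textbook formula for the standard error of measurement (an assumption here, as the cards don't state it) is SEM = SD × √(1 − reliability coefficient); a sketch:

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * sqrt(1 - reliability)

# Hypothetical test: SD = 15, test-retest reliability = .91
sem = standard_error_of_measurement(15, 0.91)
print(round(sem, 1))  # 4.5 -> a retest score is likely to fall
                      # within about +/- 1 SEM of the obtained score
```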
Why is validity important?
shows that measures, procedures, tasks, etc. used in a study or in a test are accurate or a true reflection of what is measured/tested
What are the 2 ways to look at validity?
-validity of research outcomes and measures. Can we trust the results?
-validity of tests
True or false: Validity of a test has to be proven.
False, contents of the study speak for themselves. Others can challenge validity of the measures and interpretation of the results.
A strong rationale underlying the purpose and methods of the study improves _________.
validity
What are some questions that could be asked regarding a test's validity?
-did the dependent variable change as a result of the independent variable?
-are there alternative explanations for the results? (the conditions we put participants in are what gave us our data, not other variables)
What are the 3 major types of validity of tests?
-content
-concurrent or criterion
-construct
What is content validity?
-extent to which the items on the test or program represent the universe of possible items for that test (or program)
-the content of a Tx program (what does it involve?)
Is the evaluation of content validity usually subjective or objective?
-subjective, it is based on experts in a field of study
-test manual should state how content validity was determined
What is criterion or concurrent validity?
-how well a test estimates performance based on some criterion
-how well a test predicts performance
What is estimation of criterion/concurrent validity?
a screening test estimates performance on the full-range test
What is prediction of criterion/concurrent validity?
use one criterion to predict performance (e.g. ACT, SAT -- predict how well someone will do in college)
What is construct validity?
-extent to which a test, a measurement device, or program measures a theoretical or hypothetical set of related variables (i.e. how well does it really test, measure, or produce what it is supposed to?)
What is construct validity in relation to tests?
-can be related to differentiating between who has a disorder/problem and who doesn't (e.g. PPVT -- how was construct validity determined? -- receptive vocab is a measure of intelligence)
What is construct validity in relation to Tx programs?
-see if procedures match empirical data, theories, or current knowledge in the field (e.g. teaching reading, treating articulation disorders)
True or false: construct validity is related to content validity.
True
What are the 3 aims of construct validity?
1. show who does and who does not have a disorder or condition, if what is being studied is concerned with assessment.
2. show that the content validity contains as much as possible of what is being studied (the more content, the stronger the test)
3. provide evidence that an assessment tool measures what it is supposed to measure or a Tx program changes what it is supposed to change
How do you test a test's construct validity?
look at the relationship (correlation) between the assessment tool you are using and another assessment tool that tests the same construct