34 Cards in this Set

  • Front
  • Back
What is interesting about methods used to calculate reliability?
Each of the methods used to calculate reliability uses a slightly different definition of true score and error
What is something to keep in mind about the reliability coefficient?
Always keep in mind that the reliability coefficient is an estimate that can change depending on the method used to calculate it
How do you choose a method to estimate the reliability?
The method you choose to estimate the reliability should fit the way in which the test will be used.
What is test/retest reliability used for?
Utilized to measure the stability of scores over a period of time.
How is test/retest reliability used? How does it work?
-A test developer gives the same test to the same group of test takers at two different times.
-Scores on the 1st administration are compared to scores on the 2nd administration using correlational techniques.
-If individuals respond consistently from one administration to the next, the correlation between test scores will be high.
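The comparison described above can be sketched in a few lines of Python. This is only an illustration: the scores are made up, and `pearson_r` is a hypothetical helper name, not something from the card set.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical scores for the same five test takers at two administrations.
time1 = [10, 12, 15, 18, 20]
time2 = [11, 13, 14, 19, 21]

# A value near 1 means people kept roughly the same rank order over time.
test_retest_r = pearson_r(time1, time2)
print(round(test_retest_r, 3))
```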
What is one thing to remember about the interval between the administration of the two tests during test/retest reliability?
The interval between the administration of the two tests can be several hours or several years.
What are some common problems with test/retest reliability?
-You assume that the test takers have not changed over time, but they have.
-The magnitude of the reliability estimate changes as the time between administrations changes.
-Practice (carryover) effects could influence observations. Because this error can be attributed to practice, it becomes more systematic.
What is parallel forms reliability used for?
To eliminate errors.
Why do test makers use parallel forms reliability?
This type of reliability estimation relies on developing an alternate test form that is equivalent in content, response processes, and statistical characteristics, in order to account for some error.
How does parallel forms reliability work?
You are looking to achieve a “true score” for both tests. This comes from the fact that the tests are parallel in content and processes. If this is done correctly, the two tests will have identical distributions of scores.
How do you eliminate practice effects and other problems with test-retest reliability?
Test developers often give two highly similar forms of the test to the same people at almost the same time.
-To eliminate practice or transfer effects, half the participants take one form followed by the other, and the other half take the forms in reverse order. This is known as counterbalancing.
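A minimal sketch of counterbalancing in Python, assuming participants are split at random into two equal-sized groups. The function name, the form labels, and the participant IDs are all hypothetical.

```python
import random

def counterbalance(participants, seed=None):
    """Randomly assign half the participants to each form order."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    orders = {}
    for person in shuffled[:half]:
        orders[person] = ("Form A", "Form B")  # takes A first, then B
    for person in shuffled[half:]:
        orders[person] = ("Form B", "Form A")  # takes B first, then A
    return orders

# Hypothetical participant IDs; the seed only makes the example repeatable.
assignment = counterbalance(["P1", "P2", "P3", "P4", "P5", "P6"], seed=0)
for person, order in sorted(assignment.items()):
    print(person, order)
```

With an odd number of participants, this sketch puts the extra person in the second group.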
What are some common problems with equivalence reliability?
•Order effects: In what order are you going to give the tests?
•Question order: It can cue test takers to answer questions on the alternate form in specific ways.
•Creation: How do you make parallel tests? How do you make sure the tests are parallel?
What is something interesting about parallel testing forms?
The parallel test is only used for reliability purposes, and then discarded.
What is internal consistency?
It examines the degree to which the items on the test are all assessing the same thing.
In parallel testing, how can you tell if a construct is consistent?
Consistent constructs are assessed throughout the scale.
What is an example of internal consistency?
•Think of the tape measure:
-The first foot is the same length as the second foot.
-The fourth foot is the same length as the eleventh foot.
*The tape measure has internal consistency because it measures each foot as the same distance; therefore the same scale is established along its whole length.
Why is internal consistency beneficial?
-It eliminates the need to administer a test twice
-There is no need to go through the difficulty of creating a parallel test
-Eliminates carryover/practice effects.
-Eliminates changes that may occur in people over time
How does split half reliability work?
-Administer a test to a group of individuals.
-Split the test into two halves.
-Each half is an alternate form.
-Correlate the scores on one half with those on the other half.
-This correlation can be used to estimate the reliability of the test.
-Longer tests have higher reliability
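The steps above might look like this in Python. The sketch assumes an odd/even item split, which is only one of many possible splits, and the 0/1 response matrix is invented for illustration.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cross / sqrt(sum((a - mx) ** 2 for a in x) *
                        sum((b - my) ** 2 for b in y))

def split_half_correlation(responses):
    """Correlate odd-item totals with even-item totals.

    responses: one row per test taker, one column per item.
    """
    odd_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return pearson_r(odd_half, even_half)

# Hypothetical data: 5 test takers, 6 items scored 1 (right) or 0 (wrong).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
half_r = split_half_correlation(responses)
print(round(half_r, 3))
```

This correlation describes a test only half as long as the original, so it is not yet the reliability of the full test.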
How do you explain split half reliability?
We must make an adjustment to the correlation to determine the reliability
What do the variables stand for in the Spearman-Brown prophecy formula?
k = ratio by which your test length changes after splitting
(full # items you want / # items on which correlation was determined)
r = correlation between parts of the test
What does the Spearman-Brown prophecy formula do?
-This formula adjusts the correlation based upon the number of items in the “new” split tests and the number of items in the original test.
-You can also use this formula to predict what the reliability would be if you wanted to change the number of items on the test.
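The standard form of the Spearman-Brown formula is predicted reliability = k·r / (1 + (k − 1)·r), with k and r as defined on the previous card. A short Python sketch follows; the function name and the example values are assumptions for illustration.

```python
def spearman_brown(r, k):
    """Predicted reliability when test length changes by a factor of k.

    r: correlation (or reliability) observed on the current length
    k: desired number of items / number of items r was computed on
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return (k * r) / (1 + (k - 1) * r)

# Correcting a split-half correlation: each half has half the items, so k = 2.
full_test = spearman_brown(0.60, 2)

# Predicting the effect of tripling the length of a test with reliability .60.
tripled = spearman_brown(0.60, 3)

print(round(full_test, 3), round(tripled, 3))
```

Setting k below 1 predicts the reliability of a shortened test in the same way.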
What are some common problems with split-half reliability?
-We are not assessing the reliability of the test scores, but the reliability of scores on half the test, and then estimating.
-It is assumed reliability will always increase given more items. This is only true theoretically.
-How do you split the test? Which items?
-Are the items in the splits homogeneous or heterogeneous?
-Some “splits” give a much higher correlation than other “splits.”
What is Cronbach alpha (sometimes called coefficient alpha) used for?
Used for polytomously scored items
What is Kuder-Richardson 20 (KR-20) used for?
Used for dichotomously scored items
What is Kuder-Richardson 21 (KR-21) used for?
Lower bound of KR-20
Used when item statistics are unknown.
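A sketch of both Kuder-Richardson coefficients in Python, using their usual textbook formulas. KR-20 needs the proportion passing each item, while KR-21 needs only the number of items and the mean and variance of total scores. The data are invented, the function names are hypothetical, and population variance is an assumption made here for simplicity.

```python
from statistics import mean, pvariance

def kr20(responses):
    """KR-20 for items scored 0/1 (rows = test takers, columns = items)."""
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    # p = proportion answering each item correctly, q = 1 - p
    pq_sum = 0.0
    for item in zip(*responses):
        p = mean(item)
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / pvariance(totals))

def kr21(k, total_mean, total_variance):
    """KR-21: needs only item count, mean, and variance of total scores."""
    return (k / (k - 1)) * (
        1 - total_mean * (k - total_mean) / (k * total_variance)
    )

# Hypothetical data: 5 test takers, 6 dichotomous items.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
totals = [sum(row) for row in responses]
kr20_value = kr20(responses)
kr21_value = kr21(6, mean(totals), pvariance(totals))
print(round(kr20_value, 3), round(kr21_value, 3))
```

KR-21 treats every item as equally difficult, which is why it cannot exceed KR-20 on the same data.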
What was the Cronbach alpha family of internal consistency reliability coefficients created for?
In order to get around problems with split-half, a new family of reliability coefficients was developed.
Instead of splitting a test in half, these methods compare every item to every other item.
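One common way to compute coefficient alpha is from the item variances and the variance of total scores: alpha = k/(k − 1) × (1 − sum of item variances / variance of totals). A minimal Python sketch, with invented rating-scale data and population variance assumed throughout:

```python
from statistics import pvariance

def cronbach_alpha(responses):
    """Coefficient alpha (rows = test takers, columns = items)."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [pvariance(item) for item in zip(*responses)]
    total_variance = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical data: 5 test takers, 3 items rated on a 1-5 scale.
ratings = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 4, 5],
    [2, 2, 3],
    [1, 2, 1],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

When the items are scored 0/1, this calculation gives the same value as KR-20.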
What is good reliability according to Nunnally & Bernstein (1994)?
.70 is acceptable for early stages of research.
Basic research generally requires .80.
When important decisions are to be made, you should have a reliability coefficient of at least .90.
What is the most important thing to remember when interpreting reliability?
These are general guidelines. There are no gold standards for interpreting reliability.
When is lower reliability okay?
-You are making preliminary decisions
-You are sorting people into groups
-You are conducting some research
-You are making a classroom test
--Classroom tests generally test diverse content.
What are some factors that influence reliability?
-Length of test
-Pressure of time limit
-Heterogeneity of the group taking the test
-Level of difficulty of the tasks required by the test
Why do we use standard error of measurement?
To better understand what reliability is, we often use another statistic in conjunction with it.
What is the standard error of measurement?
-SEM is an index of the amount of inconsistency (or amount of expected error) in an individual’s score.
-Acts as another measure of variability in scores.
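The SEM is usually computed from the standard deviation of test scores and the reliability coefficient: SEM = SD × √(1 − reliability). A small Python sketch follows; the function name and the example numbers are assumptions for illustration.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM from the score standard deviation and a reliability coefficient."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)

# Hypothetical test: scores have SD = 15 and estimated reliability = .91.
sem = standard_error_of_measurement(15, 0.91)
print(round(sem, 2))
```

Higher reliability shrinks the SEM: perfect reliability gives an SEM of 0, and zero reliability gives an SEM equal to the SD.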
What is a confidence interval?
Range of scores that likely includes the true score
How can you affect a confidence interval?
-SEM can be used to construct a confidence interval (CI).
-CI can vary according to how confident you want to be.
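A common way to build the interval is observed score ± z × SEM, where z depends on the confidence level. The Python sketch below assumes normally distributed measurement error; the function name and the example score and SEM are hypothetical.

```python
from statistics import NormalDist

def confidence_interval(observed, sem, confidence=0.95):
    """Range around an observed score that likely includes the true score."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return observed - z * sem, observed + z * sem

# Hypothetical observed score of 100 on a test with SEM = 4.5.
low_95, high_95 = confidence_interval(100, 4.5, 0.95)
low_99, high_99 = confidence_interval(100, 4.5, 0.99)
print(round(low_95, 2), round(high_95, 2))
print(round(low_99, 2), round(high_99, 2))
```

Asking for more confidence widens the interval, so the 99% range is wider than the 95% range.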