68 Cards in this Set

  • Front
  • Back

What is reliability?

Measurement concerned with consistency, dependability, and reproducibility.

What does an unreliable score imply?

An unreliable score implies that the scores are affected by sources of error, i.e., measurement error.

Is it possible for a test to have total and absolute reliability?

No. There is always some level of inconsistency (measurement error).

What part of a test does reliability measure?

Reliability is concerned only with the scores, not with the test itself.

What is measurement error?

Measurement error is the portion of score variance that is related to the measurement process itself, i.e., to things that are not relevant to what is being measured.

What does more measurement error mean for reliability?

More measurement error makes for less reliability, so the two are negatively correlated.

What is a true score?

A 100% pure and accurate representation of the client's skills or abilities on a test. This would occur only if there were no errors; it is impossible to achieve.

What is the observed score?

The observed score is what we actually get from a client after testing: the true score plus measurement error.

What is the standard error of measurement (SEM)?

SEM is a measurement of how much the observed score differs from the true score.

What is time sampling error?

Time sampling error is seen when repeating the same test with a client over time.

How are intelligence tests affected by time sampling error?

Time sampling error doesn't affect intelligence tests much, because intelligence doesn't change much over time.

What kinds of test constructs are more susceptible to time sampling error

Tests that measure mood or achievement are susceptible to time sampling error as they are likely to change over time.

What is the carry-over effect?

The carry-over effect is a time sampling error that occurs when there isn't enough time between tests (normally a day or two): the experience of the first session can affect the scores of the second session.

What is the practice effect?

The practice effect occurs when we are measuring something that can improve with practice (e.g., math ability) and the testing sessions are too close together.

What are learning or maturation effects in regard to time sampling error?

These occur when testing sessions are too far apart: a client may change or grow as a person, or respond to therapeutic interventions, and the resulting change in their scores affects reliability.

What is content sampling error?

When the test items, i.e., the content of the test, are not good representations of the construct the test measures.

What is the largest source of measurement error?

Content sampling error

What is interrater reliability error?

When two different raters are needed for their subjective rating of a client, they may perceive different aspects of the client differently.

What kind of quality should test items have

They should be clear and unambiguous.

How does test length relate to reliability?

The longer the test, the more reliable it is, so more test items are generally better.

What is a reliability coefficient?

The ratio of true score variance to observed score variance.

Does the reliability coefficient relate to a group of scores or an individual score?

The reliability coefficient relates to a group of scores, not an individual score.

What is a perfect reliability coefficient and what does it mean?

+1.00: the test is a perfect measure of the true score that could be derived from a client. The higher the reliability coefficient, the more of the variance in scores across test takers is due to real differences between test takers.

What does a reliability coefficient of 0.35 mean?

It means that 35% of the variation in scores is due to real differences between test takers, and 65% is attributable to random chance or error.
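The variance ratio behind this interpretation can be sketched numerically. This is an illustrative Python snippet (Python is not part of the card set); the variance values are invented so the ratio comes out to 0.35:

```python
# Illustrative sketch: the reliability coefficient as the ratio of
# true-score variance to observed-score variance.
# The variance values are invented so the ratio comes out to 0.35.
true_score_variance = 35.0
error_variance = 65.0
observed_variance = true_score_variance + error_variance  # true + error

reliability = true_score_variance / observed_variance
print(reliability)        # share of variance due to real differences
print(1 - reliability)    # share due to random chance or error
```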

What is the test-retest method for measuring reliability?

Test and retest the same person on the same test at different times. The correlation obtained is called the coefficient of stability.

What is the coefficient of stability?

The coefficient of stability is the correlation drawn from the test-retest method. It reflects the stability of the score over time.

What sampling error is the test-retest method measuring for?

The test-retest method measures time sampling error. It can be affected by the carry-over effect (the time between tests is too short) or by learning/practice/maturation/therapeutic-intervention effects (the time between tests is too long).

What kind of test would not work well with a test-retest method?

A test-retest method would not work well for a construct that is not stable, like mood or emotional state. Stable constructs (like intelligence) test well with the test-retest method.

What is alternate form or parallel form reliability?

Alternate (parallel) form reliability is when we test two equivalent forms of a test, e.g., testing an alternative form of the ASI (the ASI-Lite) against the standard ASI. The two forms must cover the same content domain/construct.

What are the two types of parallel/alternate form administration?

Simultaneous and delayed.

What is the simultaneous administration of alternate forms?

Both tests are given simultaneously, at the same time on the same day. This reduces the memory and practice issues seen with test-retest.

what is the delayed administration of alternative/parallel forms

the delayed administration is when we give the 2 equivalent tests at different times.

What kind of sampling error are alternate/parallel forms sensitive to?

Content sampling error.

What kind of coefficient do alternate/parallel forms give us when calculating reliability?

This is called the coefficient of equivalence.

What is the coefficient of equivalence?

It tells us how well two forms of a test measure the same construct.

What sampling errors is delayed administration capable of detecting?

Delayed administration can detect time sampling errors as well as content sampling errors.

What is the limitation of alternate/parallel methods of measuring reliability?

Many tests don't have alternate forms, because they are expensive and difficult to construct: an equivalent form must be equivalent in content, difficulty, and several other factors.

What is internal consistency reliability?

Internal consistency reliability is about the relationships among the items within a test.

What kind of sampling error is internal consistency reliability concerned with?

Internal consistency reliability is concerned with content sampling error.

How does internal consistency reliability work?

It measures each single item in terms of how it relates to the overall test score.

What does high internal consistency reliability mean in terms of a single item of the test?

High internal consistency means that a single item is homogeneous, i.e., has high uniformity in measuring the construct in question.

What is a pro of using internal consistency reliability?

Internal consistency reliability requires only a single administration of the test to obtain the calculation, so it is cost- and time-effective.

What is split-half reliability?

Split-half reliability is a way of measuring the internal consistency reliability of a test. It is done by splitting a test in half and correlating the scores on one half with the scores on the other half.

What coefficient is demonstrated by the split-half reliability method?

When measuring internal consistency with the split-half method, the test is tested for the coefficient of equivalence, as seen in alternate/parallel form measurements.

What is the Spearman-Brown prophecy formula?

The Spearman-Brown prophecy formula is used with the split-half method: it adjusts the correlation between the two halves to estimate the reliability of the full-length test.
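A short Python sketch of the Spearman-Brown prophecy formula (the half-test correlation of 0.70 is an invented example value):

```python
# Spearman-Brown: steps a part-test correlation up to an estimate of
# the reliability of a test length_factor times as long.
def spearman_brown(r_part, length_factor=2):
    """Estimated reliability when the test is length_factor times longer."""
    return (length_factor * r_part) / (1 + (length_factor - 1) * r_part)

half_test_r = 0.70            # invented correlation between the two halves
print(round(spearman_brown(half_test_r), 3))
```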

What is an advantage of using the split-half method?

The split-half method requires only a single test administration to gather the evidence, which saves time and money.

what is a disadvantage of using the split half method

the split half method is sensitive to content sampling error, but it is not sensitive to time sampling error.

What are the Kuder-Richardson formulas?

They are also known as KR-20 and KR-21. They are used to measure internal consistency on tests whose answers are only right or wrong (forced-choice format).

What is coefficient alpha (Cronbach's alpha)?

A measurement of internal consistency for questions that don't have a right or wrong answer (selection-of-choice format). It can be used with Likert scales.
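A hedged Python sketch of coefficient (Cronbach's) alpha using its standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The Likert-style responses below are invented (rows = people, columns = items):

```python
# Cronbach's alpha from invented Likert-style data.
import statistics

responses = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [4, 4, 5, 4],
]

k = len(responses[0])                                   # number of items
items = list(zip(*responses))                           # column-wise scores
item_variances = [statistics.pvariance(item) for item in items]
total_variance = statistics.pvariance([sum(row) for row in responses])

alpha = (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
print(round(alpha, 2))
```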

what is interrater reliability

it is the degree to which 2 raters may agree.

What kind of error does interrater reliability measure?

It does NOT measure content or time sampling error; it only measures the agreement or disagreement between raters.

If a test is heterogeneous because it measures several constructs, what method would be best to detect reliability?

The split-half method would be best for tests that measure multiple constructs; if you split the test based on constructs, then any other measure of internal consistency reliability could also work.

If a test is considered high stakes, what is the lowest reliability coefficient that would be allowed?

A reliability coefficient of 0.90 is the lowest acceptable point of reliability for a high-stakes test.

what is the acceptable reliability coefficient for an interrater agreement?

0.90

What is the standard error of measurement (SEM)?

A measurement of how a single person's test scores would vary if the test were taken ad nauseam.
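As a sketch using the standard textbook formula (not stated on the cards), SEM = SD * sqrt(1 - reliability). The SD of 15 and reliability of 0.91 are invented, IQ-style example values:

```python
# SEM from the test's standard deviation and reliability coefficient.
import math

sd = 15.0
reliability = 0.91
sem = sd * math.sqrt(1 - reliability)
print(sem)   # smaller when reliability is higher
```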

What is a low or unacceptable reliability coefficient?

0.59 or less (an F on a test).

What is a moderate amount for a reliability coefficient?

0.60-0.69 (a D on a test).

What is an acceptable reliability coefficient?

0.70-0.79 (a C on a test).

What is considered a high reliability coefficient?

0.80-0.89 (a B on a test).

What is considered a very high reliability coefficient?

0.90 or more (an A on a test).

What is the error distribution?

Since the standard error of measurement expresses how far a single observed score deviates from the true score, a group of observed scores would cluster near the true score, with errors that have a mean of zero and a standard deviation equal to the SEM.

What would reliability mean for the error distribution?

The more reliable the test, the more precisely the observed score approximates the true score; less reliability means a wider error distribution.

What type of measurement does the SEM help influence?

The standard error of measurement is a big influence on the confidence interval.

What is a confidence interval?

We use this calculation to build an interval around a single observed score to estimate the probability of where the true score falls. It gives the upper and lower limits for the true score.

What is 1 SD in terms of the confidence interval?

Plus or minus 1 standard deviation is a 68% confidence interval: we are 68% sure the true score falls within one standard deviation (a z score of 1.00).

What is 2 SD in terms of the confidence interval?

Plus or minus 2 standard deviations (more precisely, a z score of 1.96) gives a 95% confidence interval.

What is (+/-) 3 SD in regard to a confidence interval?

Roughly plus or minus 3 standard deviations; more precisely, a z score of 2.58 gives a 99% confidence interval.
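The confidence-interval cards above can be sketched in Python as observed score plus or minus z times SEM. The observed score of 100 and SEM of 4.5 are invented example values:

```python
# True-score confidence bands at the three z values from the cards.
observed = 100.0
sem = 4.5

bands = {}
for label, z in [("68%", 1.00), ("95%", 1.96), ("99%", 2.58)]:
    bands[label] = (observed - z * sem, observed + z * sem)
    print(label, bands[label])
```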

How does one increase the reliability/precision of a test? (5 ways)

Increase the number of items on the test; use clear, unambiguous items; use selected-response rather than constructed-response items; balance item difficulty; give clear instructions.