
56 Cards in this Set

  • Front
  • Back

What is psychological measurement?

Measuring an attribute of a person's mind.

How can you obtain meaning in Psychological Measurement?

By comparing a person's score to other individuals.

What are norms?

Comparison data are called norms and should, for most psychological tests, have been obtained as part of the process of 'test construction'.

What are norm referenced tests?

Tests or measures that require norms for the scores to be interpreted.

What type of testing doesn't need norms?

Criterion-referenced testing

What is criterion-referenced testing?

AKA mastery tests. When it is possible to specify exactly what we are measuring, the test is meaningful in itself (like an exam).

When are norms not needed?

1. When there is a 'standard' against which the measure can be calibrated (e.g., metres, weight).

2. When there is an outside criterion against which the measure can be calibrated (e.g., skills such as driving).

What are samples of norms called?

Normative or standardization samples

What do norms have to be for a norm-referenced test?

Appropriate and good.

What are good norms?

Are based on an appropriate standardization (normative) sample for the testing purpose.

What are appropriate norms?

• Relevant to the testing purpose

• Obtained from a large enough sample

• Obtained using the same testing conditions across the sample

What are some cautions for interpreting norms?

-The normative sample is important: it should be representative of the whole population, or of the specific target sample.


-Norms can become outdated quickly.


-Small samples can have high error.

How do you sample norms?

Systematic or stratified sampling to represent the whole population.

How does sample size affect error?

The larger the sample, the smaller the error, because error = SD/√N.
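The error = SD/√N relationship can be checked with a short sketch (not part of the original cards; the SD of 15 is an illustrative value, as for IQ-style scores):

```python
import math

def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: error = SD / sqrt(N)."""
    return sd / math.sqrt(n)

# With an illustrative SD of 15, quadrupling the sample halves the error:
print(standard_error(15, 100))   # 1.5
print(standard_error(15, 400))   # 0.75
```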

What is a good sample size?

1000: good


2000: excellent

How do you decide on an appropriate normative sample?

Consider testing purpose


Cultural implications


Whether a cut-off score is being used


Norms should be relevant


Report any issues with norms

What are the scales of measurement?

Nominal


Ordinal


Interval


Ratio

What is transforming data?

Rescaling data to z-scores, T-scores or standard deviation units.

What is a z-score?

A score expressed in standard deviation units from the mean; it can be transformed into a percentage or a percentile rank.

What are linear transformations?

Transformations which keep the distribution of the scores the same (i.e., adding a constant to each score).
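As a minimal sketch of such transformations (the score list is invented for illustration): z = (X − M)/SD is linear in X, and T = 50 + 10z is a further linear transformation, so neither changes the shape of the distribution.

```python
from statistics import mean, stdev

scores = [40, 50, 55, 60, 70]  # illustrative raw scores

def z_scores(xs):
    m, s = mean(xs), stdev(xs)
    return [(x - m) / s for x in xs]

def t_scores(xs):
    # T = 50 + 10z: a linear transformation of z-scores,
    # so the shape of the distribution is unchanged.
    return [50 + 10 * z for z in z_scores(xs)]

print(t_scores(scores)[2])  # the mean raw score (55) maps to T = 50.0
```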

What are percentiles and stanines?

Different ways of representing a normal distribution.

What is reliability?

The consistency of test scores (not a general property).

What is reliability about?

Consistency

Dependability


Replicability



What are the types of correlation?

Strong, moderate, weak


Positive, negative, curvilinear

What is a correlation coefficient?

An assessment of the degree to which one measure co-varies with another measure of the same thing/person/etc.

What is pearson's r?

The average of the products of paired z-scores.


What is the equation for Pearson's r?

r = Σ(zx zy)/N

• i.e., multiply each X z-score by its corresponding Y z-score, add up the results, then divide by the number of pairs

What type of correlation should be used for ordinal data?

Spearman’s rho (ρ)

What is Classical Test/True Score Theory?

• One way to think of reliability is that a test score gives only an ESTIMATE of the “truth”

• Each time the test gives a "measure" it misses the "truth" or has error

What is the equation for error?

Observed Score = True Score + Error


• X = T + e

What are some factors which affect reliability?

• Test construction – item sampling

• Test administration – person, place and administrator factors


• Test scoring factors

What human factors affect reliability?

• Health
• Fatigue
• Motivation
• Emotional strain
• Test-wiseness
• External conditions of heat, etc.

What are some admin factors which affect reliability?

-Bias in grading/performance


-Conditions of testing (adhering to limits etc)


-Interaction with examiner that facilitates or inhibits performance

What is Test‐Retest Reliability?

Give Test A at Time 1, then give Test A at Time 2 to the same people and correlate the results.

• Correlation gives the measure of reliability

What is a possible issue of Test-Retest?

Carry-over effects


• May not always be appropriate for psychological characteristics as…


• characteristic may change with time

What does test-retest measure?

Gives a measure of temporal stability

What is Alternate or Parallel Forms Reliability?

Give Test A, then give Test B (an alternate form) to the same people and correlate.

• Correlation gives the measure of reliability



When can the administration of Parallel Forms be close in time?

When carry-over from one form to the other is not a problem, or when the effect of time needs to be minimised.

What does Parallel/Alternative Reliability measure?

Gives a measure of content consistency and (when delayed) temporal stability.

• Not always possible or easy to have two forms of a test

What is Split-half reliability?

• Correlate halves of Test A (such as odd/even items) given to one group of people

• Gives a measure of internal consistency

• Many ways to split a test in half

• Result will reflect the degree of consistency of the two halves of the test

What are some issues with Split-half reliability?

The test is half its length, so scores will be smaller than for the full test; the range of scores will be smaller, and so the correlation will be smaller (range restriction).

What are the types of internal-consistency measures for reliability?

Kuder-Richardson Formula 20 (KR20) and Cronbach's Alpha.

What score do KR20 and Cronbach's Alpha need to reach for a test to be considered consistent?

Around .80

What are some issues with Kuder and Cronbach tests?

-Can only test one construct

-Provide only an overall score, giving no information on individual items


-Sensitive to the test's length, with longer tests getting higher scores

What are some facts about the Kuder-Richardson Formula 20 (KR20)?

Devised 1937


Measures inter-item consistency


The standard for tests with dichotomous items


A split-half on all combinations of items

What are some facts about Cronbach's Alpha?

Describes how well a group of items focuses on a single idea or construct


Finds the average correlation of items



What does a high Cronbach's α tell us?

Indicates that the items are focusing on one construct, but doesn't say how well that construct is covered.
What is Interscorer Reliability?

Consistency between scorers: judges, observers or raters.


• Various measures are used, depending on the data set, such as ordinary correlations, other types of correlations, and percentage agreement

What is the Item‐Response Theory?

This examines the responses to the items in terms of their ability to discriminate between aspects of test-takers …

How can you measure the reliability of individual scores?

Standard error of measurement (SEM) gives an idea of the "test precision" for individual scores.

• The smaller the standard error, the more precise the test

What is the equation for the Standard Error of Measurement?

For z-scores SEM = √(1 − α)

• For raw scores SEM = SD × √(1 − α)


• Where α = reliability coefficient
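Both SEM forms above can be sketched in one helper (not part of the original cards; the α = .80 and SD = 15 values are illustrative):

```python
import math

def sem(alpha: float, sd: float = 1.0) -> float:
    """Standard error of measurement: SEM = SD * sqrt(1 - alpha).
    With sd = 1.0 this reduces to the z-score form, SEM = sqrt(1 - alpha)."""
    return sd * math.sqrt(1 - alpha)

print(round(sem(0.80), 3))         # z-score SEM for a reliability of .80 -> 0.447
print(round(sem(0.80, sd=15), 2))  # raw-score SEM for SD = 15 -> 6.71
```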

How can you work out if the score is meaningful?

The standard error of the difference between two scores.

How do you test the reliability of composite scores?

• Overall test scores are sometimes created by adding sub-test scores

• The more highly correlated the individual subtests, the more reliable the composite score (i.e., combining tests is rather like lengthening a test)

What are the sources of error?

• Test‐retest ‐

• Alternate form


• Split‐half


• Internal‐consistency

How reliable should a test be?

• This depends on many things – use of test, expense, and so on.

• High reliability is required when precision is important.

What are the factors affecting reliability?

1. Characteristics of the people sitting the test

2. Characteristics of the test, including its length (the Spearman-Brown formula predicts the effect of lengthening the test)

3. Intended use of scores

4. The method of estimation
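The Spearman-Brown formula mentioned in point 2 can be sketched as follows (the .70 split-half value is an invented example):

```python
def spearman_brown(r: float, k: float) -> float:
    """Spearman-Brown prophecy: predicted reliability when a test is
    lengthened by a factor k (k = 2 doubles the test's length)."""
    return (k * r) / (1 + (k - 1) * r)

# Correcting a half-length split-half correlation of .70 back up to full length (k = 2):
print(round(spearman_brown(0.70, 2), 3))  # 0.824
```

This also shows why the split-half correlation understates full-test reliability: halving the test (k = 0.5) lowers the predicted coefficient.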