46 Cards in this Set

  • Front
  • Back

test bias

different groups can score differently on a test; is it because the test is biased, or are certain groups actually different?

BITCH and Chitling - alternative culture-specific tests

while these tests can detect differences between groups, they have not been shown to predict useful real-world outcomes, such as job and academic performance.

intercept bias (scenario two)

when the slope of the regression line is the same for each group, but the lines intercept the vertical axis at different places (the test is biased but correctable).

scenario one - group A does better than group B on the selection test; this means that

either the test is not biased, or the performance measure and the test score are equivalently biased.

scenario 3 (slope bias)

the slopes of the regression lines for the two groups are different - the test is differentially valid for the two groups. The scatterplot shows more variability for group 2; the test is biased and is not predictive of their performance.

counter arguments to race discrimination and testing

1. pygmalion effect


2. minority group members still face many disadvantages, especially where group membership is superficially obvious (black children raised in white families still tend to perform worse at school)


3. construct of race is primarily social and has no biological meaning, especially in Aus and US.

Pygmalion experimental design: 1966

children were given a non-verbal IQ test, and teachers were given a list of children said to have scored in the top 20% on this test and be about to "bloom" or "spurt" intellectually. In reality, the children on this list were chosen at random; all children were tested again at the end of one year.

pygmalion effect

school achievement is influenced by teacher expectations. In the earliest grades, students identified as bloomers scored significantly higher at the end of the year.

Stereotype threat affects test performance (GRE)

Group 1 was told the test measured IQ; group 2 was told it measured problem-solving skills. African Americans did worse in group 1 but not in group 2 (where the groups did not differ).

Stereotype threat affects test performance

when Asian women were primed about their racial identity before a math exam, they performed better than controls; when primed about their gender before a math exam, they did worse than controls.

Disability Discrimination Act (Australia)

all tests must measure the person against the requirements of the job, not the person in the abstract. Tests should assess the suitability of an applicant for that specific position, based on the selection criteria (information such as the person's private life or personality should not be used when making decisions about an applicant).

Australian Industrial Relations Commission vs. Coms21 1999

Coms21 hired a recruitment consultancy (Drake) to make decisions about a structural re-organisation. Drake provided 5 individual personality profiles/psychometric reports before the people were fired.

fair work commission

handles workplace discrimination issues

law and psychological testing

rulings are inconsistent, and such cases are commonplace.

test utility

the practical usefulness of a test, usually in terms of financial cost/benefit. Even if a test is high in validity and reliability, that doesn't mean it is practical to use (a test does not have high utility if it costs $1,000,000 to administer).

example of a test with poor reliability and validity but good test utility

when the test score itself is less important, e.g., lie detectors, which are useful if participants think they work. But tests low in reliability and validity should not be used to interpret a test score.

utility analysis (statistical decision analysis) helps answer two questions:

different techniques used to decide the usefulness of a test or tests:


1. which test shall we use?


2. is adding a new test to an existing battery of tests worthwhile?

expectancy table, utility analysis

1. false positive: test incorrectly identifies person as good when they are not


2. false negative: test incorrectly identifies a person as being no good when they are good.


3. selection ratio: ratio between available job positions and number of applicants


4. cut off: minimum test score needed in the test to be hired



utility analysis: if the selection test has low criterion validity:

the test is no better than selecting applicants at random

limitations to expectancy tables:

assumes a linear relationship between job performance and test score, and doesn't take into account other characteristics of the applicant, such as physical health or minority status

when to apply Signal detection theory

need to apply SDT whenever you have a task that involves discriminating between two stimuli (such as the recognition memory task)

four possible outcomes

correct hit


false positive


correct miss


false negative

sensitivity and response bias

sensitivity is the ability to discriminate between stimuli; response bias is the criterion for saying "yes". If you look only at correct hits, your results are confounded with response bias.

how to disentangle sensitivity from response bias?

look at false positives as well as correct hits (i.e., compare the hit rate with the false-positive rate)

d' (d prime)

1. The larger d' is, the better the person is at discriminating between stimuli,


2. If it is 0 then the person is guessing


3. If it is negative, the person is recognising words they didn't see, and not recognising words they did see.
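
The standard way to compute d' is to z-transform the hit and false-positive rates and take the difference; a minimal sketch with invented rates, using the stdlib inverse normal CDF:

```python
# A sketch of d' = z(hit rate) - z(false-positive rate); the rates below
# are invented for illustration. NormalDist().inv_cdf is the z transform.
from statistics import NormalDist

def d_prime(hit_rate, fp_rate):
    """Sensitivity index: separation between signal and noise in z units."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fp_rate)

print(round(d_prime(0.84, 0.16), 2))  # positive: good discrimination
print(round(d_prime(0.50, 0.50), 2))  # 0.0: pure guessing
print(round(d_prime(0.16, 0.84), 2))  # negative: responses reversed
```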



industrial inspection, response bias or sensitivity issue?

the decrease in hit rate was due to response bias, not a loss of sensitivity

what strategies have been put in place that detect people cheating on the hazard perception test by clicking all over the screen?

1. allowing the measurement of false positives


2. stricter definitions of what a hazard is


3. using a modified task: hazard change detection


4.

Item response theory

a superior alternative to classical test theory; its score is called theta: a function of the examinee's responses interacting with the characteristics of the items.

characteristic curve

a plot of ability (level on some trait) vs. the probability of getting a particular question right (item-difficulty index)

the three parameter model

1. item difficulty: the level of ability needed to get the item right 50% of the time


2. item discrimination: the steepness of the curve at the point where its slope is steepest


3. level of guessing
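
The three parameters combine in the standard 3PL item characteristic curve, P(theta) = c + (1 - c) / (1 + exp(-a(theta - b))), where b is difficulty, a is discrimination, and c is the guessing floor. A sketch with illustrative parameter values:

```python
# A sketch of the three-parameter logistic (3PL) item characteristic curve.
# b = item difficulty, a = item discrimination (slope), c = guessing level.
# The parameter values below are invented for illustration.
import math

def icc(theta, a, b, c):
    """Probability of a correct response at ability level theta."""
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

# An item with difficulty b=0, discrimination a=1.5, guessing floor c=0.2:
print(round(icc(0.0, a=1.5, b=0.0, c=0.2), 2))   # 0.6 at theta == b
print(round(icc(3.0, a=1.5, b=0.0, c=0.2), 2))   # approaches 1 for high ability
print(round(icc(-3.0, a=1.5, b=0.0, c=0.2), 2))  # floors near c for low ability
```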

advantages of IRT (4 advantages)

1. no need to calculate an overall total score on a test to estimate someone's ability level


2. people don't need to complete the entire test to get an idea of their level of ability


3. much better for computerised adaptive testing (no need to give people items that they will definitely get wrong)


4. can compare people even when they completed different parts of the same test



disadvantages of IRT

complex to understand


software is still in its infancy


requires large samples to get stable estimates


requires more assumptions than classical test theory

why is the accuracy of diagnostic tests often overestimated?

people forget to consider the base rate (likelihood of the disease occurring in the population)
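
Bayes' rule makes the base-rate point concrete: with invented sensitivity and specificity figures, even a "95% accurate" test yields mostly false positives when the disease is rare.

```python
# A sketch of why base rates matter: the probability of actually having the
# disease given a positive result. Sensitivity/specificity values invented.
def positive_predictive_value(base_rate, sensitivity, specificity):
    true_pos = base_rate * sensitivity            # P(disease and positive)
    false_pos = (1 - base_rate) * (1 - specificity)  # P(healthy and positive)
    return true_pos / (true_pos + false_pos)

# A test with 95% sensitivity and specificity, disease base rate of 1%:
ppv = positive_predictive_value(0.01, 0.95, 0.95)
print(round(ppv, 2))  # 0.16: most positive results are false positives
```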

why do many diagnostic tests adopt a liberal response bias?

to maximise the number of correct diagnoses, it is better to minimise false negatives even if that results in an increase in false positives. But just because a test gives a high number of correct diagnoses doesn't mean it's any good at diagnosing (you have to consider false positives too).

ROC curve (receiver operating characteristic, developed in WWII)

a plot of correct positive rate (sensitivity) versus false positive rate (1 - specificity).

the pass mark on the ROC curve that gives the most accurate classification is the:

the point on the curve where the sum of sensitivity and specificity is highest.
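
Picking that pass mark amounts to maximising sensitivity + specificity across candidate cut-offs (the Youden index); a sketch with invented (sensitivity, specificity) pairs:

```python
# A sketch of choosing the ROC cut-off where sensitivity + specificity is
# highest. The (pass mark, sensitivity, specificity) rows are invented.
points = [
    (30, 0.99, 0.40),
    (40, 0.95, 0.60),
    (50, 0.85, 0.85),
    (60, 0.60, 0.95),
]

best = max(points, key=lambda p: p[1] + p[2])
print(best[0])  # 50: sensitivity + specificity = 1.70, the highest sum
```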

Thompson effect

a visual illusion: an image reduced in contrast looks like it is moving slower

cataracts affect 50% of people over the age of 75; how much more likely are they to crash?

2.5 times more

method of constant stimuli

give people many trials, varying each scene on the dimension being tested (e.g., speed); show the scenes in a jumbled order and have people make judgements about scenes presented in pairs.

in psychometric functions what does a steep line tell us?

the steeper the line, the better people can tell the speeds apart.

what does the point where the line crosses the .5 mark tell us?

it tells us whether there is any systematic speed bias (the point of subjective equality).
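
The .5 crossing (point of subjective equality, PSE) can be read off a psychometric function by simple linear interpolation; a sketch with invented judgement data, assuming a 50 km/h standard speed:

```python
# A sketch of finding the PSE: the comparison speed at which the proportion
# of "comparison looks faster" judgements crosses 0.5. Data invented.
speeds = [40, 45, 50, 55, 60]              # comparison speed (km/h)
p_faster = [0.05, 0.20, 0.40, 0.70, 0.95]  # proportion judged "faster"

def pse(xs, ys, target=0.5):
    """Linearly interpolate the x where the curve crosses `target`."""
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if y0 <= target <= y1:
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("target not crossed")

# A PSE above the (assumed) 50 km/h standard would indicate that the
# comparison must move faster than the standard to look equally fast.
print(round(pse(speeds, p_faster), 2))  # 51.67
```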

utility analysis, an example.

the new test had higher validity (.76) than the old non-test procedures (0-.5) and saved the company around 6 million dollars a year in 1979.

what has signal detection theory analysis found in regards to jury decision making?

that the sort of instructions that are given to jurors regarding the definition of 'reasonable doubt' affects their response bias (willingness to convict) rather than their sensitivity (ability to distinguish guilty from innocent defendants).

what is the main point of item characteristic curves?

to see how each individual item interacts with level of ability

computerised adaptive testing

select items to administer based on what the participant previously got right.

what can we do about the Thompson effect in terms of driving safety?

1. remove cataracts faster


2. increase contrast in road environment


3. give drivers perceptual training to improve their ability to tell different speeds apart