197 Cards in this Set
- Front
- Back
Reliability |
Measures the amount of random or nonsystematic error present in a test |
|
Part of the error resides in the test situation |
Who gives the test (examiner effects)
Who takes the test (test taker characteristics)
Item effects |
|
Relationship between examiner and taker |
Familiarity with and rapport between examiner and test taker can increase test scores
People report more health concerns in response to interview questions delivered online, by self-report, or by telephone than in direct questioning
Face-to-face questioning can lead to expectancy effects, particularly when questioning children or vulnerable people |
|
Effects of tester race and gender on test scores (Great truth) |
There is little evidence that tester race or gender influences test scores on individual or group administered ability tests
Belief in such effects is a myth unsupported by the data |
|
Why do tester race and gender not matter |
Test administration guidelines are very specific for most ability tests, and training for administration of these tests is common
When effects are found, there is usually a deviation from the administration procedures given in the manual
-effects are small and insignificant |
|
Rosenthal effects |
Experimenter expectancy effects |
|
Rosenthal effects (experimenter expectancy effects) |
Expectancy effects are real, but the overall effect on test scores is small
It is not clear whether expectancy effects can be replicated in the manner they were found in Rosenthal's early studies
Whether expectancy effects happen in standardized testing is unclear
-when present, they are small |
|
Responses after a correct or incorrect answer |
Inconsistent feedback can reduce the reliability of test results
Results are inconclusive, with some studies showing increased scores after praise and others showing little effect
Reinforcement does alter responses on attitude surveys
-frequently increasing yea-saying responses |
|
What to do after a response |
After a response no response should be given to avoid the possibility of reinforcement or random reinforcement
What to say or not say is outlined in the test manual -there are no exceptions to following standardized testing procedures |
|
Advantages of presenting instructions and items on a computer (computer administered tests) |
Complete standardization
Branching and adaptive testing are possible
Precise timing
Self paced presentation and response
Complete randomization of question presentation |
|
Differences between computer and standard test administration |
Little evidence of score differences between the two administration methods
Reliability is comparable with both |
|
Responses and feelings about computer administered tests |
Rather than feeling alienated or frightened, most test takers find the interaction enjoyable
This may not be true for disadvantaged test takers who have little exposure to technology
People may be more likely to respond honestly to personally sensitive questions on a computer than on self report interviews |
|
Computer adaptive testing (CAT) procedure Also known as branching |
All test takers start with the same set of questions of moderate difficulty
The program presents harder or easier questions depending on how these initial questions were answered
Test takers spend little time on questions that are too hard or too easy
The program presents items based on the test taker's skill level until a predetermined number of items have been answered incorrectly (the test is then over) |
|
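The branching procedure above can be sketched in a few lines of Python. This is a toy illustration with invented parameters (integer difficulty levels 1-10 and a deterministic examinee who answers correctly whenever an item is at or below their skill level), not any real CAT engine:

```python
def run_cat(skill_level, start=5, max_wrong=3, max_items=40):
    """Toy adaptive test: branch harder after a correct answer,
    easier after an incorrect one, stop after max_wrong misses."""
    level = start
    wrong = 0
    for _ in range(max_items):
        correct = level <= skill_level      # deterministic toy examinee
        if correct:
            level = min(level + 1, 10)      # branch to a harder item
        else:
            wrong += 1
            level = max(level - 1, 1)       # branch to an easier item
            if wrong == max_wrong:          # predetermined misses: stop
                break
    return level                            # settles near the true skill

print(run_cat(skill_level=8))  # → 8
```

Note how the loop spends almost no items far from the examinee's level, which is the efficiency advantage claimed for CAT.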
Computer adaptive testing (CAT) advantages |
Provide a better profile of a person in a shorter period of time
Test scores can be given almost immediately |
|
Behavioral assessments include measures of |
Job samples in which ratings are made of ongoing job-related activities
Ratings of children's in-class or playground behavior
Ratings of psychiatric patients before and after treatment
Ratings of ongoing social interactions |
|
What is being measured / what effects are we looking at in behavioral assessments |
In all behavioral assessments, a rater is evaluating someone else's behavior using some form of evaluative scales or dimensions
Both the test and its reliability are based in the rater |
|
What is the main concern/ issue in behavioral assessments |
Reliability of the ratings from raters |
|
Reliability of ratings/ raters |
Reactivity -When raters know their ratings will be evaluated, ratings are more accurate than when not being checked |
|
Overcoming reactivity |
Surprise spot checks are made from time to time
Kappa coefficients are needed to assess interrater reliability (agreement between raters)
Interrater reliability needs to be assessed during the actual observation periods, not training |
|
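The kappa coefficient mentioned above can be computed directly from two raters' codes. A minimal sketch with invented data (two raters coding ten observation intervals as on-task "T" or off-task "F"):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same items."""
    n = len(rater_a)
    # Observed agreement: proportion of items coded identically
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's marginal category rates
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n)
              for c in set(rater_a) | set(rater_b))
    return (p_o - p_e) / (1 - p_e)

a = ["T", "T", "F", "T", "F", "T", "T", "F", "T", "T"]
b = ["T", "F", "F", "T", "F", "T", "T", "T", "T", "T"]
print(round(cohens_kappa(a, b), 3))  # → 0.524
```

A kappa of 1 means perfect agreement; 0 means agreement no better than chance.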
Training of observers (raters) |
Behavioral ratings require that observers be extensively trained on what to observe and how to code what is observed
-need to generate a code book to train raters on what to look for and how to evaluate
-anchor the scale (everyone knows what a 1 or a 5 means) |
|
Drift |
In the field, actual ratings may depart from the ratings done at training and become unique to the rater
These unique individual ratings can take many forms |
|
Forms that unique individual ratings of the rater can take |
Inconsistency effects
-the same behavior is rated differently on different occasions; reliability is gone
Shifting standards
-ratings of the same behavior differ across people
Group standards effects
-when groups of raters are observing, they may adopt informal, implicit rules for observing; unless these are known, they do not match the training and reliability is gone
Contrast and assimilation effects
-behavior is rated differently depending on the preceding behavior |
|
Overcoming drift effects |
Periodic retraining on the original coding format is often necessary
The best way is to videotape the whole observation and then evaluate the tape with the raters |
|
Interview |
One person asks another a series of questions believed to be diagnostic of a quality or attribute the interviewer is trying to assess Best known technique for assessment of individual differences |
|
Landy (1985) |
Companies interview between 5 and 20 people for each hire
Given the costs in time, effort and resources for companies, the utility, validity, and reliability of interviews have been studied |
|
What is the purpose of interviews |
To ask questions that reveal whether or not the interviewee has the skills, ability, interest and motivation to do the job, profit from additional training, be of benefit to the organization, enjoy the new position, and get along well with co-workers
These concerns are about predictive validity |
|
3 forms of interviews |
Structured
Semi-structured
Unstructured |
|
Structured interviews |
Everyone gets the same questions in the same order from a panel -in essence an orally administered questionnaire |
|
Semi structured interview |
A patterned or guided interview covering certain predetermined areas of interest |
|
Unstructured interview |
A nondirective depth interview where the interviewer sets the situation and encourages the interviewee to talk as freely as possible |
|
Interviews gather (observe) different types of information |
Observation of a limited sample of behavior such as speech rate, language usage, poise, reaction to being in an unfamiliar situation, nervousness, style of dress, posture
It is empirically unknown whether any of this collected information from interviews relate to or predict the criterion |
|
Interviews can elicit information that may predict the criterion (it is believed) |
It is often claimed that what a person has done in the past is a good predictor of what they will do, or are likely to do, in the future |
|
The claim that past is a predictor of future behavior is true if |
1) the situation and person remain stable 2) the person's interpretation of their behavior in that or similar situation remains constant |
|
What does the effectiveness of an interview rest on |
The ability to collect useful information from an interview rests solely on the skill of the interviewers
-their skill in asking the right questions and correctly interpreting the respondents' answers
(Interviewers are the test) |
|
Interviews can go wrong in predicting outcomes when |
Respondents or interviewers conceal important information
Important questions were not asked
Information was not correctly interpreted
The interviewer is insensitive to cues in the interviewee's behavior
The interviewer is inattentive to information that was reported |
|
Why is sensitivity to responses important |
Sensitivity to what was stated and how it was stated may lead to further probing, learning new information and qualifying previous answers |
|
Different types of interviews include |
Employment interviews
Mental status exams
Clinical interviews |
|
What is an employment interview |
The most frequently used pre-employment assessment done by organizations
Can vary along a dimension from traditional to structured |
|
Traditional interviews |
Where a number of different areas are discussed with each job applicant
Serves to acquaint the applicant with possible work colleagues and the work environment |
|
Structured interviews |
Standardized, with each applicant receiving the same questions and responses are scored using a scoring format |
|
Reliability and validity of traditional unstructured interviews |
Traditional unstructured interviews are often invalid and are unreliable predictors of future work performance |
|
Hunter and Hunter (1984) |
Found a predictive validity coefficient of 0.14 between the Interviewer judgments during a traditional interview and future job evaluations and performance
Reasons for low validity are not hard to find
Results from these interviews say more about the Interviewer than interviewee |
|
Traditional interviews have low validity because they are |
Plagued with age, gender and attractiveness stereotypes
Halo and horns effects
Do not address the major concerns of the organization |
|
Halo and horns effects |
A form of rater bias which occurs when someone walks in and projects competence or incompetence in one area, and the rater correspondingly rates that person high or low in all areas |
|
Negative search strategy |
Search for any negative information that would disqualify an applicant
Any negative information will be enough to reject applicants unless demand is high enough and few workers are available
When impressions are favorable (halo effect) the rejection rate drops to 25%
(Most interviews operate on negative search) |
|
Webster (1964) |
Found that one unfavorable impression was enough to sink the applicants chances in 90% of cases |
|
What constitutes negative information |
Poor communication skills
Lack of confidence or poise
Low enthusiasm
Nervousness
Failure to maintain eye contact
(These are all signs of introversion and social anxiety) |
|
What constitutes positive information |
Ability to express oneself
Self confidence
Poise
Enthusiasm
Ability to sell ones self
(All signs of extroversion, assertiveness and social skills) |
|
What causes a good first impression (tipping the balance in your favor) |
Looking professional
Well groomed
Project an aura of competence and expertise
Nonverbal cues that imply friendliness and warmth |
|
Structured job interviews |
Address issues of reliability and validity that are raised by traditional interviews
Change from interpersonal relationships to job focused questions |
|
Structured employment interviews uses questions that are |
Job focused
Pre-planned
Presented in the same order for all
Answers are scored according to a predetermined scoring procedure |
|
Interviewers in structured job interviews |
Interviewers are trained on how to ask questions, how to take notes, and how to score answers
These procedures standardize the interview for each candidate |
|
What do structured interviews focus on |
Focuses on the relationship between past behavior and current and future behavior
Linking past, present and future behavior provides better predictions of future behavior
Traditional interviews focus on questions that assess attitudes, opinions and interpersonal dynamics |
|
In structured job interviews all job seekers are asked to |
Provide specific examples of behaviors they have used in the past
Provide examples of what they would do under specific circumstances
-answers are rated on behaviorally anchored rating scales |
|
What interview styles are used most often |
Many companies and organizations use traditional employment interviews
So standard questions result in standard answers |
|
Successful candidates and turnover |
People do not have a great track record when it comes to identifying successful job candidates
Harvard Business Review points out that 80% of employee turnover is due to bad hiring decisions
Hiring is difficult and mistakes are expensive |
|
Society for human resource management reports that |
36% of new hires fail within the first 18 months
40% of senior managers hired from outside the organization fail within 18 months
It costs on average one third of a new hire's yearly salary to replace them |
|
Reasons why so many hires fail reflect |
Improper gathering of information during the interview
Improper analysis of information
Improper interpretation and integration of data
Implicit reliance on stereotypes, halo/horn effects, and heuristic reasoning
(All play key roles in the success or failure of selecting good employees) |
|
Mental status exam |
A 15-20 minute interview in which an intake worker assesses the likelihood of brain damage, drug or alcohol problems, psychosis, and other major mental and physical health issues |
|
What is the purpose of mental status exams |
To assess neurological or emotional problems in terms of variables known to be possible causes |
|
What is noted in mental status exams |
The patient's appearance, behavior, speech, perception, thoughts and attitudes are noted |
|
What is assessed in mental status exams |
Emotional states
-flat affect (little fluctuation in emotion)
-emotional inappropriateness
-emotional lability
Intellectual functioning
-speed and accuracy of thinking
-richness of thought content
-memory capacity
-judgmental accuracy
-proverbs tests
Attention processes
-level of distraction
-perseverance
-presence of hallucinations
-delusions |
|
What are emotional, intellectual, attention and thought problems associated with |
Are markers of schizophrenia, drug dependency, anxiety disorders and brain disease |
|
What do the results from Mental status exams tell us |
Tentative diagnosis
Likelihood of injury to self or others
Outcome of psychotherapy |
|
How to do a mental status exam competently |
A complete understanding of the major mental disorders is required:
-thorough knowledge of the various forms of brain damage
-thorough knowledge of neurological impairment
-thorough knowledge of the DSM-5 coding system
(Not one fixed exam) |
|
Clinical interviews cover same ground as mental status exams but can be broader and also explore |
Job prospects
Career alternatives
Self-knowledge
Information to make more appropriate life choices
Therapy and therapeutic related outcomes |
|
What is the purpose of clinical interviews |
The task is to obtain information important for the person, but what is important depends on the nature and purpose of the interview |
|
Clinical interviews can be broad or narrow depending on |
Nature of the referral question
Nature and quality of the background information
Time demands
Concurrent clinical judgments |
|
Interviewers in mental status exams and clinical interviews |
The tone, interview climate, and answers elicited hinge on the behavior of the interviewers |
|
Research on clinical judgments |
Just how accurate are individuals or panels of individuals in
-synthesizing and integrating information about another person
-arriving at a correct decision
-how accurate are clinical diagnoses, judgments and evaluations made by a single judge or a panel |
|
What types of information and outcomes are used |
Judges or panelists have a number of sources of information on the person being interviewed
-test scores
-test score patterns
-interview results
-family histories
-medical information
-biological information
-school records |
|
Studies on how accurate judges' judgments are ask |
Given the wealth of information available to decision makers, how accurate are evaluators, teachers, clinicians and coaches in predicting outcomes
Evaluators are the test instruments
-the validity and reliability of evaluations made by evaluators becomes an issue |
|
What are actuarial methods |
Involves converting all available background information into numbers and entering all of the information into regression equations
The method allows you to see which pieces of information best predict an outcome |
|
Cleary model |
Another name for actuarial methods
Often contrasted with clinical judgments, in which evaluators arrive at a judgment, evaluation or determination of a particular case |
|
Who comes up with a better prediction -a regression equation or humans integrating the available data |
A regression model often equals or exceeds the diagnoses, predictions, judgments and evaluations made by individuals or panels |
|
Why are evaluators reports not better than regression |
Evaluators' or judges' reports of how they combine and weight data bear little relationship to how
-the information is actually combined
-the weight or importance attached to that information
(Origin and rationalization issues)
-they don't know where the results come from, but come up with reasons |
|
Sarbin (1943) compared high school counselors' predictions of grade 12 students' success in college against the accuracy of a regression equation |
Counselors used college aptitude test scores, grades, interview results, scores on a vocational interest inventory, personality test scores, and post high school interviews
The regression equation used only college aptitude test scores and high school grades as predictors
Counselor predictions came close to the regression equation's predictions for girls, but regression did better for boys
(Regression did better overall) |
|
Meehl (1954) |
Asked if clinical judgments were better predictors than regression equations
Judges were clinical psychologists, counselors, teachers, clinical social workers, and other professionals with varying degrees of education and work experience |
|
What did Meehl find |
Found that, with few exceptions (administrative assistants), actuarial methods yielded as many correct predictions as, and frequently more correct predictions than, the clinical analyses given by professionals |
|
Meehl 1965 -repeated the study but included 50 new clinical outcome studies |
67% of studies favored statistical prediction
33% showed no difference between clinical and actuarial judgments |
|
Goldberg (1965) |
Reported that statistical predictions from MMPI profiles predicted future mental health status better than did clinical predictions |
|
Grove (1996) -meta analysis of 136 studies that directly compared clinical and actuarial judgments |
64 studies (47%) found actuarial methods predicted better than clinical judgments
64 studies (47%) showed no difference
-clinicians had the advantage because more information was available to them than to the actuarial predictions
8 studies (5%) favored clinical judgments |
|
Enhancing clinical predictions Neither... |
-the amount of clinical experience
-the number of years of professional training
enhanced predictive accuracy over regression equations or the use of mechanical prediction rules
In mechanical prediction, weights are given to each predictor on the basis of past outcomes |
|
Why are clinical judgments sub optimal |
Fail to realize that diagnostic cues are probabilistic, not absolute categorical cues for outcomes
Fail to account for cultural, subcultural or gender differences
Use racial or gender stereotypes when making judgments
Use illusory correlations rather than decision rules
-looks like a relationship but is not one
Overuse and rely on inaccurate predictive principles
-ignore base rate information and regression to the mean, or use too many correlated predictors |
|
The most important reason why clinical judgments sub optimal |
Use intuition, emotion or gut feelings when making judgments or interpreting information |
|
What is regression to the mean |
What happens when an extreme event occurs and you assume it will continue to happen, and are then shocked when it doesn't
The extreme score was an outlier, and later scores drift back toward the average |
|
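The idea can be demonstrated with a small simulation (all parameters here are invented): people selected for extreme first scores tend to score closer to the mean the second time, purely because the noise component of their first score does not repeat:

```python
import random

random.seed(1)

# Toy model: each observed score = stable true ability + random noise
def score(ability):
    return ability + random.gauss(0, 10)

abilities = [random.gauss(100, 5) for _ in range(10000)]
first = [(score(a), a) for a in abilities]
top = sorted(first, reverse=True)[:100]   # the top 1% on the first test

mean_first = sum(s for s, _ in top) / len(top)
mean_second = sum(score(a) for _, a in top) / len(top)
print(round(mean_first, 1), round(mean_second, 1))  # second mean is lower
```

The extreme group was selected partly for lucky noise, so on retest their average falls back toward the population mean even though no one's true ability changed.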
Is information intergration or gathering the issue in clinical interviews |
In clinical interviews, data synthesis (integration) is the issue, not information gathering |
|
When are Clinical interviews better |
Clinical interviews are better than actuarial procedures for obtaining information on infrequent behavior |
|
Test bias |
Refers to questions concerning several issues such as
-item fairness
-comparable prediction scores across groups
-the construct validity of the test across groups |
|
Concern with test biases |
The concern with test bias arises when test characteristics detract from the construct measurement
-it does not refer to attributes associated with test takers |
|
How does the standards define bias |
A bias is a systematic error in a test score
A biased assessment is one that systematically under or over estimates the construct it is designed to measure
Bias exists in the test not the people |
|
When is a test not biased |
If an achievement test produces different mean scores for different ethnic groups, but there are actual true score differences between the groups, then the test is not biased |
|
When is a test biased |
If the observed differences in achievement scores are the result of the test underestimating or overestimating the achievement of a group, then the test is culturally biased |
|
The concept of test bias focuses on what question |
Questions about the interpretation of the validity of the test score (test performance) |
|
How can test bias affect test takers |
Test bias refers to systematic error in the estimation of some true value for a group of individuals
Construct over- or under-representation and construct-irrelevant components may affect the performance of different groups of test takers |
|
Most controversial finding in Psychology |
The persistent one standard deviation difference between the intelligence test performance of black and white students
-15 standard score points |
|
Cultural test bias hypothesis (CTBH) |
Any gender, ethnic, racial or group difference in performance is attributed to test bias |
|
Any gender, ethnic, racial, or group performance difference on mental tests can be attributed to |
-Inherent artificial biases produced within the test through flawed psychometric methodology
-group differences are believed to stem from test characteristics
-group differences are unrelated to any actual differences in the psychological trait, skill or ability in question |
|
Cultural loading |
Refers to the degree of cultural specificity present in the test or items
A test can be culturally loaded without being culturally biased
-the greater the cultural specificity, the greater the likelihood of an item being biased when used with individuals from other cultures
-all tests in current use are bound in some way by their cultural specificity |
|
Mean difference hypothesis |
Mean level differences in performance on tasks between two groups are believed to constitute test bias
Asserts that there is no valid scientific reason to believe that performance levels should differ across racial, ethnic or gender groups
Tests that demonstrate differences are deemed biased
-this is not correct, as there is no prior basis for deciding that differences don't exist |
|
Thinking behind the mean difference hypothesis |
Requires that the distribution of test scores in each population be identical before the test can be assumed nonbiased, regardless of its validity
Portraying a test as biased regardless of its purpose or the validity of its interpretations suggests a poor understanding of the construct being assessed and of the issues of bias |
|
Jensen (1980) |
Discusses the mean-differences-as-bias definition in terms of the egalitarian fallacy
Under this fallacy, a difference in any aspect of the distribution of mental test scores indicates that something is wrong with the test |
|
Egalitarian fallacy |
The idea that all human populations are identical on all mental traits or abilities |
|
Berry and Annis (1974) |
The Temne live in a vertical world and the Inuit live in a horizontal world
They differ in susceptibility to the vertical-horizontal illusion |
|
Features of a test that indicate fairness |
Interpreting test scores
Minimizing error in test presentation and scoring
Enhancing test validity
Accommodations for those with disabilities
Writing appropriate items
Evaluating potential job candidates through standard criteria for all |
|
Irrelevant factors in fair tests |
Factors irrelevant to the construct Are eliminated during assessment to help ensure that the construct is measured in a way that is impacted only by knowledge, skills or abilities relevant to the construct itself |
|
Why are other definitions of test bias in CTBH or cross group test validity unacceptable as a scientific perspective |
The imprecise nature of other uses of the term makes empirical investigation and rational inquiry exceedingly difficult
Other uses of the term invoke specific moral value systems that are the subject of intense emotional debate and have no mechanism for rational resolution
-emotional appeals, legal adversarial approaches and political remedies for scientific issues are not scientifically acceptable and not useful |
|
Once mean group difference are identified there are 4 common explanations for these differences |
The differences primarily have a genetic basis
The differences have an environmental basis
The differences are due to the interactive effect of genes and environment
Tests are defective and systematically underestimate the knowledge and skills of minorities, which leads to differential validity (CTBH) |
|
Unfairness as a measurement bias |
When test items are unrelated to the intended construct, it can result in test score differences across subgroups |
|
What is Differential item functioning |
Differences in the functioning of test items between defined groups
Indicates that individuals from different groups who have the same standing on the construct being measured do not have the same expected test score
Happens when test takers of equal ability do not have the same probability of answering a test item correctly
Leads to predictive bias |
|
Differential item functioning needs what |
An indication of differential item functioning must be accompanied by a suitable explanation for the differential item functioning to justify calling an item biased |
|
Predictive bias |
Differences exist in the pattern of associations between test scores and other variables for different groups, causing concerns about bias in the inferences drawn from the use of test scores |
|
Fairness |
Fairness is concerned with the validity of interpreting individual scores for their intended uses -unfairness means that the test score interpretations are invalid for the intended uses |
|
To have fairness |
Individuals need to be treated as similarly as possible (an important aspect of fairness)
It is important to take into account the individual characteristics of the test taker and to understand how these characteristics may interact with contextual factors of the testing situation and the interpretation of test scores |
|
What are the major issues with giving achievement tests to minority groups |
Inappropriate content
Inappropriate standardization samples
Examiner and language bias
Measurement of different constructs
Differential predictive validity
Qualitatively distinct aptitude and personality
Inequitable social consequences |
|
Inappropriate content |
Black and other minority children have not been exposed to the material involved in the test questions
Tests are geared primarily toward white middle class homes, vocabulary, knowledge and values
Inappropriate content makes the test unsuitable for use with minority children |
|
Inappropriate standardization samples |
Ethnic minorities are underrepresented in the standardization samples used in the collection of normative reference data
Inappropriate standardization samples make the test unsuitable for use with minority children |
|
Examiner and language bias |
Because most psychologists are white and speak English, they may intimidate black and ethnic minority children
Examiner race and language use bias test results
Biases happen because examiners are unable to accurately communicate with minority children and are insensitive to ethnic pronunciations of words on the test |
|
Measurement of different constructs |
Tests measure different constructs when used with children from cultures other than the middle class culture on which the tests are largely based
Not a valid measure of intelligence in minority groups |
|
Differential predictive validity |
Tests measure constructs more accurately and make more valid predictions for individuals from the groups that tests are mainly based on than other groups |
|
Qualitatively distinct aptitude and personality |
Majority and minority groups have qualitatively different aptitude and personality traits
So test developers should begin with different definitions for different groups
Helms argued that European and African values and beliefs are different, which affects responses |
|
Inequitable social consequences |
Due to educational and psychological test bias, minority group members are already disadvantaged in educational and vocational markets because of past discrimination, assumptions of inability to learn, and disproportionate assignment to dead-end educational tracks
These represent the inequitable social consequences of biased testing
|
|
What is a biased item |
An item is biased when it is demonstrated to be significantly more difficult for one group than another
-test items must be unidimensional (all items must measure the same factor)
-items identified as biased must be differentially more difficult for one group than another
-under this definition groups may have different mean test scores, but group differences must be reflected in an equivalent fashion across all items |
|
How to determine biased test items |
A number of statistical techniques, many based on item response theory, are used to detect differential item functioning |
|
Research results on biased items |
Very little bias is found in tests at the level of individual items
Some biased items are nearly always found, but they account for no more than 2-5% of the variance in performance
For every item favoring one group there is an item favoring the other group |
|
Similarity among biased items |
Very little similarity among biased items has been found
Poorly written, sloppy and ambiguous items tend to be identified as biased with greater frequency than items encountered in well constructed standardized instruments |
|
How to eliminate biased items |
Expert panels of minority psychologists are asked to indicate which items would be too difficult for minority or disadvantaged individuals
Items that are seen as culturally biased by the panel are removed |
|
Use of expert panels show two consistent findings |
Expert judges were no better than chance in choosing the test items on which minority children scored lower than whites
Judges are not able to detect items which are more difficult for minority children, and the ethnic background of the judge makes no difference in the accuracy of item selection |
|
Methods used for the internal analysis of test items (item biases in construct measurement) |
Factor analysis across groups
Correlation of raw item scores with age
Comparison of item total correlations across groups
Comparisons of parallel forms and test retest correlations |
|
Comparative item selection (Reynolds 1998) |
Multiple retesting of item sets across groups
Unbiased tests will show a 90% overlap rate between tests
Biased tests and tests with low reliability will show low overlap
Need large samples for stable results |
|
Bias in construct measurement |
Construct measurement of a large number of often-used assessment instruments has been investigated across ethnicity and gender with a divergent set of methodologies. No consistent evidence of bias in construct measurement has been found in the many prominent standardized tests investigated |
|
Psychological tests |
Function and are measured in the same manner across people of diverse ethnicities and genders. Tests appear to be unbiased for the groups investigated, and mean score differences do not appear to be an artifact of test item bias |
|
What is the recommended method for detecting item bias |
Item response theory followed by a logical analysis of item content. These methods are used to determine the degree of differential item functioning (whether items function differently across groups, via the model parameters associated with the items) |
|
Item response theory models have various item parameters that describe item behavior (three-parameter model) |
1) item difficulty (most important) -the point on the latent trait continuum at which the examinee has a 50% chance of correctly answering the item 2) discrimination power of the item (slope) 3) guessing parameter |
|
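Note, beyond the card text: with a nonzero guessing parameter c, the probability of a correct answer at theta = b is (1 + c)/2 rather than exactly 50%; the 50% figure holds when c = 0. A minimal sketch of the 3PL item characteristic curve (parameter values are illustrative, not from the source):

```python
import math

def p_correct(theta, a, b, c):
    """3PL model: probability of a correct response given ability theta,
    discrimination a, difficulty b, and guessing parameter c."""
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

# With no guessing (c = 0), an examinee at theta == b answers correctly 50% of the time.
print(p_correct(0.0, a=1.0, b=0.0, c=0.0))   # 0.5
# With c = 0.2, the floor rises: probability at theta == b is (1 + 0.2) / 2 = 0.6.
print(p_correct(0.0, a=1.2, b=0.0, c=0.2))   # 0.6
```

DIF detection compares these curves across groups: if the curves differ for examinees of equal ability, the item functions differently.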
Rasch model |
Single parameter model that models item difficulty |
|
Using Item Response theory to determine Differential item functioning |
Compares the item characteristic curves of two groups to create a differential item functioning index. Various statistical methods have been developed for measuring the gaps between item characteristic curves across groups of examinees |
|
Partial correlation analysis |
Simple but less precise way to determine item bias
Tests for differences between groups in the degree to which there is meaningful variation in observed item scores not attributable to the total test score
Meaningfulness is based on effect size, obtained from the coefficient of determination
Need to be attentive to experimentwise error rates |
|
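The partial-correlation approach above can be sketched in a few lines (pure-Python and illustrative; the variable roles are assumptions, not from the source): correlate group membership with item scores while partialling out the total test score, then square the result for the effect size.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def partial_r(x, y, z):
    """First-order partial correlation of x and y with z partialled out."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Illustrative use: x = group membership (0/1), y = item score, z = total score.
# partial_r(x, y, z) ** 2 is the effect size (coefficient of determination).
```

Running many such tests, one per item, is what makes attention to experimentwise error rates necessary.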
Biases when using tests to predict future outcomes are constrained by two problems |
Biases in the measurement of the criterion/outcome. The correlation between predictor and criterion is limited by the poor measurement characteristics of the criterion -the square root of the criterion's reliability sets the maximum possible validity -from the standpoint of the application of aptitude, achievement, and intelligence tests in forecasting probabilities of future performance, prediction is the most crucial use of test scores to examine |
|
Predictive accuracy can be determined in a few different ways |
An item analysis can determine if items function the same in all groups (no criterion)
Assess unstandardized regression weights (slopes) to see if weights are comparable across groups
Differences in group averages on the test and averages on the criterion
Examine cut-off scores separated by group and assess differences |
|
Job performance tests |
Tests that are similar to actual job performance show little bias across groups. Biases arise when inferences are made from test results to behavior unrelated to the test |
|
Regression equations |
Regression equations are used to assess biases in prediction. Predictions take the form y = aX + b, where a is the regression coefficient (slope) and b is the constant (intercept). An unbiased test requires errors of prediction to be independent of group membership, and the X-Y regression line must be the same for each group -error around the regression line should be similar |
|
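The check described above amounts to fitting the regression line separately for each group and comparing slopes and intercepts. A minimal sketch (the data are invented to illustrate equal slopes with an intercept difference):

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns (slope a, intercept b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Hypothetical groups: identical slopes but different intercepts -> intercept bias.
group_a = ([1, 2, 3, 4], [3, 5, 7, 9])    # follows y = 2x + 1
group_b = ([1, 2, 3, 4], [5, 7, 9, 11])   # follows y = 2x + 3
a1, b1 = fit_line(*group_a)
a2, b2 = fit_line(*group_b)
print((a1, b1), (a2, b2))  # (2.0, 1.0) (2.0, 3.0)
```

Equal slopes with unequal intercepts means a single common equation would consistently over- or under-predict one group's criterion scores.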
Homogeneity of regression across groups -simultaneous regression -fairness in prediction |
When the regression equations for two groups are equivalent (the prediction is the same for those groups). When homogeneity of regression across groups does not hold, separate regression equations should be used for each group |
|
Cleary model |
The use of a single equation to make predictions from test scores Refers to the use of regression weights (slope) to predict job success or outcomes |
|
Clinical use of regression equations |
In clinical practice, regression equations are rarely generated for the prediction of future performance
Rather, some arbitrary or statistically derived cutoff score is set, with failure inferred for scores below it -usually based on clinical lore or past practices
2 SD below test mean is used to infer a high probability of failure in school performance |
|
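As a worked example of the 2-SD rule above (the mean of 100 and SD of 15 are the conventional IQ scaling, not stated on the card):

```python
mean, sd = 100, 15          # conventional IQ-style scaling (assumed values)
cutoff = mean - 2 * sd      # score below which failure is inferred
print(cutoff)               # 70
```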
Cut off criterion |
Using cutoff scores, clinicians are implicitly establishing prediction (regression) equations about mental aptitude that are assumed to be equivalent across race, sex, etc. |
|
Gordon's 4 types of test bias |
Case 1 -groups A and B differ on the test but have identical slopes relating the test and criterion -example of homogeneity of regression across groups
Case 2 -there are different test scores, different slopes and different intercepts meaning different test validities from the two groups
Case 3 -similar slopes but minority receives higher criterion scores than majority
Case 4 -similar slopes but majority receives higher criterion scores than minority |
|
Interpreting Case 1 and 2 |
The issue is differential test validity for the two groups
In both cases the regression slope and intercept are examined
-the slope of the line (regression weight) is the correlation between test score and criterion (for standardized scores)
-the correlation is the predictive validity coefficient
If the test has significantly different validity coefficients for one group compared to another, then slope bias and differential test validity are present |
|
Common issue in case 2 Differential validity issues |
There are often many more test takers in the majority than minority group
This means that the regression weight will be significant for the majority group but not the minority, and because of the sample size differences, between-group comparisons of the coefficients lack power
In such cases the analysis will suggest that the test is more suitable for the majority than the minority group (discriminates against the minority) |
|
Hunter and Schmidt 1997 |
In a review of 866 black-white prediction comparisons
There was no evidence for the hypotheses of differential or single-group validity with regard to the prediction of job performance across race for whites and blacks |
|
Great secret truths of differences |
Large scale industrial samples, tests on armed services personnel, school division wide testing all typically fail to find significant differences in validity coefficients
Validity coefficients for nationally administered tests typically fail to show differences between racial groups
In terms of predictive validity, ability tests are equally valid for minority and majority groups in predicting occupational and educational outcomes
When sample size and composition are comparable and the test and criterion are properly constructed, no slope bias is reported |
|
Why is there slope bias in well constructed tests |
Performance on the test and on the criterion are influenced by a number of factors (language skills, age, motivation)
These factors can influence scores on the predictor or the criterion
Making the test culturally appropriate does not address the underlying issue -low scores need to be addressed directly |
|
Intercept bias |
Even though tests show comparable predictive validity across groups, intercept bias may still be present
Intercept bias is present if the test consistently over or under estimates performance on the criterion by one group compared to another |
|
Case 3 |
Although the validity coefficient is the same for both groups, any score on the test (X) will lead to different criterion scores for the groups
Test scores have different predictive meaning for the two groups
Selecting people on the basis of majority scores underpredicts minority group criterion performance
Case 3 is the situation concerning those who view tests as biased |
|
Case 4 |
Use of majority test scores in predictive regression overpredicts minority group performance, discriminating in favor of them -evidence on intercept bias indicates that on well-constructed tests, there is no significant intercept bias -or a slight tendency in the opposite direction |
|
When does case 4 happen |
Occurs when other variables are correlated with the test and the criterion -reading ability, language proficiency, but primarily test familiarity and test preparation |
|
Current work on test bias |
Over the last several decades emphasis has shifted from evaluation of test bias to the design of selection strategies for fair test usage with minority groups
Examples: the Cleary model and compensatory models. These selection models cause public uproar, but what they do is assign different weights to other factors to be considered when accepting minority group candidates |
|
Cleary model |
The use of regression weights to predict job success of other outcomes
Selection is based strictly on the test and criterion scores without regard for other goals in the selection process |
|
Compensatory models |
Select larger proportion of minority group members by lowering acceptable test scores or select applicants based on other criteria (proportion of minority applicants) |
|
Selection models like the Cleary model are called |
Expected utility models -in such models, clear statements of values and the intended consequences of selection decisions are made explicit. Issues such as providing equal opportunity, increasing demographic mix, and preferential selection of people from historically disadvantaged groups are all part of the selection process |
|
No agreed-upon definition of intelligence But there are three broad conceptual ideas about intelligence which can be described |
Psychometric tradition -examines the structure of test items, dimensions underlying responses, and correlates of test responses Information processing approach -examines the underlying encoding, processing, and solving of various problems Cognitive approach -focuses on how people solve real world problems and adapt to real world demands |
|
Development of the Binet scale |
In the late 1800s current theory suggested a relationship between head size and school success (craniometry)
Binet failed to find any relationship
In 1899 Binet dropped craniometry as a measure of intelligence and began a search for other measures. Binet returned to measuring intelligence in 1904, at the same time Francis Galton published his work on intelligence |
|
What did Galton believe |
That intelligence could be assessed through physical measures -grip strength, reaction time, keenness of vision, auditory acuity, and mental imagery. Binet wanted something more |
|
Binet wanted a measure that reflected |
What people do (not who they are) A number or numbers that reflected whether questions were answered correctly Answers to questions that indicated an underlying mental process |
|
What did Binet ask children to do |
Asked children to respond to tasks that reflect common experiences -counting coins, giving and receiving instructions, making simple inferences, answering questions, and solving problems. Tasks were presented by a trained tester. Items were graded in terms of difficulty and covered a wide range of problems |
|
These tasks used by Binet were thought to tap three processes that reflected intelligence |
Comprehension Invention Correction |
|
General mental ability |
General mental ability meant that the reason children who were correct on one question were also right on others is that intelligence is a general mental ability made up of several different processes (positive manifold). He thought that performance on a wide range of varied tasks could reflect a measure of general mental ability |
|
Original Binet scale |
The original 1905 scale had 25 age-graded items The 1908 scale with Theodore Simon had 32 and an age criterion for each item Start with the simplest items and progress until the child continues to make mistakes |
|
What was considered normal intelligence |
The criterion Binet and Simon adopted was the age at which children could correctly answer a question 66%-75% of the time
The age associated with the last correct answer became known as the child's mental age -children whose intelligence level (mental age minus chronological age) was less than 0 were identified for special education |
|
Mental age |
Binet and Simon defined a child's intelligence level as mental age - chronological age |
|
How did Binet define intelligence |
Adopted a functional perspective where intelligence must be reflected in behaviors that are adaptive and goal directed
-Take and maintain a definite course of action (comprehension)
-Capacity to change plans or method to attain a desired end (invention)
-Ability to see errors and correct them (correction) Intelligent people use information more efficiently to meet their desired goals than do less Intelligent people |
|
What did Stern 1912 argue |
That mental age should be divided by chronological age to give an intelligence quotient -division is more appropriate than subtraction because the relative, not absolute, difference between mental age and chronological age is important. Interest now becomes the rate of development relative to age |
|
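Stern's point about relative differences can be shown with a one-line computation (the conventional factor of 100 and the ages below are illustrative, not from the card):

```python
def ratio_iq(mental_age, chronological_age):
    """Stern's intelligence quotient: mental age / chronological age,
    conventionally scaled by 100."""
    return 100 * mental_age / chronological_age

# The same 2-year lead in mental age means more at age 4 than at age 12:
print(ratio_iq(6, 4))            # 150.0
print(round(ratio_iq(14, 12), 1))  # 116.7
```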
Terman 1916 |
Brought the Binet-Simon test to the US, where it became the Stanford-Binet, ushering in the modern age of intelligence testing |
|
Spearman (physiological efficiency) |
Binet saw intelligence reflected in people's ability to solve problems that arise when attaining their desired goals. The problem with this idea was that some people had more functionally adaptive problem-solving ability than others. Spearman's two-factor theory of intelligence saw intelligence as a central general ability (g) plus levels of specific abilities (s) -sought to isolate intellectual power from knowledge content |
|
How does Spearman see intelligence |
Intelligence was less about goal-directed activities and more about abstract reasoning -our ability to perceive and apply relationships. Termed this abstract reasoning ability g (general mental ability). G is one of the best predictors of occupational and educational success |
|
How is G measured |
Analogy problems that require people to perceive relationships between problem components and apply those relationships to the problem. Analogy problems can be expressed verbally or in symbols, pictures, and geometric forms |
|
Functional unity of intelligence |
A physiological structure from which a mental energy or process flows. Spearman and others thought that differences in intellectual functioning reflected a functional unity. One measure of this inner unity is neural speed or processing speed -the idea is that the faster a person can process information, the higher their level of intelligence |
|
Speed of response as a measure of intelligence |
With a speed measure it is critical to separate speed from knowledge
It is processing power (g) that is being measured, not expertise or past knowledge. To separate processing power from expertise, you have to use tasks that are completely novel or tasks that are very familiar or easy |
|
Response time from movement times |
It is also essential in speed measures to separate response time (decision time) from movement time, so that the measure reflects decision processes rather than motor speed |
|
Galton and reaction time |
Reaction time as a measure of intelligence was suggested by Galton before Binet produced the first intelligence test in 1905. Every methodological and statistical problem conceivable plagued Galton's attempts at measuring response time as a measure of human intelligence. Binet's new intelligence test had obvious face validity for intelligence, and Galton's idea of chronometry was easily overtaken and soon forgotten |
|
Advantages of chronometry |
One advantage lies in the scale of measurement
IQ test results produce an ordinal scale
Level of intelligence is always relative to the scores of others in the norm group
Speed-based measures produce an absolute ratio scale -a response time of 30 ms is twice as fast as one of 60 ms no matter who takes the test |
|
Advantages of chronometry |
There is a theoretical advantage with chronometry. Binet's approach to intelligence was entirely functional -why a person scored the way they did was not Binet's concern. Time-based measures permit theoretical development -why are some people faster than others? Fast response times are thought to reflect a speedy rate of oscillation in neural responsiveness and are therefore indicative of intelligence |
|
Inspection time as a measure of physiological efficiency |
Spearman thought that reaction time might reflect encoding sensitivity and retrieval speed
While informative, reaction time data are difficult to interpret because it is hard to know what processes are being assessed
Inspection time is an alternative measure of physiological efficiency -the dependent variable is the correct answer, not response speed |
|
Inspection time accuracy |
Accuracy is assessed against exposure time
The longer the exposure time the greater the accuracy but the lower the processing speed
For each person, the exposure duration yielding 75% accuracy is recorded |
|
What actually is inspection time |
Since it is concerned with accuracy and exposure duration, it is referred to as the speed of taking in information and encoding sensitivity. Correlations between inspection time and tests loading on g are about -.50 to -.55. As inspection time increases, intelligence test scores decrease |
|
Correlations between response time and test scores |
0.2- 0.3 |
|
Does inspection time cause intelligence or vice versa |
Several studies report that intelligence causes fast inspection time. Developmental studies suggest that inspection times at an early age are more closely related to IQ at a later age than vice versa. Inspection time is a measure of something occurring inside the head; it is not a construct, a process, or an explanation |
|
What is inspection time tapping into |
While the evidence is incomplete, inspection time taps into the efficiency with which information is processed after it has been received |
|
Fast inspection time and high IQ scores |
Fast inspection time and high IQ scores occur in people whose evoked potentials are maximal at 140-200 ms after stimulus display |
|
Cigarette smoking |
Cigarette smoking is associated with faster inspection time and higher scores on the Raven's matrices. This happens because smoking stimulates the brain's cholinergic processes, highlighting one mechanism that underlies intelligence |
|
Cattell's two-factor theory of intelligence |
Proposed that intelligence is composed of two components
Fluid and crystallized abilities
These abilities are called Gf and Gc |
|
Gc crystallized abilities |
Reflect past lessons or well-learned responses that have become crystallized (reading and driving). Reflected in shared educational experiences and seen in tests of computational speed, word recognition, pattern matching, basic information, and vocabulary |
|
Gf fluid abilities |
Label applied to an adaptive process of encoding and correctly processing unfamiliar configurations, rearranging those configurations to meet some requirement (Raven's matrices or block design test)
Is spoken of in the singular but there are several components to this factor |
|
Relationship between fluid and crystallized abilities |
Crystallized develops out of fluid, because when tasks are new there is no crystallized knowledge to use
When people of equal fluid ability differ in crystallized ability, the reason probably reflects educational experience, motivation, or environmental factors. Although fluid is largely perceptual and nonverbal in nature, words (crystallized) are needed to formulate and check hypotheses and answers. Crystallized abilities such as language and numerical skills are needed to express creative and novel ideas that come from fluid abilities |
|
What do fluid and crystallized reflect |
A common view holds that fluid is without content, reflecting pure intellectual power, while crystallized reflects book learning, experience, or knowledge and really isn't intelligence -this view is misleading and incorrect |
|
How much fluid and crystallized abilities do we need |
How much fluid ability a task requires reflects the test taker's culture and learning history. With enough practice or experience, any new test or problem could become crystallized |
|
Fluid and crystallized abilities in intelligence tests |
Many intelligence tests reflect both fluid and crystallized abilities. No matter how bright a person is, a poor reader will not do well on ability tests