116 Cards in this Set
- Front
- Back
What is a test |
A measurement device or technique that 1) represents a sample from a domain 2) puts numbers (quantifies) to that behavior 3) aids in understanding and prediction of that behavior 4) contains sampling and measurement errors |
|
The history of tests |
Tests were used in China in 2200 BC to select applicants for government jobs. During the Han dynasty, applicants had to show proficiency in music, archery, writing, geography, etc. Knowledge of classical literature was later added, on the assumption that a knowledgeable person had absorbed the wisdom of the past. In the 4th century the program was expanded to require examinees to spend days in isolated booths composing essays and poems |
|
How did the brass instrument era affect testing |
Caused a move toward measuring human abilities using objective procedures that could be easily replicated. A variety of instruments (made of brass) were used to measure simple sensory and motor processes, on the assumption that these were measures of general intelligence |
|
Who was Sir Francis Galton |
The father of mental tests and measurement. Established an anthropometric laboratory in 1884 -where he collected physical, sensory, and motor measurements of over 17,000 individuals -was the first large-scale systematic collection of data on individual differences. Introduced the term psychometry |
|
What is psychometry |
The art of imposing measurement and number upon operations of the mind |
|
What were James McKeen Cattell's beliefs |
Shared Galton's belief that simple sensory and motor tests could be used to measure intellectual abilities. Later research demonstrated that this belief was too simplistic |
|
What did James McKeen Cattell do |
Helped open psychological laboratories and spread the growing testing movement in the USA. First to use the term mental test. Was the mentor of several well known psychologists in the late 1800s and mid 1900s |
|
Who is Clark Wissler |
One of Cattell's students, whose work largely discredited the work of Cattell |
|
What did Clark Wissler do |
Found that common sensory and motor measures used to assess intelligence had no correlation with academic achievement and only weak correlations with each other. This led to a new approach to intellectual assessment that emphasized higher order mental processes. Flaws in his research prevented him from detecting moderate correlations between sensory and motor tests and intelligence -it would be 60 years before this relationship was known |
|
Who is Carl Gauss |
One of the greatest mathematicians of all time for his contributions to -number theory -geometry -probability theory |
|
What did Carl Gauss do |
When tracking stars, different observers recorded them at different points
Plotted the observations and determined that they take the shape of a curve
The best estimate of the true location is the mean, and each observation contains some degree of error
Known as the normal curve or distribution -sometimes called the Gaussian curve
Also did work on magnetism; the gauss, a unit of magnetic flux density, is named after him |
|
Who is Abraham de Moivre |
His studies of probability laid the foundation for Gauss's discovery of the normal distribution. Is credited with the development of standard deviation |
|
What is a psychological test |
A set of items designed to measure the magnitude of one or more traits, characteristics, or attributes of human behavior |
|
The attributes used in psychological tests |
Attributes are not directly observable -they must be inferred from test responses. Inferred attributes are psychological constructs |
|
What are latent variables |
The attributes inferred from test responses |
|
Items in psychological tests contain |
Overt behaviors or covert behaviors reflected in responses to test items. A sample of the relevant behavior |
|
Test scoring |
Item responses are scored and evaluated according to some criteria on an objective scale. The meaning of any test score is relative to an external criterion. Sampling error and measurement error are always present |
|
What are the 10 assumptions of psychological assessment |
Know |
|
Assumption 1 |
Psychological constructs exist -a construct is the trait that a test is designed to measure |
|
Assumption 2 |
Psychological constructs can be measured -if something exists, it exists in some amount, which can be measured -experts believe psychological and educational constructs can be measured |
|
Assumption 3 |
Measurement of constructs is not perfect -imperfection is framed in terms of measurement error and the effects of that error on the reliability of test scores and on test validity |
|
Assumption 4 |
There are different ways to measure any construct -a variety of assessment procedures all focusing on the same construct should be used to assess that construct |
|
Assumption 5 |
Assessment procedures have strengths and limitations -need to understand the specific strengths and weaknesses of procedures used |
|
Assumption 6 |
Multiple sources of information should be part of the assessment process
-decisions should not be based on a single test because of the specific strengths and weaknesses of each test |
|
Assumption 7 |
Performance on tests can be generalized to non test behaviors -we need to be able to generalize from test results to non test situations -test scores tell us the person's standing on the construct and their standing on other constructs of interest |
|
Assumption 8 |
Assessment provides information that helps psychologists make better professional decisions |
|
Assumption 9 |
Assessment can be conducted in a fair manner -when given to the population the test was designed for, according to proper procedures, and interpreted according to guidelines, tests are fair and minimize bias |
|
Assumption 10 |
Testing and assessment can benefit individuals and society as a whole -when a psychologist is able to diagnose accurately a client's problem, they are more likely to develop an effective treatment plan -when psychologists help clients better understand personal preferences and career interests they are more likely to pursue activities that lead to a happier life and career success |
|
Why should we use tests (first myth busting revelation) |
When properly constructed, tests are a more reliable and valid means of assessing people than other methods |
|
Tests for maximal performance |
Tests for human abilities
-aptitude (future) -achievement (past) -intelligence (general potential) |
|
Tests of typical performance |
Personality; interest and occupational inventories; tests of psychopathology |
|
Why use tests that differentiate between maximal and typical performance |
Main use of these tests is to differentiate between people -assess individual differences
-total test scores are assumed to reflect real psychological differences between people in the amount of the latent variable |
|
What are the 4 types of tests that categorize on the basis of function (4 types of test categories) |
1) Classification 2) Self understanding 3) Program evaluation 4) Scientific inquiry |
|
Classification tests |
Tests that assign people to one category or another
-screening, selection, licensing, placement, diagnostic |
|
Self understanding test |
Test scores help people make decisions about their lives by providing information that may correct any false ideas they may have about themselves |
|
Program evaluation tests |
Measure changes in behavior as a result of programs introduced into schools, industry, social services
-question is whether the program is meeting the stated goals and where improvements could be made |
|
Scientific Inquiry test |
Test scores used as either dependent or independent variables |
|
Test scores are useless when |
Test scores tell us current level of construct but we want to know future level
Test scores are useless unless they allow us to address certain concerns, prevent something, predict something, or formulate a plan of action -these concerns speak to validity and utility |
|
How to distinguish between testing and evaluation |
Testing is the giving and scoring of test items with the purpose of measuring the magnitude of some latent psychological construct
Evaluation refers to how test scores will be used and to what purpose they will be directed
The distinction between testing and evaluation is one of tools and processes |
|
Evaluation Assessments |
Includes more than just test scores
Each assessment component must have reliability, validity and utility
The aim of all assessment procedures is to answer questions about psychological or social functioning |
|
What are the two types of assessment procedures |
Collaborative psychological assessment -assessor and assessee work collaboratively
Dynamic assessment -used in educational settings -assessor and assessee are involved in an interactive, changing, ongoing assessment |
|
Three phases in dynamic assessments |
1) Evaluation
2) Intervention -happens while the client is doing a task, after the task, or at some time in the future
3) Evaluation of intervention effectiveness
Nature of the intervention depends on the nature of what is being assessed |
|
What are the qualities of a good test |
1) nature of test items 2) test standardization 3) objectivity 4) norms 5) reliability 6) validity |
|
Nature of test items |
What is of concern is the total test score, which tells us about the underlying attribute
It is not necessary that the test items reflect the behavior the test predicts -face validity
It is necessary that there is an empirical link between the total test score and the behavior or outcome that the test is supposed to measure |
|
Test standardization |
Is the uniformity and consistency in administering the test and scoring test responses
Scoring and administration procedures are given in the test manual |
|
Objectivity |
Refers to how the test is scored
There is a numerical scale on which responses are given or a scoring template which can quantify responses
Does not refer to the nature of the items |
|
Norms |
Are the distribution of test scores in a relevant population
Norms are necessary because test scores have no absolute meaning
Interpretation depends on relative comparison with similar others |
|
Reliability |
Consistency and accuracy of test scores across time or with sets of items
Inconsistency or inaccuracy (lack of reliability) means the presence of measurement or sampling errors |
|
Validity |
The degree to which inferences or meaning from test scores can be made
Can only be known empirically by establishing validity coefficients
Not a single number but a network of empirical associations |
|
What is a valid test |
Measures what the test says it measures and not something else
Is useful to the user (test utility)
Adds to what is known about the individual (incremental validity) |
|
How good are psychological tests |
Meta analysis of the validity of psychological tests concluded that the overall validity of established tests is compelling
Validity of established tests was comparable to that of medical tests
Distinct assessment methods provide unique and non overlapping sources of information |
|
Second myth busting revelation |
Reliance on interviews as the sole source of information leads to an incomplete and misleading understanding of the individual. When properly constructed, tests are more reliable and valid than other methods of assessing people |
|
What are scales of measurement |
Specific procedures for transforming latent constructs into numbers |
|
What are the 4 types of scales |
Nominal -purpose is to name objects
Ordinal -allows for rank -has magnitude
Interval -magnitude and equal intervals (e.g., temperature)
Ratio -magnitude, equal intervals, and absolute zero |
|
What are the properties of scales that make them different |
Magnitude -a scale has magnitude if one instance of the attribute can be said to represent more, less, or an equal amount compared to another instance. Equal intervals -the difference between two points is the same anywhere on the scale. Absolute zero -a point at which nothing of the property being measured exists |
|
What are the different descriptive statistical concepts |
1) measures of central tendency 2) measures of variability |
|
What are the 3 measures of central tendency |
Mean -the average. Median -the middle score. Mode -the most frequent score |
|
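The three measures of central tendency can be illustrated with a minimal sketch using Python's standard `statistics` module. The scores below are hypothetical, chosen only so that each value is easy to check by hand.

```python
from statistics import mean, median, mode

# Hypothetical test scores, invented for illustration only.
scores = [2, 3, 3, 5, 7]

score_mean = mean(scores)      # arithmetic average: 20 / 5
score_median = median(scores)  # middle score once the list is sorted
score_mode = mode(scores)      # most frequent score

print(score_mean, score_median, score_mode)
```

Because the single high score (7) pulls the mean above the median, this small set is slightly positively skewed.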
Measures of variability |
Range; variance; standard deviation; z scores; quartiles |
|
What are quartiles |
A distribution of scores can be divided into four parts so that 25% of the scores fall in each quarter
The points that divide the quarters are called quartiles (Q1, Q2, Q3)
A quartile is a point in the distribution (there are 3)
A quarter is an interval (there are 4) |
|
What is an interquartile range |
Is a measure of variability defined as the difference between Q3 and Q1 (Q3 - Q1)
Like the median, it is an ordinal statistic
Differences in the distances of Q1 and Q3 from Q2 tell us about the shape of a distribution -in a normal distribution Q1 and Q3 will be the same distance from Q2 -unequal distances indicate skewness |
|
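A minimal sketch of quartiles and the interquartile range, assuming Python's `statistics.quantiles` with the `inclusive` method (other quartile conventions can give slightly different cut points). The scores are hypothetical.

```python
from statistics import quantiles

# Hypothetical, evenly spread scores (already sorted for readability).
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 asks for the three cut points that split the data into four quarters.
q1, q2, q3 = quantiles(scores, n=4, method="inclusive")

iqr = q3 - q1        # interquartile range
lower_gap = q2 - q1  # distance from Q1 up to the median
upper_gap = q3 - q2  # distance from the median up to Q3

print(q1, q2, q3, iqr)
```

Here the two gaps are equal, which is what a symmetric distribution looks like; an upper gap much larger than the lower gap would point to positive skew.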
What do percentile ranks and percentiles do |
Are a ranking procedure that adjusts for the number of scores in a group
Are sample dependent descriptions of a distribution of scores |
|
What are percentiles and how are they calculated |
Indicate the particular score below which a defined percentage of scores falls
Arrange the raw score data in descending order, then percentile = (number of scores below) ÷ (total number of scores) × 100 |
|
What is percentile rank |
Indicates the percentage of scores that fall below the observed score in raw score units |
|
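As a rough sketch of the "number of scores below ÷ total number of scores × 100" rule, the hypothetical helper `percentile_rank` below counts how many scores in a group fall below an observed score. The function name and the data are assumptions made for illustration.

```python
def percentile_rank(scores, observed):
    """Percentage of scores in the group that fall below the observed score."""
    below = sum(1 for s in scores if s < observed)
    return 100 * below / len(scores)

# Hypothetical group of ten raw scores.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

pr = percentile_rank(scores, 85)  # six of the ten scores are below 85
print(pr)
```

The result depends entirely on the group supplied, which is why percentile ranks are sample dependent.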
What are the problems with percentile ranks |
Cannot be used to compare scores across different norm based tests
Arithmetic operations cannot be performed on percentiles
This is because of score compression and expansion |
|
What is the standard normal distribution |
Theoretical distribution of scores in which fixed proportions of scores fall at given distances from the mean (Gaussian curve)
Expressed in standard deviation units or z score units
Percentile ranks, z scores, and the normal distribution are all interrelated |
|
What are Z scores |
Transforms data into standardized units that are easy to interpret
Is the difference between a score and the mean, divided by the standard deviation
z score = (raw score - mean) ÷ SD |
|
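The z score formula can be sketched in a few lines. This example assumes the population standard deviation (`statistics.pstdev`); the helper name `z_scores` and the data are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(raw_scores):
    """Convert raw scores to z scores: (raw score - mean) / SD."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # assumption: population SD, not the sample SD
    return [(x - m) / sd for x in raw_scores]

# Hypothetical raw scores with mean 5 and population SD 2.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
zs = z_scores(scores)

print(zs)
```

A raw score of 9 becomes z = 2.0, meaning two standard deviations above the mean, which can be compared directly with a z score from a different test.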
Score transformations and their benefits |
Z scores are a type of linear score transformation. Transformed scores are stable across samples, and you can compare scores across tests |
|
Types of distributions |
Bimodal distribution -2 humps (modes). Positively skewed distribution -mode, median, mean in that order (right tail). Negatively skewed distribution -mean, median, mode in that order (left tail) |
|
Distribution peaks |
Leptokurtic (highest peak); mesokurtic (medium peak); platykurtic (flattest) |
|
What do we do when distributions are skewed enough |
Normalizing a distribution -stretch the distribution to make it more normal and create a corresponding scale of standard scores -this is called a normalized standard score scale. Necessary when comparisons are to be made between two or more tests |
|
Norms and norm groups |
Give meaning to raw scores. Norms are based on the distribution of scores obtained by a defined group of individuals. They give information about an individual's performance relative to performance in the standardization sample |
|
Normed tests permit |
Evaluation of individual scores against those of the norm group. Comparison of scores across different tests -assuming the same norm group is used across the different tests |
|
Two types of norms are |
Norm referenced tests -rank results because individuals' scores are compared against those of the norm group -may create competition by forcing people to do better than highly ranked others -indicate where problems lie and suggest a course of remedial action
Criterion/domain referenced tests -involve assessment of skills, abilities, or knowledge in a domain -used to design programs to increase skills -used to diagnose problems -have important implications for standards testing in education |
|
Why we need deviation IQ scores |
Standard deviations of test scores vary across age groups. This makes it impossible to compare scores across age groups. Because of this standard deviation problem, the IQ formula is not used to compute an IQ score |
|
What are deviation IQ scores |
Is a transformed score with a mean of 100 and an SD of 15, based on z = (raw score - mean) ÷ SD of the age band from the standardization sample. By correcting for SD differences, comparisons across age groups can be made |
|
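A minimal sketch of a deviation IQ, assuming the usual linear rescaling of a z score onto a scale with mean 100 and SD 15 (IQ = 100 + 15 × z). The function name, parameter names, and age band values are hypothetical.

```python
def deviation_iq(raw, band_mean, band_sd, iq_mean=100, iq_sd=15):
    """Rescale a raw score using the mean and SD of the person's own age band."""
    z = (raw - band_mean) / band_sd  # standing within the age band
    return iq_mean + iq_sd * z       # put z on the IQ scale

# Hypothetical age bands with different raw score means and SDs.
iq_child = deviation_iq(raw=34, band_mean=30, band_sd=4)    # one SD above the band mean
iq_adult = deviation_iq(raw=50, band_mean=60, band_sd=10)   # one SD below the band mean

print(iq_child, iq_adult)
```

Although the raw scores and band SDs differ, both results sit on the same scale, so the two people can be compared.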
Age grade norms |
Used in schools to assess reading and arithmetic skills
These norms differ in important ways from other norms
In other normative comparisons the norm group is known but here the norm group is unknown and heterogeneous |
|
Why age grade norms should be avoided |
When used in schools, such norms assume that there is a common curriculum for those in the norm and subject groups. It is far too easy to misinterpret high scores |
|
What should be done when given an age grade comparison |
First ask what the comparison or norm group is. If the comparison is local, ask how and on what basis the norm groups were constructed. Ask on what curriculum the test is based. Ask what the child's percentile score is and then convert it to a z score |
|
Developmental norms |
Norms that indicate progression along a developmental path -Piaget's stages, newborn Apgar scores. These are ordinal scales; they have descriptive appeal but are not suitable for statistical treatment |
|
What is an Apgar test |
A newborn is given a score of 0, 1, or 2 on each of -heart rate -respiratory effort -muscle tone -reflex irritability -skin tone. A total greater than 8 means all is well; 5-7 needs stimulation; less than 5 needs medical intervention |
|
What is a scatter plot |
Visual representation of the association between two variables
Also called a bivariate plot. The correlation coefficient summarizes the information in a scatter plot |
|
Line of best fit |
The correlation coefficient summarizes the information in a scatter plot by finding the best fitting straight line, the one that minimizes the deviations of the data points from the line and so best captures the relationship between the two variables |
|
What is correlation coefficient |
Describes both the magnitude and direction of the relationship between two variables |
|
Pearson correlation coefficient |
Pearson r is a ratio used to determine the degree to which the variance in one variable can be determined by or predicted from knowing the variance in the other
r = Σxy ÷ √(Σx² × Σy²), where x and y are deviations from their means
Equivalently, convert each score to a z score, cross multiply, sum, and divide by N: r = Σ(Zx × Zy) ÷ N
Pearson r assumes that both variables are measured on at least an interval scale and are continuous in nature |
|
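The z score route to Pearson r (convert, cross multiply, sum, divide by N) can be sketched as follows. The helper name `pearson_r` and the data are hypothetical, and the population SD is assumed so that dividing by N is correct.

```python
from statistics import mean, pstdev

def pearson_r(x, y):
    """Pearson r as the mean of the cross products of z scores."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)  # assumption: population SDs
    zx = [(v - mx) / sx for v in x]
    zy = [(v - my) / sy for v in y]
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

# Hypothetical data: y rises (or falls) in a perfectly straight line with x.
x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])
r_negative = pearson_r(x, [10, 8, 6, 4, 2])

print(round(r_positive, 6), round(r_negative, 6))
```

The sign carries the direction of the relationship and the absolute value carries its magnitude.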
What to do when variables are dichotomous |
Alternative forms of correlation coefficients are used:
1) biserial correlation coefficient 2) point biserial correlation coefficient 3) tetrachoric correlation coefficient 4) phi coefficient 5) Spearman rank order coefficient (rho) |
|
Biserial correlation coefficient |
When one variable is an artificial dichotomy and the other is continuous
Y artificial and X continuous, or Y continuous and X artificial |
|
point biserial correlation coefficient |
When one variable is a true dichotomy (agree or disagree) and the other is continuous |
|
tetrachoric correlation coefficient |
Both X and Y are artificial dichotomies |
|
phi coefficient |
Both variables are dichotomous, but one is a true dichotomy and the other is artificial |
|
Spearman rank order coefficient (rho) |
Correlation between ranked scores. Used when actual scores are unknown or scores are not normally distributed |
|
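A rough sketch of Spearman's rho using the common shortcut formula rho = 1 - 6Σd² ÷ (n(n² - 1)), where d is the difference between paired ranks. This shortcut assumes there are no tied scores; the helper names and data are hypothetical.

```python
def ranks(values):
    """Rank from 1 (lowest) to n (highest). Assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman rank order correlation via the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical data: y always increases with x, but not in a straight line.
x = [10, 20, 30, 40, 50]
rho_same_order = spearman_rho(x, [1, 4, 9, 16, 100])
rho_reversed = spearman_rho(x, [5, 4, 3, 2, 1])

print(rho_same_order, rho_reversed)
```

Because only the rank order matters, the extreme value 100 does not pull rho away from a perfect 1.0.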
What is multiple regression |
Statistical technique for studying the relationship between a dependent (criterion) variable and one or more independent (predictor) variables |
|
What are the two main uses of regression |
Prediction
Causal analysis |
|
Regression prediction |
An equation is derived that relates the criterion variable to the predictor variable
This equation is used to explain the degree to which scores on the predictor variables can predict scores on the criterion variable |
|
Regression causal analysis |
In causal analysis it is assumed that the predictor variables cause the criterion. The aim of the analysis is to -determine the relation between criterion and predictors -estimate the magnitude with which each independent variable influences the dependent variable |
|
What is the full name of multiple regression |
Ordinary least squares multiple linear regression. Multiple means that there can be more than one predictor variable. Linear means that the relation between predictors and criterion is expressed as a straight line |
|
Ordinary least squares refers to |
How the regression equation is calculated. Least squares refers to equations that minimize the squared differences between predicted and observed criterion scores, giving the best possible linear fit between predictors and criterion |
|
Regression equation |
Y' = a + bx
Y' is the predicted criterion
x is the predictor variable
b is the regression weight or slope -represents the strength and direction of the relation between x and Y
a is a constant (the intercept) -the value of Y' when x is 0 |
|
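A minimal sketch of fitting Y' = a + bx with one predictor, using the standard least squares solution b = Σ(x - mean x)(y - mean y) ÷ Σ(x - mean x)² and a = mean y - b × mean x. The helper name `fit_line` and the data are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Return (a, b) for the least squares line Y' = a + b * x."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    a = my - b * mx  # the line always passes through (mean x, mean y)
    return a, b

# Hypothetical data that follow y = 1 + 2x exactly.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

a, b = fit_line(x, y)
predicted = a + b * 6  # predicted criterion score for a new x of 6

print(a, b, predicted)
```

With real test data the points would scatter around the line, and the gaps between observed and predicted scores would be the residuals.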
What are we doing in regression |
We are predicting Y from x or, in regression terms, regressing Y onto x. Linear equations are the simplest way to describe relations between variables and still get accurate predictions |
|
What are residuals |
Deviations from prediction
Ordinary least squares is the solution that minimizes the sum of squared residuals |
|
Advantages of multiple regression |
Allows simultaneous calculation of the effect of each predictor on the criterion variable while also controlling for -the correlations between predictors -the correlations between the other predictors and the criterion. Negative regression weights are possible |
|
What is multiple R |
The overall correlation between Y and weighted x's |
|
What is R squared |
The total proportion of variance accounted for in Y by the weighted x's |
|
F ratio |
Answers whether the regression differs significantly from 0. F ratio = MS regression ÷ MS error = (SS regression ÷ k) ÷ (SS error ÷ (N - k - 1)). Df = k and N - k - 1. k = number of predictors |
|
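As a sketch of how R squared and the F ratio relate to the sums of squares, the example below assumes the standard definitions R² = SS regression ÷ SS total and F = (SS regression ÷ k) ÷ (SS error ÷ (N - k - 1)). The function name and the numbers are hypothetical.

```python
def regression_summary(ss_regression, ss_error, n, k):
    """Return R squared, the F ratio, and its two degrees of freedom."""
    df_regression = k       # one df per predictor
    df_error = n - k - 1    # remaining df after the predictors and the intercept
    r_squared = ss_regression / (ss_regression + ss_error)
    f_ratio = (ss_regression / df_regression) / (ss_error / df_error)
    return r_squared, f_ratio, df_regression, df_error

# Hypothetical sums of squares for 23 people and 2 predictors.
r_squared, f_ratio, df1, df2 = regression_summary(
    ss_regression=80.0, ss_error=20.0, n=23, k=2
)

print(r_squared, f_ratio, df1, df2)
```

The F ratio would then be compared against the F distribution with (df1, df2) degrees of freedom to judge significance.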
T test |
Used to test the significance of the regression weights. Tests whether each is significantly different from 0 |
|
Coefficient of determination |
r squared. Proportion of variance accounted for in a Pearson correlation |
|
Coefficient of alienation |
Square root of the proportion of variance unaccounted for in a Pearson correlation = square root of (1 - r squared). The proportion unaccounted for is itself 1 - the coefficient of determination |
|
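A short worked example of the two coefficients, assuming a hypothetical correlation of r = 0.6. The variable names are illustrative only.

```python
from math import sqrt

r = 0.6  # hypothetical Pearson correlation

determination = r ** 2                # proportion of variance accounted for
unaccounted = 1 - determination       # proportion of variance not accounted for
alienation = sqrt(unaccounted)        # coefficient of alienation

print(round(determination, 2), round(unaccounted, 2), round(alienation, 2))
```

Even a fairly strong correlation of 0.6 leaves most of the variance (about 64%) unexplained.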
Cross validation |
Application of the regression equation to a new sample to determine if the equation is equally predictive in another sample |
|
Shrinkage |
Decrease in R and R squared when the regression is applied to a new sample |
|
What is standard error of estimate |
It is the standard deviation of the residuals. Sxy = square root of (sum of (Y - Y') squared ÷ (N - 2)). Functions the same as a standard deviation. Standard error expresses deviations around the regression line. The regression line is a moving mean. Can be used to calculate confidence intervals |
|
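A minimal sketch of the standard error of estimate, following the formula on the card. The observed scores and the predicted values (taken here from an assumed regression line Y' = 2.2 + 0.6x for x = 1 to 5) are hypothetical, and the ± 1.96 band is only a rough large sample approximation of a 95% interval.

```python
from math import sqrt

def standard_error_of_estimate(observed, predicted):
    """Square root of the summed squared residuals divided by N - 2."""
    ss_residual = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    return sqrt(ss_residual / (len(observed) - 2))

# Hypothetical observed criterion scores and their predicted values.
observed = [2, 4, 5, 4, 5]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]

see = standard_error_of_estimate(observed, predicted)

# Rough 95% band around one predicted score (assumption: normal residuals).
lower = predicted[2] - 1.96 * see
upper = predicted[2] + 1.96 * see

print(round(see, 3), round(lower, 2), round(upper, 2))
```

A smaller standard error of estimate means the observed scores cluster more tightly around the regression line.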
What is factor analysis |
Statistical technique designed to examine the correlations between tests or test items to determine the underlying factor structure. Used to identify whether a pattern of correlations derived from responses to test items can be explained by a smaller number of dimensions called factors |
|
Unidimensional analysis |
Suggests that all the items measure the same underlying construct. Determined by the nature of the items |
|
Multidimensional analysis |
There is more than one factor underlying responses to the test items. Determined by the nature of the items |
|
Steps of factor analysis |
1) A person by item matrix is entered into the program 2) The program first creates an item x item correlation matrix 3) Items that correlate strongly with each other and weakly with other items form a factor (factor 1) 4) Remaining items that correlate with each other but not with the first factor form further factors 5) Factors may be correlated among themselves |
|
Two types of factor analytic procedures |
Confirmatory factor analysis; exploratory factor analysis |
|
Confirmatory factor analysis |
Test constructors use theoretically derived hypotheses to predict how many factors would be expected from item responses. Main way of establishing construct validity |
|
Exploratory factor analysis |
Is used when the number of factors is unknown. Should not be used to test construct validity because the number of factors was not specified beforehand. Confirmatory analysis can be used after exploratory analysis |