A researcher is interested in determining the effects of a new behavior modification program and a new drug in increasing the vocabulary of mentally retarded children. The various groups of subjects will receive one of four dosages of the drug (placebo, 10 mg, 20 mg, and 30 mg) and either the program or attention without the program. Afterwards, the WISC-III vocabulary subtest will be administered to subjects.
Independent variable(s) and levels; dependent variable(s):
IVs: Behavior modification (levels = program and attention only) and drug (levels = placebo, 10 mg, 20 mg, and 30 mg)
DV: Scores on the WISC-III vocabulary subtest

A researcher wants to know if high school and college students differ in their attitudes toward affirmative action. He divides both groups of students into African American and Caucasian groups and administers the measure to each of the four groups.
Independent variable(s) and levels; dependent variable(s):
IVs: School level (high school and college) and race (Caucasian and African American)
DV: Attitudes toward affirmative action 

A researcher wants to know if elderly individuals diagnosed with Alzheimer's disease differ from non-Alzheimer's elderly individuals in terms of a variety of physiological measures, including pulse rate, blood pressure, EEG patterns, knee reflex, and white blood cell count.
Independent variable(s) and levels; dependent variable(s):
IV: Diagnosis (Alzheimer's and no Alzheimer's)
DVs: Pulse rate, blood pressure, EEG patterns, knee reflex, and white blood cell count

A therapist devises a new form of psychotherapy. She obtains separate samples of depressed, anxious, psychotic, and personality-disordered clients. She then randomly assigns subjects into groups that will receive either her new form of therapy, traditional psychoanalysis, cognitive therapy, or humanistic therapy. Before and after therapy is completed, she obtains the subjects' WAIS-III full-scale IQ scores, as well as their scores on the BDI, the BAI, and the MMPI's Paranoia and Schizophrenia scales. She also believes that outcome depends on economic status, so she further divides subjects into high, medium, and low income groups.
Independent variable(s) and levels; dependent variable(s):
IVs: Pathology (depressed, anxious, psychotic, personality-disordered), therapy (new, psychoanalysis, cognitive, humanistic), and income (high, medium, and low)
DVs: WAIS-III full-scale IQ scores, BDI scores, BAI scores, and MMPI Paranoia and Schizophrenia scale scores

What is the threat to internal validity that is most salient?
In a research study testing the effects of a new strategy for increasing short-term memory, subjects have to wait three hours in a classroom before the study begins. As a result, they become very tired and cannot concentrate on learning the strategy.
Maturation


What is the threat to internal validity that is most salient?
A drug designed to improve emotional and psychosocial functioning is administered to severely depressed individuals. 
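Statistical regression (regression to the mean): subjects selected for extreme scores tend to score closer to the mean on retesting, with or without treatment.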


A film designed to increase the racial awareness of white college students is shown the day after a leading civil rights activist spoke at the college.

history


A new diet designed to help obese subjects lose weight is studied. Subjects are to be weighed before and after they go on the diet. Before they are weighed for the second time, the scale breaks.

instrumentation


A study is conducted to test the effectiveness of Academic Review's workshops. Most of the subjects in the study have taken the psychology licensing exam before.

testing


In a study conducted to test the effectiveness of a new reading strategy on reading comprehension, the first 20 subjects who sign up are assigned to the experimental group and the second 20 subjects who sign up are assigned to the control group.

selection


Cues in the experimental setting that allow subjects to guess the research hypothesis.

demand characteristics


The effect that an experimenter's expectancy has on the results of a research study.

Rosenthal Effect


A procedure designed to ensure that all subjects in the population of interest have an equal chance of being chosen to participate in a research study.

random sampling


The tendency of subjects' behavior to change due to the attention received in a research setting.

Hawthorne Effect


A procedure designed to ensure that all subjects in a research study have an equal probability of ending up in each of the treatment groups.

Random assignment


What is a statistical method of controlling for the effects of an extraneous variable?

ANCOVA


What is a procedure that involves grouping subjects who are similar in terms of their status on an extraneous variable and then assigning the members of each group to different treatment groups?

matching


What is a procedure in which a population is divided into "subpopulations" and all members of each subpopulation have an equal probability of being chosen to participate in the research study?

stratified random sampling


What is a study of the effects of aging on psychosocial adjustment that involves comparing older, middle-aged, and younger subjects at one point in time?

cross-sectional research


What is a study of the effects of aging on psychosocial adjustment that involves comparing older, middle-aged, and younger subjects at different points in time?

cross-sequential research


What is a single-subject design in which the treatment is withdrawn to determine whether dependent variable scores revert to baseline levels?

reversal design


What is an in-depth study of a single individual, institution, group, or phenomenon?

case study


What involves manipulated variable(s) and random assignment?

true experimental research


What involves manipulated variables and nonrandom assignment?

quasi-experimental research


What is a study that involves obtaining dependent variable scores from a group of subjects on multiple occasions at regular intervals?

time-series design


What is a study of a single conduct-disturbed adolescent in which the effectiveness of a new behavior modification program is assessed first at school, then at home, and then in the community?

Multiple baseline design


What is a study assessing the association between SAT scores and college GPA?

correlational research


What is a study of the effects of aging on IQ scores in which one group of subjects is examined for 30 years?

longitudinal research


What is the greatest threat to validity when a mail survey is conducted?
a. selection b. maturation c. randomization d. instrumentation 
A. Selection. In a mail survey, subjects self-select into the study; i.e., in deciding whether or not to mail the survey back, they determine who the study's participants will be. Thus, selection poses a threat to any mail survey's validity.


Which of the following statements regarding the different types of developmental research (longitudinal, cross-sectional, cross-sequential) is most true?
a. Cross-sequential and longitudinal studies are particularly vulnerable to "cohort" effects. b. Cross-sequential studies are more costly in terms of time and money to conduct than longitudinal studies. c. Of the three types of developmental studies, cross-sectional studies offer the greatest internal validity. d. By combining the methodology of cross-sectional and longitudinal studies, cross-sequential studies reduce many of the problems associated with both.
D. A cross-sequential study, like a cross-sectional study, involves studying groups of subjects that are divided on the basis of age. And, like a longitudinal study, it involves examining the subjects for a period of time, though this period is shorter in cross-sequential studies. Cross-sequential studies, because they involve studying subjects over time, control for "cohort" effects, which often confound cross-sectional studies. And, because they are shorter than longitudinal studies, there is less cost in terms of time and money, and less subject dropout.


The defining feature of true experimental designs is
a. random selection of subjects from the population b. random assignment of subjects into experimental groups. c. the use of manipulated variables d. the use of nonmanipulated variables. 
B. In a true experiment, variables are manipulated and subjects are randomly assigned to treatment groups. Choice C is incorrect because other types of designs (e.g., quasi-experiments) also use manipulated variables.


Francis Galton's concept of regression to the mean is best expressed by which of the following statements?
a. Individual variation within a species is unlimited. b. Short fathers have tall sons. c. Short fathers have taller sons. d. Fathers and sons end up having similar heights.
C. Regression to the mean refers to the tendency of extreme observations to be less extreme upon retesting or reobservation. Francis Galton applied this concept to heredity. He concluded that, due to regression to the mean, individual variation in the species is limited (the opposite of choice A). That is, since extreme individuals will likely have less extreme offspring (e.g., short fathers are likely to have taller sons), the characteristics of a species can only vary within a limited range. Note that you did not need to know anything about Galton to answer this question. You just needed to apply what you know about regression to the mean to a new situation.


All of the following are true of multiple baseline designs, except
a. a treatment is sequentially applied. b. they may serve as a substitute when the ABAB design is unethical. c. they may involve studying the same treatment for different behaviors, in different settings, or with different subjects. d. they involve the administration and then the withdrawal of a treatment.
D. A multiple baseline study is a single-subject study that involves the sequential application of a treatment across different baselines (i.e., behaviors, settings, or individual subjects). Unlike a reversal design, a multiple baseline design does not entail withdrawal of the treatment.


The major threat to the internal validity of a one-group time-series design is
a. maturation b. regression to the mean c. history d. testing
C. A one-group time-series design involves administering multiple pretests and posttests to one group of subjects before and after a treatment is administered. The design controls for many threats to internal validity, such as maturation, testing, and statistical regression. The major threat to its internal validity is history, i.e., an external event that occurs at about the same time the treatment is administered.


A major advantage of case studies is that they
a. can be used to identify variables for future research. b. involve the study of only one individual c. allow one to draw conclusions about the causal relationship between two or more variables d. permit the generalization of results to other cases 
A. Although case studies cannot tenably be used to identify causal relationships between variables, they are often useful as pilot studies to identify variables and hypotheses for further investigation.


A major disadvantage of case studies is that they
a. can never be used to identify variables for future research b. involve the study of only one individual c. do not permit conclusions to be made about causal relationships between variables d. are frowned upon by journal editors, college professors, and other "scientifically correct" individuals. 
C. Case studies do not permit conclusions to be made about causal relationships between variables.


Data collected in research studies can be classified into four types:

nominal, ordinal, interval, ratio


A(n) _________ scale of measurement contains unordered categories; examples include gender, DSM diagnosis, and hair color.

nominal


______ data are quantified into ordered categories; however, with such data, it is impossible to determine the distance between data points. Examples include ranks and points on an attitude scale.

ordinal


____ data are continuous data in which the distance between successive data points is equal across the scale. However, there is no absolute zero point; as a result, multiplication and division with such data are not possible. Examples include IQ scores and degrees Fahrenheit.

interval


Finally, _____ data are the same as interval data except that they include an absolute zero point, so multiplication and division can be performed.

ratio


In a ____ distribution, most observations (i.e., scores) fall in the middle of the distribution, with fewer and fewer cases as one moves farther away from the middle.

normal


In a _________ distribution, most scores fall at the high end of the distribution, with a few extreme scores falling at the low end.

negatively skewed


In a _______ distribution most scores are low and a few extreme scores are high.

positively skewed


In a _______ distribution, the mode is higher than the median, which is higher than the mean.

negatively skewed


In a _____ distribution, the mean is higher than the median, which is higher than the mode.

positively skewed
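
The ordering of the mean, median, and mode in a skewed distribution can be checked on a small data set. The sketch below uses made-up scores (an assumption for illustration only) in which most values are low and one extreme value is high, i.e., a positively skewed set.

```python
from statistics import mean, median, mode

# Hypothetical scores: most are low, one extreme score is high (positive skew).
scores = [1, 1, 1, 2, 2, 3, 4, 10]

score_mode = mode(scores)      # most frequent value: 1
score_median = median(scores)  # average of the two middle values (2 and 2)
score_mean = mean(scores)      # 24 / 8, pulled upward by the extreme score

# In a positively skewed set, mean > median > mode.
print(score_mean > score_median > score_mode)
```

Reversing the data (a few extreme low scores) would reverse the ordering, which is the negatively skewed case.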


In a _____ distribution, the mean, the median, and the mode are all equal.

normal


The variance is a measure of the ______ of a distribution.

variability (or dispersion, or spread)


The standard deviation, a measure of the same property, is obtained by taking the ________ of the variance.

square root
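
As a small illustration of the variance and standard deviation relationship, the sketch below uses hypothetical scores and the population formulas (dividing by N); sample formulas divide by N - 1 instead and would give slightly larger values.

```python
import math
from statistics import pvariance, pstdev

# Hypothetical scores with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Squared deviations from the mean sum to 32, and 32 / 8 = 4.
variance = pvariance(scores)

# The standard deviation is the square root of the variance.
sd = pstdev(scores)

print(variance, sd, math.isclose(sd, math.sqrt(variance)))
```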


A z-score is an individual score expressed in terms of standard deviation units above or below the mean. For example, in a distribution with a mean of 80 and a standard deviation of 2, a score of 76 would be equivalent to a z-score of ____, and a z-score of +3.0 would be equivalent to a raw score of ____.

-2.0; 86


The formula for a z-score is ________, where X = ______, M = ________, and s.d. = _______.

z = (X - M) / s.d.
X = raw score; M = mean; s.d. = standard deviation
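
The z-score formula and its inverse can be sketched in a few lines. The function names below are illustrative, and the numbers come from the example with a mean of 80 and a standard deviation of 2.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def raw_score(z, mean, sd):
    """Invert the z-score formula to recover the raw score."""
    return mean + z * sd


print(z_score(76, 80, 2))     # -2.0
print(raw_score(3.0, 80, 2))  # 86.0
```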

A percentile rank is a transformed score that reflects the percentage of scores falling ______ the corresponding raw score. For example, a PR of 80 is higher than ____% of the other scores in the distribution; it also could be said to be in the top ____%.

below
80%; 20%

By definition, percentile ranks have a _____ distribution; for example, in any distribution, the number of scores falling between the values of 10 and 20 is equivalent to the number of cases falling between 80 and 90.

flat (or rectangular)


Therefore, almost all transformations of raw scores to percentile ranks would be termed ______ since they would involve a change of the original distribution's shape.

nonlinear


In a normal distribution, approximately ____% of scores fall between the z-scores of -1.0 and +1.0.

68%


About ___% of scores fall between the z-scores of -2.0 and +2.0.

95%


Say that 1,000 people take the WAIS-III, on which the mean IQ is 100 and the standard deviation is 15. About 680, or ___%, will obtain z-scores between ____ and ____; i.e., they will obtain IQs between ____ and ____. And about 950, or ____%, will obtain z-scores between ____ and ____; i.e., they will obtain IQs between ____ and ____.

68%; -1.0 and +1.0; 85 and 115
95%; -2.0 and +2.0; 70 and 130

In a normal distribution, it is possible to determine the z-score equivalents of given percentile rank points. For example, a z-score of +1.0 is equivalent to a percentile rank of about _____, and a percentile rank of 98 is approximately equivalent to a z-score of _____.

84; +2.0
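
The rounded figures used on these cards (68%, 95%, PR 84, z of +2.0) can be compared with the exact normal-curve values. This is a minimal sketch using `NormalDist` from Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Proportion of scores within one and two standard deviations of the mean.
within_1sd = std_normal.cdf(1.0) - std_normal.cdf(-1.0)   # about 0.683
within_2sd = std_normal.cdf(2.0) - std_normal.cdf(-2.0)   # about 0.954

# Percentile rank of z = +1.0, and the z-score at the 98th percentile.
pr_at_z1 = std_normal.cdf(1.0) * 100       # about 84
z_at_pr98 = std_normal.inv_cdf(0.98)       # about 2.05

print(round(within_1sd, 3), round(within_2sd, 3), round(pr_at_z1, 1), round(z_at_pr98, 2))
```

The exact values differ slightly from the rounded ones, which is why the cards say "approximately."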


If you had a test with a mean of 25 and a standard deviation of 5, you would set the cutoff score at ____ if you wanted to select the top 16% of examinees and at ____ if you wanted to select the top 2% of examinees.

30
35 

A person receives a score of 90 on a test with a mean of 100 and a standard deviation of 5. The corresponding z-score is ____, the corresponding T-score is _____, and the corresponding stanine score is approximately ____.

-2.0
30; 1

In addition, if the distribution is normal, we would know that the corresponding percentile rank is around ____.

2 (i.e., about the 2nd percentile)

If the score were converted to a WAIS-III IQ score (mean = 100, s.d. = 15), the new transformed score would be ____. And if the score were converted to an ETS score (i.e., SAT and GRE score, mean = 500 and s.d. = 100), the new transformed score would be ____. (See page 60, vol. 6, Research Design & Statistics.)

70
300 
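
All of these conversions follow the same two steps: turn the raw score into a z-score, then rescale to the new mean and standard deviation. A minimal sketch (the `convert` helper is a hypothetical name, not from the source):

```python
def convert(score, old_mean, old_sd, new_mean, new_sd):
    """Re-express a score on a scale with a different mean and standard deviation."""
    z = (score - old_mean) / old_sd
    return new_mean + z * new_sd


# A raw score of 90 on a test with mean 100 and s.d. 5 (z = -2.0).
t_score = convert(90, 100, 5, 50, 10)     # T-scores: mean 50, s.d. 10
iq_score = convert(90, 100, 5, 100, 15)   # IQ scale: mean 100, s.d. 15
ets_score = convert(90, 100, 5, 500, 100) # ETS scale: mean 500, s.d. 100

print(t_score, iq_score, ets_score)  # 30.0 70.0 300.0
```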

If you convert raw scores to z-scores you would be conducting a
a. linear transformation because the shape of the distribution changes b. linear transformation because the shape of the distribution does not change
B. When raw scores are converted to z-scores, the shape of the distribution does not change. For instance, if the distribution of raw scores is normal, the distribution of the corresponding z-scores will also be normal. When transformed scores retain the same shape as the original distribution, the transformation is said to be "linear."


Eight students take a math test and obtain the following scores: 80, 53, 39, 32, 45, 72, 28, 49. The median score of this distribution is:

To answer this question, first arrange the numbers in numerical order: 28, 32, 39, 45, 49, 53, 72, 80. Because there is an even number of scores, the median is the mean of the two middle scores (45 and 49), which is 47.
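
The same steps can be checked with Python's standard library, which sorts the scores and averages the two middle values when the count is even:

```python
from statistics import median

scores = [80, 53, 39, 32, 45, 72, 28, 49]

ordered = sorted(scores)          # [28, 32, 39, 45, 49, 53, 72, 80]
median_score = median(scores)     # mean of 45 and 49

print(ordered)
print(median_score)  # 47.0
```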


One thousand people take a job selection test that has a mean of 60 and a standard deviation of 5. An industrial psychologist wants to select the top 150 scorers. Assuming a normal distribution of scores, she would set the cutoff score at approximately:

First, you have to recognize that the "top 150" is equivalent to the top 15% (150/1,000 = 15/100 = 15%). Then, you have to remember that, in a normal distribution, about 16% of all scores fall at or above a z-score of +1.0. Finally, you have to convert a z-score of +1.0 to a raw score. In this case, a score of 65 is one standard deviation above the mean and therefore is equivalent to a z-score of +1.0.
You might have been thrown by the fact that you were looking for the top 15% even though the normal curve only allows you to identify the cutoff score for the top 16%. If so, you might remind yourself at this point to work on the exam with rounded-off numbers (actually, you'll have no choice, since no calculators will be allowed). Fifteen percent is close enough to 16% for you to use the normal curve.
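
To see how close the rounded answer is, the exact cutoff for the top 15% can be computed from the inverse of the normal curve. This sketch assumes scores are exactly normal with mean 60 and standard deviation 5.

```python
from statistics import NormalDist

test = NormalDist(mu=60, sigma=5)

# Score below which 85% of examinees fall, i.e., the cutoff for the top 15%.
exact_cutoff = test.inv_cdf(1 - 0.15)   # about 65.2

# Rounded approach from the card: one standard deviation above the mean.
approx_cutoff = 60 + 1.0 * 5

print(round(exact_cutoff, 2), approx_cutoff)
```

The two values differ by less than a fifth of a point, so working from the 16% landmark is a reasonable shortcut.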

Judy and Johnny are students in a school district that is administered a standardized mathematics test. Judy scores in the 48th percentile on the test, while Johnny scores in the 93rd percentile. Scores on the test are normally distributed. A few weeks after the scores are reported, a scoring error is discovered, and as a result, three points are added to both Judy's and Johnny's raw scores. No changes are made to the scores of any other students in the district. Given these facts, which of the following statements is true?
a. Judy's and Johnny's percentile ranks will increase by the same amount. b. Judy's percentile rank will increase more than Johnny's. c. Johnny's percentile rank will increase more than Judy's. d. Neither Johnny's nor Judy's percentile rank will change.
B. The answer to this question is related to the fact that, in a normal distribution, there are more scores in the middle of the distribution than at either extreme. As a result, a given raw score interval covers a much wider percentile rank range in the middle of the distribution than it does at either end. Thus, any change to a raw score in the middle of the distribution results in a greater percentile rank change than the same raw score change at the distribution's extremes. In this case, Judy originally scored at the 48th percentile, which is near the middle of the distribution, while Johnny scored at the 93rd percentile, or at the high end of the distribution. Therefore, adding three points to their raw scores will result in a greater increase in Judy's percentile rank than in Johnny's; due to the change, Judy will "jump over" a greater percentage of other students than will Johnny.
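
A rough numerical illustration of this point follows. The question does not give the test's mean or standard deviation, so the values below (mean 50, s.d. 10) are assumptions chosen only to make the comparison concrete.

```python
from statistics import NormalDist

# Assumption: raw scores have mean 50 and s.d. 10 (hypothetical values).
test = NormalDist(mu=50, sigma=10)


def percentile_gain(start_percentile, points_added=3):
    """Percentile-rank change when points are added to one student's raw score."""
    raw = test.inv_cdf(start_percentile / 100)
    new_percentile = test.cdf(raw + points_added) * 100
    return new_percentile - start_percentile


judy_gain = percentile_gain(48)     # starts near the middle of the distribution
johnny_gain = percentile_gain(93)   # starts near the high end

print(round(judy_gain, 1), round(johnny_gain, 1))
```

Under these assumed values Judy gains roughly 12 percentile points and Johnny roughly 3; the exact numbers depend on the assumed standard deviation, but the direction of the difference does not.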


The deviation of a sample statistic from a parameter of the population from which the sample was drawn?

sampling error


The probability of rejecting a true null hypothesis?

alpha


The probability of retaining a false null hypothesis?

beta


The probability of rejecting a false null hypothesis?

power


A researcher hypothesizes that students who sleep with their textbooks under their pillow score higher on the GRE than students who don't. He obtains a sample of 20 students and assigns 10 to the "books under pillow" group and 10 to the "no books under pillow" group. He concludes, on the basis of a statistical test, that his hypothesis was correct. In the population, however, there is no difference on the GRE between groups. What kind of error did he make?

Type I error


A researcher hypothesizes that cognitive therapy is superior to other forms of therapy in the treatment of anxiety. She fails to find any evidence that cognitive therapy is superior. However, in reality, cognitive therapy is the superior treatment. What error did she make?

Type II error: she retained a false null hypothesis.


In statistical hypothesis testing, because we cannot study the entire population, sample values are used to estimate population values (a value obtained from a sample is referred to as a(n) _____, while a value obtained from a population is referred to as a(n) _____).

statistic; parameter


The discrepancy between a sample value and the corresponding population value is referred to as ______.

sampling error


The mean is one example of a population value that is estimated on the basis of sample data. The expected discrepancy between a sample mean and a population mean is referred to as the _____.

standard error of the mean


The formula for the standard error of the mean is s.d./square root of N, where s.d. equals ____ and N equals ____.

standard deviation
sample size

The ______ hypothesis of most research studies posits that there is no relationship between the independent variable(s) and the dependent variable(s).

null


The null hypothesis is usually stated in terms of population ____; an example would be "the mean of population A is equal to the mean of population B."

parameters


The ______ hypothesis usually posits that there is a relationship between the independent variables and the dependent variables. This hypothesis can either be _______ (e.g., one population mean is different from the other) or ______ (e.g., one population mean is greater than the other).

alternative
nondirectional; directional

In statistical decision-making, four outcomes are possible: two are correct decisions, and two are errors. One type of correct decision would be to ____ a true null hypothesis. A second type of correct decision, the goal of research, would be to ___ a false null hypothesis. The probability of making the latter correct decision is referred to as ____.

retain
reject; power

One of the incorrect decisions would be to retain a ______ null hypothesis. This is referred to as a(n) ____ error, and the probability of making it is known as ______.

false
Type II; beta

The other incorrect decision would be to reject a(n) _____ null hypothesis. This is referred to as a(n) _____ error, and the probability of making it is known as ____.

true
Type I; alpha

_____________ statistical tests are used to test statistical hypotheses when the dependent variable is measured on an interval or ratio scale. Such tests make two assumptions: 1) _____ and 2) ______________.

parametric
1) normal distribution of data; 2) homogeneity of variance

Methods designed to test statistical hypotheses when the dependent variable is measured on a nominal or ordinal scale are referred to as ________. These tests don't make the same assumptions as _____ tests. However, tests in both categories do assume that the sample is _____ of the _____ from which it was obtained.

nonparametric
parametric; representative; population

In a research study with 400 subjects, the standard deviation of scores on the dependent variable is 20. In this case, the standard error of the mean is:

The standard error of the mean is equal to the standard deviation divided by the square root of the sample size. The square root of 400 is 20. Thus the standard error of the mean in this case is 20/20, or 1.


The standard error of the mean is
a. directly proportional to the standard deviation and inversely proportional to the sample size. b. directly proportional to the standard deviation and directly proportional to the sample size. c. inversely proportional to the standard deviation and directly proportional to the sample size. d. inversely proportional to the standard deviation and inversely proportional to the sample size. 
A. As the population standard deviation increases, the standard error of the mean increases; in other words, the standard error of the mean is directly related (i.e., directly proportional) to the standard deviation. And as the sample size increases, the standard error of the mean decreases; in other words, sample size and the standard error of the mean are inversely related (strictly speaking, the standard error is inversely proportional to the square root of the sample size).
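
The formula s.d. / square root of N makes both relationships easy to see. A minimal sketch (the function name is illustrative), starting from the example with a standard deviation of 20 and 400 subjects:

```python
import math


def standard_error_of_mean(sd, n):
    """Standard error of the mean: standard deviation over the square root of N."""
    return sd / math.sqrt(n)


base = standard_error_of_mean(20, 400)        # 20 / 20 = 1.0
bigger_sd = standard_error_of_mean(40, 400)   # doubling s.d. doubles the error
bigger_n = standard_error_of_mean(20, 1600)   # quadrupling N halves the error

print(base, bigger_sd, bigger_n)  # 1.0 2.0 0.5
```

Note that quadrupling the sample size only halves the standard error, because N enters through its square root.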


Which of the following assumptions is shared by both parametric and nonparametric tests?
a. normal distribution of data b. homogeneity of variance c. random assignment of subjects to experimental groups d. random selection of subjects from the population 
d. Both parametric and nonparametric tests are inferential statistical methods. This means that they are used to draw conclusions about a population on the basis of information derived from a sample. For these conclusions to be unbiased and accurate, a sample must be representative of the population from which it is drawn. The best way to ensure that a sample is representative is to randomly select subjects from the population of interest.


When a statistical test lacks power, this means that
a. the probability of making a Type I error will be high. b. the probability of making a Type II error will be low. c. the probability of obtaining statistical significance will be low. d. the probability of getting one's research....
C. When a statistical test lacks power, this means that there is a high probability of a Type II error, or that a false null hypothesis will be retained; i.e., the test will be unable to detect a true effect of an independent variable on a dependent variable. Put another way, the test will not yield statistical significance (a finding of an effect) when it should.


Alpha can be defined as
a. the probability of rejecting the null hypothesis when the null hypothesis is true. b. the probability of retaining the null hypothesis when the null hypothesis is true. c. the probability of rejecting the null hypothesis when the null hypothesis is false. d. the probability of retaining the null hypothesis when the null hypothesis is false.
A. Alpha is the probability of making a Type I error, which is defined by choice A. In plain English, this means that alpha is the probability that a statistical test will falsely tell you that your independent variable has an effect when, in the population, it does not.


Which of the following would have the least meaning?
a. retaining the null hypothesis when power is low. b. rejecting the null hypothesis when power is low. c. retaining the null hypothesis when power is high. d. rejecting the null hypothesis when power is high.
A. When power is low, a statistical test is unlikely to detect an effect of an independent variable, even when one is present in the population. In other words, the null hypothesis (the hypothesis of no effect) is likely to be retained. In such cases, when you retain the null, it does not necessarily mean that you have done so correctly; it could just be that the test lacked the power to correctly reject the null (i.e., to detect a true effect). So retaining the null with low power doesn't really tell you anything.
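
To make "low power" concrete, here is a sketch that computes power for one simple case: a one-tailed, one-sample z-test with a known standard deviation. The test type, the effect size of 0.5 standard deviations, and the sample sizes are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def power_one_tailed_z(effect_size, n, alpha=0.05):
    """Power of a one-tailed one-sample z-test; effect_size is in s.d. units."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)  # critical value that fixes the Type I error rate
    return 1 - z.cdf(z_crit - effect_size * math.sqrt(n))


low_power = power_one_tailed_z(0.5, n=10)    # under 0.5: a real effect is often missed
high_power = power_one_tailed_z(0.5, n=50)   # over 0.9: a real effect is usually found
no_effect = power_one_tailed_z(0.0, n=50)    # with no true effect, rejection rate = alpha

print(round(low_power, 2), round(high_power, 2), round(no_effect, 2))
```

With only 10 subjects, retaining the null is the more likely outcome even though the effect is real, which is why such a result carries little information.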


Subjects take the BDI before and after a six-week trial period on the drug. Stat test?

t-test for correlated samples


Instead of taking the BDI, subjects are classified by raters as either "treatment successes" or "treatment failures." Stat test?

chi-square


For control and experimental subjects, scores on the MMPI's depression scale are obtained in addition to those from the BDI. Stat test?

MANOVA (one independent variable, two dependent variables)


Subjects are randomly assigned to either the control (no-drug) or the experimental (drug) group. Stat test?

t-test for independent samples


The researcher is interested in determining if the effects of the drug are different at different levels of symptom severity (highly depressed, moderately depressed, and not depressed). Stat test?

factorial ANOVA


Subjects are randomly assigned to either the control or the experimental group, and scores on the BDI are converted to ranks. Stat test?

Mann-Whitney U


The mean score of subjects who take the drug is compared to the population mean for depressives on the BDI. Stat test?

t-test for a single sample


Subjects are assigned to one of four groups: high dosage, moderate dosage, low dosage, and control. Stat test?

one-way ANOVA


Scores on the BDI are adjusted so that variability accounted for by the subjects' scores on a test of self-esteem is removed. Stat test?

ANCOVA


The magnitude of the F-ratio for a one-way ANOVA depends on the ratio between two sources of variance in a set of dependent variable scores. If _____ variance significantly exceeds ______ variance, then the F-ratio will be high and the null hypothesis will be (rejected/retained).

between-group
within-group; rejected

If the ______ variance equals or exceeds _____ variance, then the F-ratio will be low, and the null hypothesis will be (rejected/retained). The F-ratio is a fraction with _____, a measure of ________ variance, in the numerator. And this fraction has _____, a measure of _____ variance, in the denominator.

within-group
between-group; retained; MSB; between-group; MSW; within-group
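
The F-ratio (MSB over MSW) can be computed by hand for a small data set. The three groups below are hypothetical scores invented for illustration; the function follows the standard one-way ANOVA sums of squares.

```python
from statistics import mean


def f_ratio(groups):
    """One-way ANOVA F-ratio: mean square between over mean square within."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)            # number of groups
    n_total = len(all_scores)  # total number of subjects

    # Between-group variability: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variability: how far each score is from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    msb = ss_between / (k - 1)
    msw = ss_within / (n_total - k)
    return msb / msw


# Hypothetical scores for three treatment groups.
groups = [[2, 3, 4], [4, 5, 6], [9, 10, 11]]
f_value = f_ratio(groups)

print(f_value)  # 39.0
```

Here the group means (3, 5, 10) are far apart relative to the spread inside each group, so MSB (39) dwarfs MSW (1) and F is large.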

In studies with more than one independent variable, a(n) _____ effect occurs when the effects of one independent variable do not generalize to all the _____ of one of the other independent variables. A _____ ANOVA provides an indication of the strength of this effect. If this effect is present, ____ effects must be interpreted with caution.

interaction
levels; factorial; main

The nonparametric alternative to a t-test for independent samples is the
a. Kruskal-Wallis b. Wilcoxon matched-pairs c. Mann-Whitney U d. t-test for correlated samples
C. Mann-Whitney U. When a study involves a comparison of two independent groups and interval or ratio data, the t-test for independent samples would be used to compare the means of the two groups. If the assumptions of a parametric test are violated, the data would be converted to ranks and the Mann-Whitney U would be used. The Mann-Whitney U is the nonparametric alternative to the t-test for independent samples.
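
The "convert to ranks" step can be sketched directly. The scores below are hypothetical, and this simplified version assumes no tied scores (ties would need averaged ranks, which a library routine such as SciPy's handles).

```python
def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U statistic for two independent groups (assumes no ties)."""
    combined = sorted(group_a + group_b)
    # Rank 1 goes to the lowest score in the pooled data.
    rank = {score: i + 1 for i, score in enumerate(combined)}

    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(rank[x] for x in group_a)

    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)  # the smaller U is compared with the critical value


# Hypothetical depression scores for control and drug groups.
control = [12, 15, 18, 20]
drug = [5, 7, 9, 14]

u_value = mann_whitney_u(control, drug)
print(u_value)  # 1.0
```

A U near zero means the two groups' ranks barely overlap; here only one drug-group score (14) outranks a control score (12).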


An advantage of using a MANOVA instead of multiple one-way ANOVAs is that
a. a MANOVA is computationally simpler b. multiple ANOVAs cannot be used when a study involves more than one dependent variable c. the probability of making a Type II error is reduced d. the probability of making a Type I error is reduced.
D. Each separate ANOVA carries its own risk of a Type I error, so running one ANOVA per dependent variable inflates the experimentwise error rate. A MANOVA analyzes all of the dependent variables simultaneously in a single test, which keeps the overall probability of a Type I error at the chosen alpha level.


The use of which of the following post-hoc tests results in the greatest probability of making a Type II error?
a. Tukey b. Scheffé c. Fisher's LSD d. Newman-Keuls
B. Scheffé. Of all the post-hoc tests, the Scheffé is the most conservative, which means that it provides the greatest protection against a Type I error. However, since there is a tradeoff between Type I and Type II errors, this also means that its use results in the greatest probability of making a Type II error (i.e., missing an effect).


When a factorial ANOVA yields a significant main effect and a significant interaction effect,
a. the main effect should be ignored b. the main effect should be interpreted in light of the interaction effect c. the interaction effect should be ignored d. the interaction effect should be interpreted with caution
B. Whenever both a main effect and an interaction effect exist, the main effect must be interpreted in light of the interaction effect. This is because the interaction means that the main effect does not hold true in all cases (i.e., at all levels of another independent variable).


A researcher is interested in the correlation between gender and homeownership. Stat corr?

phi coefficient
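
As a minimal sketch of how phi is computed for two true dichotomies, the function below uses the standard 2x2 formula, phi = (ad - bc) / sqrt((a+b)(c+d)(a+c)(b+d)). The function name and the cell counts are hypothetical and are not taken from the cards.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table of counts laid out as [[a, b], [c, d]]."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Hypothetical counts: rows = gender, columns = homeowner / not a homeowner
phi = phi_coefficient(30, 20, 15, 35)
print(round(phi, 3))
```

A table with all cases on one diagonal gives phi = 1 (or -1), and a table with equal proportions in both rows gives phi = 0.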


A researcher is interested in the correlation between gender and scores on the BDI?

point-biserial coefficient


A researcher is interested in the correlation between scores on the BDI and IQ scores on the WAIS. Scores of 20 or above on the Beck are reported as "depressed," whereas scores below 20 are reported as "not depressed."

biserial coefficient


A researcher is interested in the correlation between DSM diagnostic category and political party.

contingency coefficient


A researcher is interested in the correlation between motivation and scores on a prof licensing exam. She wishes to statistically remove the effects of IQ on this relationship.

partial correlation
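
A first-order partial correlation can be computed from the three pairwise correlations alone. The sketch below uses the standard formula; the function name and the correlation values are hypothetical illustrations (motivation = X, exam score = Y, IQ = Z).

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y with Z statistically removed from both."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical values: r(motivation, exam), r(motivation, IQ), r(exam, IQ)
r_partial = partial_correlation(0.50, 0.40, 0.60)
print(round(r_partial, 3))
```

When Z is unrelated to both X and Y, the partial correlation equals the original correlation between X and Y.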


A procedure designed to assess the causal interrelationships among three or more variables.

path analysis


A researcher is interested in using annual income in dollars to predict scores on a measure of happiness in the elderly.

simple regression


A researcher is interested in using income, an index of support system adequacy, and an index of overall health to predict scores on a measure of happiness in the elderly.

multiple regression


A researcher is interested in the degree to which the combination of income, scores on an index of support system adequacy, and scores on an index of overall health is related to the combination of scores on three measures of happiness.

canonical correlation


A personnel department will reject all applicants who do not demonstrate a minimum level of proficiency on five tests of aptitude.

multiple cutoff


A gambler is interested in the correlation between racehorses' finishes in their first and second races.

Spearman's rho
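
For ranked data with no ties, Spearman's rho reduces to 1 - 6*sum(d^2) / (n(n^2 - 1)), where d is the difference between each case's two ranks. The sketch below assumes untied ranks; the finishing positions are made up for illustration.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman's rho for two lists of untied ranks of equal length."""
    n = len(ranks_x)
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

# Hypothetical finishing positions of five horses in two races
first_race = [1, 2, 3, 4, 5]
second_race = [2, 1, 4, 3, 5]
rho = spearman_rho(first_race, second_race)
print(rho)
```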


The term "least squares criterion" describes the principle that underlies
a. calculating a Pearson r correlation coefficient. b. constructing a regression line. c. determining whether multicollinearity in a multiple regression equation is significant d. conducting statistical hypothesis testing with the Pearson r. 
B. The regression line is placed at a location in the scattergram that results in the lowest possible sum of squared deviations of points from the line. This principle is known as the "least squares criterion."
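
The least squares criterion can be sketched as follows: fit the line with the usual closed-form slope and intercept, then confirm that nudging the line in any direction increases the sum of squared deviations. The data points and function names are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_errors(xs, ys, slope, intercept):
    """Sum of squared vertical deviations of the points from the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical scattergram
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = least_squares_line(xs, ys)
best_sse = sum_squared_errors(xs, ys, slope, intercept)
print(slope, intercept, best_sse)
```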


A researcher is interested in the correlation between scores on a standardized intelligence test and elementary school grades. For her research, she has access to students in a local elementary school. To obtain the highest possible correlation coefficient, she would be best advised to use
a. only high scorers on the intelligence test b. only students who score in the middle ranges on the intelligence test c. a random sample of students d. only students who are highly motivated to do their best on the IQ test 
C. A correlation coefficient will be lowered if one uses only a restricted range of scores on any of the variables involved. In other words, it is best to utilize the full range of scores, which can be obtained from a random sample of students.


When using multiple regression, a researcher would be best advised to choose predictors that
a. have a high correlation with each other and a high correlation with the criterion. b. have a low correlation with each other and a low correlation with the criterion. c. have a high correlation with each other and a low correlation with the criterion. d. have a low correlation with each other and a high correlation with the criterion. 
D. In a multiple regression equation, a high correlation between the predictors and the criterion is necessary; otherwise, it would be impossible to use the predictors to estimate scores on the criterion. And low intercorrelations among predictors are desirable, so that the predictors are not providing redundant information.


A researcher is interested in the relationship between three predictors and a criterion. One of the predictors has a correlation of .55 with the criterion. Which of the following statements is true of the multiple correlation coefficient (multiple R) for the relationship between the three predictors and the criterion?
a. The multiple R cannot be lower than .55. b. Due to multicollinearity, it is possible that the multiple R is lower than .55. c. The multiple R will be higher if all the predictors are highly correlated with each other. d. The multiple R is totally unrelated to the correlation between individual predictors and the criterion; thus, it is impossible to have any idea what the multiple R will be. 
A. A multiple correlation coefficient can be no lower than any of the individual correlations between a predictor in the equation and the criterion.


The correlation between psychosis and IQ scores would best be assessed using which of the following correlation coefficients?

B. To measure the correlation between an artificial dichotomy and a variable measured with interval or ratio data, one would use the biserial correlation coefficient.


Test A has a correlation of .60 with Test B and a correlation of .30 with Test C. Test A accounts for ____ as much variability in Test B as it does in Test C.
a. twice b. three times c. four times d. eight times 
To determine how much variability in one measure is explained by variability in another, one squares the correlation coefficient. The square of .60 is .36, and .30 squared is .09. C is therefore correct because .36 is four times greater than .09.
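
The arithmetic can be checked with a couple of lines; the helper name is only illustrative.

```python
def variance_explained(r):
    """Proportion of variability shared by two measures that correlate r."""
    return r ** 2

ratio = variance_explained(0.60) / variance_explained(0.30)
print(round(ratio, 2))
```

Doubling a correlation therefore quadruples the proportion of shared variability.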


Which of the following describes a correlation of 0.0 between "X" and "Y"?
a. The variability of Y scores at each X value is lower than the total variability of Y. b. The variability of Y scores is diff at diff levels of X. c. The variability of Y scores at each X value is equal to the total variability of Y scores d. The variability of Y scores at each X value is approximately the same. 
C. To answer this question, you have to look closely at the wording of each choice and translate each into everyday English. What C is saying is that the range of "Y" at every individual "X" score will be equal to the entire range of "Y." For example, let's say that scores on both "X" and "Y" can range from 1 to 10. Say that people who get a score of 1 on "X" score anywhere from 1 to 10 on "Y." And those who get a score of 2 on "X" score anywhere from 1 to 10 on "Y." And so on, for all values of "X." One's score on "X" doesn't provide any info about "Y," which means that the correlation is 0. If you go through the other choices and try to make sense out of them, Choice A is the converse of choice C and therefore describes a correlation that is greater than 0. Choice B describes heteroscedasticity, and choice D describes homoscedasticity.


According to the central limit theorem,
a. as sample size inc, the shape of the sampling dist of means will approach a normal shape only if the underlying pop dist is normal b. as sample size inc, the shape of a sampling dist of means will assume a normal shape, regardless of the shape of the underlying pop dist c. there will be more variability in a sampling dist of means than there will be in the underlying pop dist d. the mean of the sampling dist of means will always be an underestimate of the actual pop mean. 
B. According to the central limit theorem, the shape of a sampling distribution of means will approach normality as sample size increases. This is true regardless of the shape of the dist of the values in the underlying pop.


The standard deviation of the sampling dist of means is also known as
a. the standard error of estimate b. the standard error of measurement c. the standard error of the mean d. the standard error of the day 
C. This is the definition of the standard error of the mean.


A difference between meta-analysis and a literature review is that
a. meta-analysis involves calculation of an "effect size." b. a lit review is likely to include fewer studies than a meta-analysis c. a lit review has less ecological validity than a meta-analysis d. a meta-analysis involves a review of many diff studies in one topic area 
A. Unlike a traditional lit review, a meta-analysis involves calculation of an effect size. This allows one to estimate the overall effects, across many studies, of a particular tx or independent var.


A one-way ANOVA would be most robust if
a. the shape of the underlying pop data is skewed b. there are many levels of the independent variable c. sample size is small d. sample size is large 
D. A stat test is said to be robust when its results tend to be accurate even in the face of mod violations of its assumptions about the pop data. The larger the sample size, the more robust stat tests tend to be, especially with regard to the normal dist of data assumption.


A researcher conducts a study using a time-series design, consisting of a pretest phase, in which the same test is administered five times; a treatment; and a posttest phase, in which the test is admin five more times. The researcher analyzes his results by conducting a t-test comparing the combined means of the pretest and posttest phases. The main prob with this design is
a. t-tests cannot be used to compare two means b. the t-test will not be powerful enough c. autocorrelation d. there is no good reason to administer the same test five times in each phase of measurement. 
C. Due to autocorrelation, standard parametric tests such as the t-test cannot be used in the analysis of time-series data. Instead, one must use special techniques designed for the purpose of time-series analysis.


In a normal dist of scores, a T-score of 60 is approx equal to a percentile rank of
a. 60 b. 68 c. 84 d. 95 
C. T is a standard score with a mean of 50 and a standard deviation of 10. Thus, a T-score of 60 is equal to 1 s.d. above the mean. In a normal dist, this is equivalent to the 84th percentile.
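
The conversion from a T-score to a percentile rank can be sketched with the standard library's `statistics.NormalDist`, assuming the scores are normally distributed. The function name is hypothetical.

```python
from statistics import NormalDist

def t_score_percentile(t_score):
    """Percentile rank of a T-score (mean 50, SD 10) in a normal distribution."""
    z = (t_score - 50) / 10
    return NormalDist().cdf(z) * 100

print(round(t_score_percentile(60), 1))
print(round(t_score_percentile(50), 1))
```

The same function applies to any T-score; a T-score of 50 sits at the mean and so at the 50th percentile.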


The results of an experiment indicated no significant differences at the .05 level. This means that
a. the null hypothesis is not rejected b. the null hypothesis is rejected c. the alternative hypothesis is accepted d. there was an error in calculations 
A. When the results are not significant, you do not reject the null hypothesis. That is, you cannot conclude that the IV had an effect.


3. Assuming a norm dist, how many people would score between 400 and 600 on a standardized test with a mean of 500 and a standard deviation of 100 (N=1000)?

B. First convert the scores to standard deviation units (i.e., z scores). A score of 400 is equivalent here to -1z, and a score of 600 equals +1z. Then, remember that 68% of cases fall between -1z and +1z in a norm dist. Finally, take 68% of 1,000, which is 680.
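
As a rough check on the 68% rule of thumb, the exact area under the normal curve between 400 and 600 can be computed with `statistics.NormalDist`. The exact proportion is slightly above 68%, so the expected count comes out a little higher than the rounded answer of 680.

```python
from statistics import NormalDist

# Test scores with mean 500 and SD 100, as stated in the question
scores = NormalDist(mu=500, sigma=100)

proportion = scores.cdf(600) - scores.cdf(400)  # area between -1z and +1z
expected_count = proportion * 1000              # N = 1000 examinees
print(round(proportion, 4), round(expected_count))
```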


4. Which of the following correlations is the highest?
a. -.50 b. .05 c. .41 d. .23 
A. When determining which correlation is larger, you ignore the sign and just look for the bigger number.


5. If two variables are positively correlated, this means that
a. as one goes up the other goes down. b. as one goes up the other stays the same c. as one goes up the other goes up d. their means are equal 
C. A positive correlation between two variables means that both move in the same direction


5. In the F ratio, within-group variance, as measured by MSW, reflects
a. variance accounted for by random and irrelevant factors b. the difference between the sample and the population means c. variance due to the effect of the independent variable d. effect of the tx on the pop means 
A. In the F ratio, within-group variance is error variance (in fact, MSW, the index of within-group variance, is sometimes referred to as the "error term"). This means that it measures variability due to irrelevant random factors such as preexisting individual differences between subjects.


7. For a given pop, which of the following score distributions will likely have the least variability?
a. the pop dist b. a dist of a sample of means from the pop c. a dist of a sample of 10 scores from the pop d. a dist of a sample of 20 scores from the pop 
B. A dist of sample means from a population always has less variability than the pop or any one individual sample does. This is illustrated pictorially in the Appendix on Advanced Statistics.
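
A small simulation can illustrate why sample means vary less than raw scores. The sketch below draws a skewed, made-up population, takes repeated samples, and compares the standard deviation of the sample means with the population SD divided by the square root of the sample size. The population, sample size, number of samples, and seed are all arbitrary assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical skewed population of 10,000 scores
population = [random.expovariate(1.0) for _ in range(10_000)]

sample_size = 25
sample_means = [
    statistics.mean(random.sample(population, sample_size))
    for _ in range(2_000)
]

pop_sd = statistics.pstdev(population)
means_sd = statistics.stdev(sample_means)   # empirical standard error
predicted_se = pop_sd / sample_size ** 0.5  # sigma / sqrt(n)
print(round(pop_sd, 3), round(means_sd, 3), round(predicted_se, 3))
```

The empirical SD of the means should land close to the predicted standard error and well below the population SD.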


8. The statement most true of nonparametric tests is that they
a. require data scaled on an interval or ratio basis b. are more powerful than parametric tests c. rely on pop parameters to draw conclusions about sample stats d. are used when one is not sure of the shape of the dist 
D. Unlike parametric tests, the use of nonparametric tests does not require any assumptions about the shape of the pop dist


10. In a study in which a one-way ANOVA is used, the null hypothesis would be that
a. sample variances are equal b. pop variances are equal c. sample means are equal d. pop means are equal 
D. An ANOVA is designed to test the hypothesis that group means were drawn from the same pop; i.e., that means are equal in the pop.


An experimenter is testing the hypothesis that there is no diff between teaching methods in regards to the grades obtained by the students on an arithmetic test. His design calls for two groups: trad teaching method vs programmed self-instruction. He uses a t-test to analyze the data. The results are: Group 1 mean = 86; Group 2 mean = 73. The t value exceeds the tabled critical value at the .01 level for a 2-tailed test. He should
a. accept the null and conclude the alternative hypothesis is false. b. reject the null and conclude the alternative hypothesis is true. c. retain the null and conclude that the alternative hypothesis is true. d. not make any interpretation, since the researcher should have used a one-tailed test. 
B. If the results are significant at the .01 level, then you reject the null and conclude that the alternative is true.


12. If a sample of 400 is taken from a pop, and you find that the mean of this sample on some standard test is 50 and the standard deviation is 10, the standard error of the mean would be
a. 20.0 b. 10.0 c. .50 d. 5.0 
C. To get this one correct, you'll need to know the formula for the standard error of the mean. The standard error of the mean equals the standard deviation of the sample divided by the square root of sample size. In this case you'd take 10 (the standard dev) and divide it by 20, and you'd get the answer of .50.
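
The formula described above translates directly into code; the function name is only illustrative.

```python
import math

def standard_error_of_mean(sample_sd, n):
    """Standard error of the mean: sample SD divided by sqrt of sample size."""
    return sample_sd / math.sqrt(n)

sem = standard_error_of_mean(10, 400)
print(sem)
```

Because the sample size sits under a square root, quadrupling the sample only halves the standard error.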


13. All of the following are true of path analysis, except
a. an a priori path is drawn connecting two or more variables in a causative direction. b. the magnitude of the relationship between variables is determined by their correlation coefficients c. multiple causation can be considered. d. variables are manipulated in order to confirm the direction of the path of causation. 
D. Path analysis is a method designed to determine or confirm causative relationships among variables via correlations. Hence, you wouldn't actually manipulate variables; you'd only measure their degree of relationship.


14. In a normal dist of scores, the number of cases falling between a percentile rank of 11 and 20 will be _____ the number of cases falling between a percentile rank of 41 and 50.

A. The distribution of percentile ranks is, by definition, flat. This means that the same number of scores will fall between equal intervals. In this case, 10% of scores will fall within the ranges identified.


15. The phenomenon whereby an experimenter's expectancies influence subjects' responses on a dependent variable in the direction predicted is known as
a. the Hawthorne effect b. demand char c. the carryover effect d. the Rosenthal effect 
D. It's called the Rosenthal effect because it was first reported by Robert Rosenthal.


16. In a study that includes one group that is tested on an intervally-scaled dependent variable before and after it receives tx, what stat test would be used to compare the obtained means?
a. t-test for single sample b. t-test for correlated samples c. t-test for independent samples d. two-way ANOVA 
B. To compare two means obtained by correlated samples (e.g., the same group) one would use the t-test for correlated samples.


If a study such as the one described had 40 subjects, degrees of freedom would be equal to
a. 19 b. 38 c. 39 d. 78 
C. In the t-test for correlated samples, the degrees of freedom equal N - 1, where N is the number of pairs of scores. Since there are 40 subjects, there will be 40 pairs of before-and-after scores, so df = 39.
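
A minimal sketch of the t-test for correlated samples shows where the degrees of freedom come from: the test is run on one difference score per subject, so df is the number of pairs minus one. The before and after scores below are hypothetical and use five subjects rather than 40 to keep the example short.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, df) for a t-test on correlated (paired) samples."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)                       # number of pairs
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)    # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Hypothetical scores for five subjects tested before and after tx
before = [10, 12, 9, 14, 11]
after = [12, 15, 10, 15, 14]
t_value, df = paired_t(before, after)
print(round(t_value, 3), df)
```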


A onegroup pretest/posttest design is susceptible to many threats to internal validity, including...
a. hx b. maturation c. statistical regression d. all of the above 
D. All of the above: hx, maturation, and statistical regression.


19. Why might the use of a factorial ANOVA be preferred over the use of separate oneway ANOVAs?
a. The use of a factorial ANOVA reduces the prob of making a Type II error. b. A factorial ANOVA allows one to assess for interaction effects c. A factorial ANOVA can be used when the data is interval or ratio. d. A factorial ANOVA can be used when the study involves more than two independent variables. 
B. If you have multiple independent variables, you can use either mult oneway ANOVAs or one factorial ANOVA. An advantage of the latter is that it allows you to measure interaction effects.


20. A mall owner is interested in determining whether shoppers are equally likely to use the east, north, south, and west entrances to the mall. Which of the following stat tests would be most helpful?
a. chi-square b. one-way ANOVA c. factorial ANOVA d. Kruskal-Wallis 
A. In this case, the data will consist of frequency of observations within categories. The chi-square test is used to analyze this kind of frequency data.


21. If the mall owner in the above question sampled 100 customers, the expected frequency in each cell under the null hypothesis would be
a. 20 b. 25 c. 50 d. more than 25 but less than 50. 
B. If the null hypothesis is true, the four entrances are used with equal freq. Thus, if 100 customers are sampled, 25 would be expected to use ea entrance.
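
The expected frequencies and the chi-square statistic for the mall example can be sketched as below. The observed counts are invented for illustration; only the total of 100 customers and the four entrances come from the question.

```python
def chi_square_goodness_of_fit(observed):
    """Chi-square test of equal use across categories.

    Returns (expected frequency per cell, chi-square statistic, df).
    """
    total = sum(observed)
    k = len(observed)
    expected = total / k  # equal frequencies under the null hypothesis
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    return expected, statistic, k - 1

# Hypothetical counts for the east, north, south, and west entrances
observed = [30, 20, 28, 22]
expected, chi_sq, df = chi_square_goodness_of_fit(observed)
print(expected, round(chi_sq, 2), df)
```

With df = 3, the tabled critical value at the .05 level is 7.815, so these made-up counts would not lead to rejecting the null hypothesis.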


A job applicant takes five tests. His performance is considered excellent on four of the tests but slightly inadequate on the fifth. If the procedure known as multiple cutoff were used to make hiring decisions, this company would
a. place the app in a training prog b. retest c. hire d. not hire 
D. When the mult cutoff procedure is used, an examinee must demonstrate the minimum level of proficiency on all the predictors that are administered. He is not selected.


A researcher is int in the assoc between IQ and happiness. He uses mult measures of both of these attributes. What stat analysis is the researcher likely to use?
a. mult regression b. path analysis c. canonical corr d. partial corr 
C. Canonical correlation is the appropriate stat method to correlate multiple predictors with multiple criterion measures.


24. In a study involving three groups, the variability in scores of the groups differs. The robustness of the parametric stat test used to analyze the data from this study would be enhanced if
a. alpha is set at a high level b. the grp with the most variability also has the most subjects c. the three groups have an equal sample size d. the researcher transforms data to equalize the variability of scores 
C. A stat test is said to be robust when its results tend to be accurate even in the face of moderate violations of its assumptions about the pop data. In this case, the homogeneity of variance assumption is violated. When violated, the stat test tends to remain robust as long as the groups' sample sizes are equal.


25. All of the following statements are true of forward stepwise multiple regression analysis, except:
a. the technique is useful in dealing with the prob of redundancy in a set of predictors b. the technique allows a researcher to add predictors one at a time until the ideal set of predictors is determined. c. the use of the procedure involves "ordering" predictors based on how much each predictor increases the multiple correlation coefficient d. the predictor with the lowest correlation with the criterion is the first one retained for the final mult regression equation. 
D. Forward stepwise regression is a technique that allows a researcher to choose a smaller set of predictors out of a larger set. When the technique is used, the predictor with the highest correlation with the criterion is the first one retained for the final equation. Choices A, B, and C are true statements about forward stepwise regression.


26. Bayes' theorem is associated with
a. sample size and inferential stats b. conditional prob and base rates c. the normality assumption in the central limit theorem d. meta-analysis and effect sizes 
B. Bayes' theorem is used to revise conditional probabilities based on base rates.
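
To make the role of base rates concrete, the sketch below applies Bayes' theorem to a hypothetical screening test. The base rate, hit rate, and false positive rate are invented numbers.

```python
def posterior_probability(base_rate, hit_rate, false_positive_rate):
    """P(condition | positive result) via Bayes' theorem."""
    true_positives = hit_rate * base_rate
    false_positives = false_positive_rate * (1 - base_rate)
    return true_positives / (true_positives + false_positives)

# Hypothetical: 1% base rate, 90% hit rate, 10% false positive rate
p = posterior_probability(0.01, 0.90, 0.10)
print(round(p, 3))
```

Even with a fairly accurate test, a low base rate keeps the probability of the condition given a positive result low, because false positives from the large unaffected group outnumber true positives.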


27. A T-score of 70 corresponds to
a. the 70th percentile b. the 90th percentile c. the 98th percentile d. 3 standard deviations above the mean 
C. A T-score of 70 is two standard deviations above the mean (the mean of a T-score distribution is 50; standard deviation is 10). When any score is two standard deviations above the mean, 98 percent of the dist is below that score. In this case, 98 percent of the scores are below a T-score of 70; in other words, it is at the 98th percentile.


1. A psychological test can be defined as a(n) _____ and _____ measure of behavior.

objective
standardized 

The process of _____ involves ensuring uniformity of administration and scoring of the test. This process includes obtaining ______, which represent the scores of a larger representative sample of the pop for which the test is intended.

standardization
norms

Interpreting a test score by comparing it to _____ allows us to determine how a given score compares to those of others of the same pop who have taken the test.

norms


A good test will be ______, which means that it will provide repeatable, consistent results. It will also be _____, which means that it will measure what it purports to measure.

reliable; valid


A(n) _____ test is one in which the examinee's response rate is assessed. A(n) _____ test is one that assesses the level of difficulty an examinee can attain.

speed; power


A(n) _____ test uses the examinee himself as the frame of reference in score interpretation. It indicates which attributes are weakest and strongest within the individual.

ipsative


A(n) ____ effect occurs when a test is unusually difficult, and many test-takers score at or near the bottom of the scale.

floor


The defining characteristic of an objective test is
a. the existence of norms b. a standardized set of scoring and administration procedures c. examiner discretion in scoring and interpreting items. d. reliability and validity 
B. An objective test is one that is independent of the subjective judgment of the particular examiner. This means that administration and scoring procedures are uniform, or the same for all examiners.


A test developer administers an intelligence test to a group of examinees on Oct. 1st and then administers the same test to the same group of examinees on Nov. 1st. Most likely the examiner is interested in
a. assessing the test's reliability. b. assessing the test's validity. c. determining whether or not the test is vulnerable to the effects of response sets d. double billing his funding source. 
A. A test is reliable if it provides repeatable, consistent results. Giving the same test to the same group of examinees at diff points in time is one way to assess a test's reliability.


A drawback of norm-referenced interpretation is that
a. a person's performance is compared to the performance of other examinees b. it does not permit comparisons of individual examinees' scores on diff tests c. it does not indicate where the examinee stands in relation to others of the same pop d. it does not provide absolute standards of performance. 
D. Norm-referenced interpretation involves comparing an examinee's score to the scores of others who have taken the same test. A drawback of this type of interpretation is that it does not provide absolute standards of good or poor performance; the examinee's score must be interpreted in light of the performance of the norm group as a whole.


a. According to classical test theory, an examinee's obtained test score consists of two components: ______, or the portion of variability among examinees that is due to whatever attribute is being measured by the test, and ____, or the portion of variance due to factors that are irrelevant to whatever is being measured.

truth (or true score variance)
error (or measurement error, or error variance) 

______ by definition, is _________, which means that it is due to factors that affect different examinees in different ways.

error; random


If a test is ______, it will be free from ______ and yield information about examinees' _____.

reliable
error; true scores

2. The reliability coefficient, unlike other correlation coefficients, is interpreted ______. This means, e.g., that for a test with a reliability coefficient of .70, _____% of observed score variance is true variance. In other words, unlike with other correlation coefficients, you never ______ the reliability coefficient in order to interpret it.

directly
70; square

3. Obtaining a(n) ____ reliability coefficient involves administering the same test to the same group of people, and then correlating scores on the first and second administrations.

test-retest


The sources of measurement error for this type of reliability include factors related to _____.

the passage of time


This coefficient is not appropriate to use for tests that measure ______ and those on which scores are affected by ______.

unstable attributes
repeated administration 

4. Obtaining a(n) ______ reliability coefficient involves administering two forms of a test to the same group of examinees, and then obtaining the correlation between the two sets of scores. Sources of measurement error for this reliability coefficient usually include factors related to both _____ and different _____ on the two forms.

alternate forms
the passage of time; content

5. There are a number of measures of _____ reliability, all of which indicate the magnitude of correlation among individual items.

internal consistency


For instance, obtaining a(n) ____ reliability coefficient involves dividing a test in two and obtaining a correlation between the halves as if they were two shorter tests. When this coefficient is used, the ____ is usually used to correct for the effects of shortening the test on the reliability coefficient. There are also two measures of the average degree of inter-item consistency: the ________, which is used when items are dichotomously scored, and ______, which is used when items are not dichotomously scored. All three of these coefficients are inappropriate for _____.

split-half
Spearman-Brown formula; Kuder-Richardson Formula; coefficient alpha; speed tests 
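
The Spearman-Brown correction can be sketched with its general form, r_new = k*r / (1 + (k - 1)*r), where k is the factor by which the test is lengthened (k = 2 when stepping a half-test correlation up to full length). The function name and the half-test correlation of .60 are hypothetical.

```python
def spearman_brown(r, length_factor=2):
    """Estimated reliability after changing test length by length_factor."""
    return (length_factor * r) / (1 + (length_factor - 1) * r)

# Hypothetical correlation between the two halves of a test
full_length_reliability = spearman_brown(0.60)
print(round(full_length_reliability, 2))
```

The corrected value is higher than the raw half-test correlation, reflecting the fact that longer tests tend to be more reliable.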

The _____ is used to construct confidence intervals that indicate the range in which an examinee's _____ test score is likely to fall, given his _____.

standard error of measurement
true; obtained test score

For example, there is a _____% probability that the examinee's _____ score lies within one ______ of the _______ score, and a ____% probability that the examinee's _____ score lies within approx two _____ of the _____ scores.

68
true; standard error of measurement; obtained; 95; true; standard errors of measurement; obtained

All other things being equal, a short test will have a(n) ____ reliability coefficient than a longer test, a fill-in-the-blank test will have a(n) _____ reliability coefficient than a true/false test, and a very easy test will have a(n) ______ reliability coefficient than a moderately difficult test.

lower;
higher; lower 

1. You would not use the Kuder-Richardson Formula 20 to assess the reliability of a
a. test that is dichotomously scored b. test that measures an unstable attribute c. speed test d. psychological test 
C. Internal consistency reliability coefficients (e.g., KR-20, coefficient alpha, split-half) should not be used to assess the reliability of speed tests. This is because on a speed test, all attempted items are expected to be answered correctly; thus, any coefficient of internal consistency will yield a spuriously high estimate of the test's reliability.


One way to improve the interrater reliability of a bx observation scale would be to use
a. mutually exclusive rating categories b. nonexhaustive rating categories c. highly valid rating categories d. empirically derived rating categories 
A. Interrater reliability is strengthened when mutually exclusive and exhaustive rating categories are used. This means that categories are clearly enough defined so that no bx will belong under overlapping categories (mutually exclusive), and that all observed behaviors can be placed into a category (exhaustive).


The standard error of measurement is
a. inversely related to the reliability coefficient and inversely related to the standard deviation of test scores. b. positively related to the reliability coefficient and positively related to the standard deviation of test scores c. positively related to the reliability coefficient and inversely related to the standard deviation of test scores d. inversely related to the reliability coefficient and positively related to the standard deviation of test scores 
D. This means that the standard error of measurement increases as reliability decreases and the standard deviation increases. This can be seen from the formula for the standard error of measurement.
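
The formula itself is not reproduced on the card; the sketch below assumes the standard one, SEM = SD * sqrt(1 - reliability), and uses hypothetical values to show both relationships.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD of test scores times sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical test with SD = 15
print(round(standard_error_of_measurement(15, 0.91), 2))  # higher reliability
print(round(standard_error_of_measurement(15, 0.75), 2))  # lower reliability
print(round(standard_error_of_measurement(30, 0.91), 2))  # larger SD
```

Lowering the reliability or raising the standard deviation both make the result larger, and a perfectly reliable test would have a standard error of measurement of zero.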


When practical, it is most advisable to use a(n)
a. alternate-forms reliability coefficient b. test-retest reliability coefficient c. internal consistency reliability coefficient d. interscorer reliability coefficient 
A. Although this opinion is not universally shared, it is what many experts believe. The words "when practical" are a good clue, since it is often very impractical to obtain an alternate forms reliability coefficient.


According to classical test theory, an observed test score reflects
a. true score variance plus systematic error variance b. true score variance plus random error variance c. true score variance plus random and systematic error variance d. true score variance only 
B. According to classical test theory, a given test score reflects both "truth" (whatever is being measured by the test) and measurement error (factors that are irrelevant to whatever the test is measuring). Measurement error, which occurs because no test is perfectly reliable, is random by definition.


Which of the following methods of recording bx is most useful when the target bx has no fixed beginning or end?
a. interval b. continuous c. frequency d. duration 
A. In interval recording, a rater records whether or not an individual is engaging in a target bx during a given interval. During this interval, the rater only has to decide if the behavior is occurring, not when it begins or when it ends. This is why interval recording is the best method of recording behaviors that have no fixed beginning or end.


A test has content validity if it __________.

adequately samples the content domain it is supposed to measure knowledge of


Content validity is a concern when ___________ tests are being developed.

educational (or achievement, or work sample)


To determine if a test has content validity, we rely primarily on ________.

expert judgment


If a test has criterion-related validity, there would be a high ______ between the _____ and the _____.

correlation; predictor; criterion.


A(n) ______ measure is a direct and independent measure of that which the predictor test is designed to predict; it can be thought of as that which is being predicted. For example, if an industrial psychologist were interested in using scores on an aptitude test to predict job performance, the aptitude test would be the _______ and a measure of job performance would be the _______.

criterion; predictor; criterion


When _________ validation procedures are used to validate a predictor test, predictor and criterion data are collected at or about the same time.

Concurrent


When _______ validation procedures are used, predictor data is collected first, and criterion data are collected at a future point.

predictive


The former type of validation is more appropriate for predictors that measure _____; the latter type is more appropriate for tests designed to measure _______.

current status on a criterion
future status on a criterion 

Since ______ validation is less costly than _______ validation, the former is often used as a substitute for the latter.

concurrent
predictive 

The ______ is a statistic used to construct a range in which an examinee's ______ criterion score is likely to fall, given his or her ________ criterion score.

standard error of estimate
actual (or true) predicted 

Say a person takes a short aptitude test that is being used as a predictor of IQ score. Say that on the basis of his score on the aptitude test, his IQ score is predicted to be 100. If the ____ were equal to 5, there would be a 68% probability that his ____ intelligence is between ____ and ____.

standard error of estimate
actual; 95; 105

And there would be about a 95% probability that his _____ intelligence is between _____ and _____.

actual
90; 110
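
The two ranges above can be checked with a short calculation. This is only an illustrative sketch: the helper name prediction_interval is made up for this example, and it assumes prediction errors are normally distributed, so that about 68% of actual scores fall within one standard error of estimate of the predicted score and about 95% fall within two.

```python
def prediction_interval(predicted, see, n_se=1):
    """Return (low, high): a band of n_se standard errors of estimate
    around a predicted criterion score (hypothetical helper)."""
    return predicted - n_se * see, predicted + n_se * see

# Values from the example: predicted IQ = 100, standard error of estimate = 5
band_68 = prediction_interval(100, 5, n_se=1)  # roughly 68% band
band_95 = prediction_interval(100, 5, n_se=2)  # roughly 95% band
print(band_68, band_95)
```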

Often, a predictor is used for classification purposes, i.e., to predict to which of two _____ groups a person belongs. When this is the case, the predictor is administered to examinees, and those scoring above the predictor ____ would be expected to score above the _____ on the criterion.

criterion
cutoff cutoff 

For example, a job selection test might be used to predict whether or not a person will be successful at a particular occupation. Individuals who are predicted to be successful by the test and in fact do turn out to be successful would be called ______.

true positives


And those whom the predictor correctly identified as unsuccessful would be called ______.

true negatives


Those who are classified by the test into the unsuccessful group but turn out to be successful on the job would be called ______.

false negatives


And finally, those whom the test predicts to be successful but in fact turn out to be unsuccessful would be called _____.

false positives


A validity coefficient would be lowered if there was a(n)____ range of scores on either the ____ or the ______.

restricted
predictor criterion 

7. After constructing & validating a test, a test developer will likely want to revalidate it using a second sample of individuals. This process is referred to as ______. In such cases, the validity coefficient obtained on the second sample is likely to be ______ than the one obtained from the first sample. This phenomenon is known as _____.

cross-validation; lower; shrinkage


8. A test has ______ when its validity coefficient for one subgroup is higher than its coefficient for another subgroup. For ex, an IQ test may be a valid predictor of job performance for whites, but a completely invalid predictor of performance for blacks. In this ex, race would be said to be acting as a(n)________.

differential validity;
moderator variable 

Construct validity is a concern in developing tests that measure ______.

hypothetical constructs or traits


Two types of construct validity are the following 1) ________ validity, which is present when a test has a(n) _______ correlation with another test that measures the same trait, and 2) _____ validity, which is present when a test has a(n) ______ correlation with another test that measures a different trait.

convergent
high discriminant (or divergent) low 

A(n) ______ matrix provides a method of assessing the construct validity of two or more tests. On this matrix, if the ______ coefficient (the correlation between two tests which measure the same construct using different methods) is ______, evidence of ______ validity is provided.

multitrait-multimethod
monotrait-heteromethod; high; convergent

And if the ______ coefficient (the correlation between two tests using the same method to measure different constructs) is _____, evidence of _____ validity is provided.

heterotrait-monomethod
low; discriminant (or divergent)

______ is a procedure designed to determine the degree to which a large set of variables or tests are measuring the same underlying construct or constructs.

Factor analysis


The procedure yields a(n) _____, which indicates each test's correlation with each factor identified in the analysis (a correlation between a test and a factor is referred to as a(n) _____).

factor matrix
factor loading 

To facilitate interpretation of a factor analysis, a(n) ______ is usually performed, and there are two types: _____ and ______.

rotation
orthogonal oblique 

When a(n) ______ is conducted, uncorrelated factors are derived, and when a(n) _____ is conducted, correlated factors are derived.

orthogonal rotation
oblique rotation 

If, in a factor analysis, factors are ______, the ______ of a test can be obtained by squaring and summing the ______.

orthogonal; communality
factor loadings 

For example, imagine a factor analysis of six tests which yielded two significant factors. Imagine that Test A has a .60 correlation with Factor I and a .20 correlation with Factor II. By squaring and summing these ______ (assuming the rotation is ______), we can determine that the communality of Test A is ____. This means that _____% of the variability in Test A is explained by _______.

factor loadings
orthogonal; .40; 40; the two factors
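
The arithmetic behind the Test A example can be sketched as follows. The variable names are chosen for this illustration only, and the calculation is valid only under the stated assumption that the factors are orthogonal (uncorrelated).

```python
# Factor loadings of Test A on Factor I and Factor II (from the example)
loadings = [0.60, 0.20]

# Communality = sum of squared loadings (orthogonal factors assumed)
communality = sum(loading ** 2 for loading in loadings)

print(round(communality, 2))          # proportion of variance explained
print(round(communality * 100), "%")  # same value as a percentage
```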

If a test is highly reliable it (will be/may be/will not be) valid. If a test is very valid, it (will be/may be/will not be) reliable.

may be
will be 

In other words, reliability is a(n) ______ but not a(n) ______ condition for validity to be present. The ____ formula would be used to determine how ____ a test would be if it had _____ reliability.

necessary
sufficient correction for attenuation valid perfect 

1. Which of the following is the lowest validity coefficient?
a. .80 b. .50 c. .10 d. .15
C. Like any other correlation coefficient, the magnitude of the validity coefficient is determined by its absolute value rather than its direction (i.e., positive or negative). To answer this question, look at the numbers and ignore any negative signs. Since .10 is the lowest number, it is, of the choices listed, the lowest validity coefficient.


2. If an industrial psychologist were concerned about reducing the number of false positives yielded by a job selection test, he could
a. raise the predictor cutoff score and/or raise the criterion cutoff score. b. raise the predictor cutoff score and/or lower the criterion cutoff score. c. lower the predictor cutoff score and/or lower the criterion cutoff score. d. lower the predictor cutoff score and/or raise the criterion cutoff score.
B. False positives can be reduced by raising the predictor cutoff score. If the selection test becomes more difficult to succeed on, there will be fewer individuals who "pass," and those who do pass are less likely to be unqualified. Lowering the criterion cutoff score will also result in fewer false positives. Lowering the criterion cutoff is equivalent to relaxing the definition of acceptable performance. This means that it will be easier to be considered adequate; therefore, those who do "pass" the selection test will be more likely to be able to meet this easier criterion standard.


5. Some would argue that, in conducting a factor analysis, an oblique rotation is usually preferable to an orthogonal rotation because
a. few factors are uncorrelated b. most factor analyses identify distinct and unrelated traits c. oblique rotations involve simpler calculational procedures d. oblique rotations are more fun 
A. By definition, an oblique rotation produces correlated factors. In other words, if you believe that the traits represented by the factors are correlated, it makes theoretical sense to use an oblique rotation. And if you believe that few factors or traits are ever uncorrelated, you might argue that oblique rotations should always be used.


If a test has a reliability coefficient of .90, we can conclude that
a. the highest validity coefficient the test could have is .81. b. its validity coefficient is equal to the square root of .90. c. the test is probably very valid. d. the test may or may not be valid.
D. Knowing that the test's reliability coefficient is .90 tells us that the upper limit of the validity coefficient is the square root of .90 (not .81, which is the square of .90). This means that the test's validity is lower than or equal to the square root of .90. The test may be highly valid, moderately valid, or completely invalid.
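
The ceiling described above can be computed directly. This is a minimal sketch; the function name max_validity is a hypothetical helper, not a standard library call.

```python
import math

def max_validity(reliability):
    """Upper limit of the validity coefficient: the square root of the
    reliability coefficient (hypothetical helper name)."""
    return math.sqrt(reliability)

ceiling = max_validity(0.90)
print(round(ceiling, 2))    # upper limit on validity for reliability = .90
print(round(0.90 ** 2, 2))  # the square of .90, the distractor in choice a
```

Note that the result is only a ceiling: it says how high validity could be, not how high it actually is.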


If a test's validity coefficient were 1.0, the standard error of estimate would be equal to
a. 0.0 b. 1.0 c. the standard deviation of criterion scores d. cannot be determined 
A. This makes sense. If a test has perfect validity (a validity coefficient of 1.0 or -1.0), there is no error of estimate, or no error when the test is used to predict scores on a criterion measure. The answer can also be derived through the formula for the standard error of estimate. Using this formula, you can see that a validity coefficient of 1.0 will always result in a standard error of estimate of 0.
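
The formula mentioned above is commonly written as SEE = SD_y * sqrt(1 - r^2), where SD_y is the standard deviation of criterion scores and r is the validity coefficient. The sketch below applies it; the function name and the SD of 15 are illustrative assumptions, not values from these notes.

```python
import math

def standard_error_of_estimate(sd_criterion, validity):
    """SEE = SD of criterion scores * sqrt(1 - validity squared)."""
    return sd_criterion * math.sqrt(1 - validity ** 2)

sd_y = 15  # hypothetical criterion standard deviation

print(standard_error_of_estimate(sd_y, 1.0))   # perfect validity
print(standard_error_of_estimate(sd_y, -1.0))  # perfect negative validity
print(standard_error_of_estimate(sd_y, 0.0))   # no validity: SEE equals SD_y
```

The last line shows the other extreme: with a validity coefficient of 0, the standard error of estimate equals the standard deviation of criterion scores.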


9. Criterion contamination has the effect of
a. increasing the validity coefficient b. decreasing the validity coefficient c. increasing examinees' criterion scores d. decreasing examinees' criterion scores
A. Criterion contamination occurs when raters assigning criterion scores have knowledge of the ratees' predictor scores, and their knowledge affects scores on the criterion. If a supervisor knows that an employee got a low score on a predictor, he might rate the employee lower on the criterion than he normally would have. This results in an artificially high consistency between predictor and criterion scores and inflates the validity coefficient.


10. Following a principal components analysis of a set of variables, four eigenvectors, symbolized in order as v1, v2, v3, and v4, are derived. Which of the following statements is true?
a. v1 will account for more variance in the variables than any of the other eigenvectors. b. v1 will account for more variance in the variables than all the other eigenvectors combined. c. v1, v2, v3, and v4 will each account for the same amount of variance in the variables. d. v4 will account for more variance in the variables than any other eigenvector.
A. In a principal components analysis, eigenvectors (which are also called factors or principal components) represent underlying traits or constructs that are being measured by some or all of the variables being analyzed. In principal components analysis, the first factor accounts for a higher percentage of variance than any of the other factors. This just means that the variables in the analysis measure the first factor more than they measure any of the other factors.


By definition, an oblique rotation produces ________ factors.

Correlated


If you believe that the traits represented by the factors are correlated, it makes sense to use an ________.

oblique rotation


If you believe that few factors or traits are ever uncorrelated, you might argue that _______ ______ should always be used.

oblique rotations


A ________ _______ is an examinee who is identified by a predictor as not meeting a criterion but, in reality, does meet it.

false negative


Usually, when a test is developed, a large pool of _____ are written, and _____ is used to determine which items will be retained for the final version of the test.

items
item analysis 

A test item's difficulty level (p) is equal to the ________.

percentage of examinees who answer the item correctly.


On most tests, the optimal average difficulty level is ______; this level is associated with maximum ____ and ______.

.50
reliability differentiation or discriminability (score variability would also be correct) 

However, the optimal difficulty level depends on _______. For example, if the test is designed to select only a few highly qualified individuals, one should set the average p value at a relatively (high/low) level. It's imp to remember that the higher the p value, the (more difficult/less difficult) the item.

purpose of testing (or the probability that items can be guessed)
low less difficult 

A test item's discrimination refers to the degree to which the item ______.

differentiates among examinees in terms of what the test measures;


One way to assess an item's discrimination is to correlate each item with either ______ or _______.

the total test score
an external criterion 

An item's discrimination index (D) is equal to the percentage of _____ (U) minus the percentage of ________ (L); a value of ______ represents maximum discriminability.

examinees in the high-scoring group
examinees in the low-scoring group; +100 or -100
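
The D index described above is a simple difference of two percentages. In the sketch below, the function name and the group counts are hypothetical, chosen only to show the two extremes and one intermediate case.

```python
def discrimination_index(upper_correct, upper_n, lower_correct, lower_n):
    """D = percentage correct in the high-scoring group (U)
    minus percentage correct in the low-scoring group (L)."""
    u = 100 * upper_correct / upper_n
    l = 100 * lower_correct / lower_n
    return u - l

# Hypothetical groups of 20 examinees each
print(discrimination_index(20, 20, 0, 20))   # only high scorers pass the item
print(discrimination_index(0, 20, 20, 20))   # only low scorers pass the item
print(discrimination_index(15, 20, 5, 20))   # a more typical item
```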

Higher levels of discrimination are associated with _____ levels of difficulty.

moderate


The item difficulty level associated with the maximum level of differentiation among examinees is
a. .10 b. .50 c. .75 d. 1.0
b. An item is most likely to differentiate among examinees (e.g., between high and low scorers) when half the examinees answer it correctly and half answer it incorrectly. This corresponds to an item difficulty level (p) of .50.


The optimal average item difficulty level for a true-false test would be
a. .10 b. .50 c. .75 d. 1.0
C. For most tests, the optimal item difficulty level is .50. However, the optimal difficulty level is affected by the probability that examinees can select the correct answer by chance alone. When considering the effects of chance, the rule of thumb is that the average difficulty level of test items should be about halfway between 1.0 and the level of success expected by chance alone. On a true-false test, the probability of getting an item correct by chance alone is 50%. Therefore, the optimal item difficulty level would be midway between 50% and 100%, or 75%.
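
The rule of thumb above can be written as a one-line calculation. This is a sketch under the assumption that chance success equals 1 divided by the number of answer choices; the function name is made up for illustration.

```python
def optimal_difficulty(n_choices):
    """Midpoint between 1.0 and the chance success rate (1 / n_choices)."""
    chance = 1 / n_choices
    return (1.0 + chance) / 2

print(optimal_difficulty(2))  # true-false items
print(optimal_difficulty(4))  # four-option multiple-choice items
```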


A test item's difficulty level is most affected by
a. the test's length b. the test's validity c. the nature of the testing process d. the characteristics of the individuals taking the test.
D. A test item's difficulty is measured in terms of the percentage of examinees who answer the item correctly. Therefore, the characteristics of the individuals taking the test will influence the observed difficulty level. For ex., if all examinees taking an intelligence test are highly gifted, the difficulty index (p) for items will be inflated. That is, test items will be estimated to be easier than they actually are.


Which of the following statements is least true of item response theory?
a. It is based on the notion that items analyzed measure a latent trait such as cognitive ability. b. It allows for the ability levels of diff groups of people to be compared, even if the groups are tested using a diff set of items. c. It applies best to large samples. d. It is based on the notion that the characteristics of an item will be different depending on the characteristics of the sample of individuals tested.
d. One assumption of item response theory is that item parameters (characteristics of items such as diff level and discrimination) will be the same regardless of the sample of individuals taking the test. The other statements are true of item response theory.


An examinee's _______ test score is not that meaningful unless a frame of reference is provided for score interpretation.

raw


Two types of scores which provide this frame of reference are _______ scores, which provide a comparison of an examinee's score to that of others who have taken the same test, and ______ scores, which provide a comparison to an external, preestablished standard of performance.

norm-referenced;
criterion-referenced

2. Norm-referenced scores include ______ scores, which indicate how far along the normal path of development an examinee is.

developmental


A(n) _____ IQ score is an example of such a score.

ratio


They also include within-group norms such as _______, which indicate the percentage of scores that fall below a given raw score, and _______, which indicate where a given score stands, in standard deviation units, in relation to the mean.

percentile ranks
standard scores 

There are a number of different types of ______, including z-scores, _______, _______, and ______, all of which provide essentially the same information.

standard scores
T-scores; stanines; deviation IQ scores

1. Percentile ranks and T-scores have which of the following in common?
a. They are both standard scores. b. They are both norm-referenced scores. c. They are both developmental scores. d. They are both criterion-referenced scores.
B. Percentile ranks and T-scores are both norm-referenced scores; that is, they are both interpreted in terms of a comparison to the scores of those in a normative group. A T-score (but not a percentile rank) is also a standard score, which is a norm-referenced score that is interpreted in terms of distance, in standard deviation units, from the mean of a normative group.


You work as an assistant for a psychology professor at a university and have administered and scored a midterm exam for him. You report students' scores on the exam as z-scores. The prof tells you that "This makes no sense; I'm used to the MMPI and I can only understand T-scores." In this case, you should explain to the prof that
a. The test will have to be readministered so that T-scores can be obtained. b. converting T-scores to z-scores will require complex calculational procedures c. z-scores and T-scores yield essentially the same information d. Anyone who can't understand z-scores......
C. T-scores and z-scores are both standard scores, which means they are both interpreted in terms of distance from the mean in standard deviation units. For example, a z-score of 1.0 and a T-score of 60 are equivalent: they both indicate that the score is one standard deviation unit above the mean.


3. The formula "(X - M)/s.d." is the formula for a
a. standard score b. percentile rank c. criterion-referenced score d. T-score
A. This is the formula for a z-score, which is a type of standard score.


4. The advantage of a deviation IQ score, as compared to a ratio IQ score is that it
a. provides an index of an examinee's absolute level of intelligence b. indicates an examinee's mental age c. allows scores of individuals who are the same age to be compared d. allows score comparisons to be made across age levels
D. A deviation IQ score is a standard score, which means that it tells you how many standard deviation units an examinee's score falls above or below the mean. An advantage of a standard score is that scores of individuals from different populations (and on diff tests) can be compared. A 9 year old's deviation IQ score can be meaningfully compared to that of a 30 year old.


1. Decreasing a test's inter-item consistency makes a test
a. less valid b. less reliable c. more valid d. more reliable
B. One measure of reliability of a test is how homogeneous or internally consistent items are (as measured by coefficient alpha or the Kuder-Richardson Formula). Therefore, decreasing inter-item consistency makes a test less reliable.


4. In the validation of a selection test for graduate school entrance, the highest validity would be shown if scores on the test were correlated with actual school grades of
a. the lowest scorers b. only the middle range of scorers c. all those admitted d. only the highest scorers
C. Any correlation coefficient, including a validity coefficient, will be lowered when there is a restriction in the range of scores in one or both variables. Choice C would provide the widest range of scores and give the highest validity coefficient.


5. Which of the following statements is true about the relationship between reliability and validity?
a. A valid test will always be reliable b. A reliable test will always be valid c. The validity coefficient sets a ceiling on the reliability coefficient d. The validity coefficient is equal to the square root of the reliability coefficient
A. Only a is true: for a test to be valid, it must be reliable; however, the opposite is not true, since a test can be reliable without being valid. The reliability coefficient sets a ceiling on the validity coefficient. The upper limit of the validity coefficient is equal to the square root of the reliability coefficient.


6. If you want to determine the degree to which an obtained test score is likely to deviate from the true test score, you would use
a. the standard error of estimate b. the standard error of measurement c. the standard error of the mean d. the standard error of judgment
B. According to classical test theory, any given score on a test reflects both true score variance (the actual char being measured by the test) and random error. In other words, an obtained test score is likely to differ from the true test score to a degree that depends on how much error the test contains. The standard error of measurement is used to construct a range in which an examinee's true test score is likely to fall, given her obtained test score.
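
A range of this kind can be sketched with the usual textbook formula, SEM = SD * sqrt(1 - reliability). The formula is not stated in these notes, and the SD, reliability, and obtained score below are hypothetical values picked to give round numbers.

```python
import math

def standard_error_of_measurement(sd_test, reliability):
    """SEM = SD of test scores * sqrt(1 - reliability coefficient)."""
    return sd_test * math.sqrt(1 - reliability)

# Hypothetical test: SD = 15, reliability = .91, obtained score = 100
sem = standard_error_of_measurement(15, 0.91)
obtained = 100
band = (obtained - sem, obtained + sem)  # roughly a 68% band for the true score

print(round(sem, 2), tuple(round(x, 2) for x in band))
```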


7. If a validity coefficient is equal to 0.0, the standard error of estimate will be equal to
a. 0.0 b. the validity coefficient c. the standard deviation of predictor test scores d. the standard deviation of criterion scores
D. You can answer this question by using the formula for the standard error of estimate. Using this formula, you can see that if the validity coefficient is 0, the standard error of estimate comes out to be equal to the standard deviation of criterion scores.


8. Which of the following is the highest score?
a. z-score = 2.0 b. T-score = 75 c. WAIS score = 120 d. stanine score = 7
B. To answer this question, you need to convert each choice to a common metric. Since all choices are standard scores, they can easily be converted to z-scores, which directly indicate how many standard deviation units above or below the mean the score falls. Since T-scores have a mean of 50 and a standard deviation of 10, a T-score of 75 is equivalent to a z-score of 2.5. WAIS scores have a mean of 100 and a standard deviation of 15; therefore a score of 120 is equivalent to a z-score of just over 1.0 (1.33 to be exact). And a stanine has a mean of 5 and a standard deviation of about 2; thus a stanine of 7 is equivalent to a z-score of about 1.0. In other words, a T-score of 75 represents the highest score.
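
The conversion to a common metric can be sketched as below, using the means and standard deviations given in the explanation. The SCALES table and to_z helper are names invented for this example, and the stanine SD is the approximate value of 2 used above.

```python
# (mean, standard deviation) for each score type, as stated in the explanation
SCALES = {
    "z": (0, 1),
    "T": (50, 10),
    "WAIS": (100, 15),
    "stanine": (5, 2),  # SD is approximate
}

def to_z(score, scale):
    """Convert a standard score to a z-score: (X - M) / s.d."""
    mean, sd = SCALES[scale]
    return (score - mean) / sd

choices = [(2.0, "z"), (75, "T"), (120, "WAIS"), (7, "stanine")]
for score, scale in choices:
    print(scale, score, round(to_z(score, scale), 2))

highest = max(choices, key=lambda c: to_z(*c))
print("highest:", highest)
```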


9. If as a test developer, you wish to maximize a test's reliability coefficient, you would set the test's average item difficulty level at _____. On the other hand, if you were interested in developing a test to select only very highly qualified individuals for a job, the optimal average item difficulty level would be around _____.
a. .50; .15 b. .50; .80 c. .25; .80 d. .75; .25 
A. If the average item difficulty level is moderate (around .50, which means that, on the average, half the examinees answer the items correctly), score variability and therefore reliability will be increased. However, if the goal of testing is to have only highly qualified individuals pass, one would want to make the questions fairly difficult. An item difficulty level of .15 means that only 15% of the examinees answer items correctly. This would be an appropriate average difficulty level if one were attempting to select only very qualified individuals.


10. A multitrait-multimethod matrix would most likely be used to assess
a. concurrent and predictive validity b. content validity c. face validity d. discriminant and convergent validity
D. Constructing a multitrait-multimethod matrix allows one to assess the convergent and discriminant validity of two or more tests. Convergent and discriminant validity are types of construct validity.


11. A test developer conducts a thorough job analysis as part of the process of developing a work sample test that will be used as a selection tool. The job analysis reflects the developer's concern with
a. content validity b. criterion-related validity c. construct validity d. face validity
A. A job analysis is conducted to determine what tasks a job consists of and what types of skills are needed to perform the job well. Job analyses are sometimes used as part of the process of constructing work sample tests, which must have good content validity, i.e., they must provide a representative sample of the tasks involved on a job.


The diff between coefficient alpha and the Kuder-Richardson Formula 20 is that
a. coefficient alpha is an internal consistency reliability coefficient b. KR-20 provides an index of inter-item consistency c. KR-20 is used when items are scored dichotomously d. coefficient alpha is the probability of making a Type I error
C. Both KR-20 and coefficient alpha are internal consistency reliability coefficients, and both provide an index of a test's average degree of inter-item consistency. KR-20 is used when the test is dichotomously scored (e.g., yes/no, right/wrong), whereas coefficient alpha is used when the test is not dichotomously scored.
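
For dichotomous items, KR-20 can be computed with the commonly cited formula (k / (k - 1)) * (1 - sum(p * q) / total score variance). The sketch below is illustrative only: the response data are made up, and it assumes the population variance of total scores is used (some texts use the sample variance instead, which changes the result slightly).

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 for a list of examinees, each a list of 0/1 item scores."""
    n_items = len(responses[0])
    n_people = len(responses)
    # p = proportion answering each item correctly, q = 1 - p
    sum_pq = 0.0
    for item in range(n_items):
        p = sum(person[item] for person in responses) / n_people
        sum_pq += p * (1 - p)
    # Population variance of total scores (an assumption of this sketch)
    total_variance = pvariance([sum(person) for person in responses])
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_variance)

# Hypothetical right/wrong data: 5 examinees, 4 items
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 2))
```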


13. The test-retest reliability coefficient would be inappropriate for all of the following, except
a. a test of day to day fluctuations in mood b. a short math test on which examinees are able to improve through practice c. a test designed to screen for Brief Psychotic Disorder d. a speeded test
D. A test-retest coefficient should not be used to assess the reliability of tests that measure unstable traits, such as mood or Brief Psychotic Disorder. It should also not be used for tests on which scores are affected by practice or repeated administration. It can, however, be used for a speeded test.


14. One hundred job applicants are given a job selection test. Of the 60 people who are selected for the job on the basis of their results on this test, 10 turn out to be unsuccessful on the job. Of the 40 who are not hired on the basis of their test results, 15 would have been successful if they had been hired. In this case, there were how many false positives?
a. 10 b. 15 c. 25 d. 50 
A. In this situation, a false positive is someone who succeeds on the predictor but does not turn out to be successful on the job. There are 10 such individuals.
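
All four outcome counts in this scenario follow from the numbers given in the question. The variable names below are chosen for illustration.

```python
# Numbers from the question
hired, not_hired = 60, 40
false_positives = 10  # hired, but unsuccessful on the job
false_negatives = 15  # not hired, but would have been successful

# The remaining cells of the 2 x 2 table
true_positives = hired - false_positives      # hired and successful
true_negatives = not_hired - false_negatives  # not hired and would have failed

print(true_positives, false_positives, false_negatives, true_negatives)
```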


15. Upon cross-validation of a selection test, shrinkage usually occurs because
a. the test developer does not include enough items in the original item pool b. the chance factors present in the original validation sample are not likely to be present in the cross-validation sample c. the characteristics of individuals in the original validation sample are too heterogeneous d. validity coefficients always shrink in the wash, esp when they are made of cotton
B. When a predictor is validated on a second sample (i.e., upon cross-validation), the validity coefficient obtained will likely be lower than the one originally obtained on the first sample. This is known as shrinkage. Shrinkage occurs because chance factors that increase the validity coefficient in the first sample are not present in the second sample.


16. To ensure adequate interscorer reliability, it is imp to use bx observation scales that
a. have overlapping rating categories b. give raters flexibility in scoring behaviors c. are very valid for predicting future behaviors d. have mutually exclusive and exhaustive rating categories
D. To ensure interscorer reliability, it is best to have mutually exclusive and exhaustive rating categories. This means that all behaviors fit into one and only one category.


The technique that would be most helpful in identifying a typology of substance abusers is
a. discriminant function analysis b. factor analysis c. cluster analysis d. crossvalidation 
C. Cluster analysis is a technique used to develop a taxonomy, or classification system, of objects or people. For instance, if you wanted to identify subtypes of substance abusers, you could use a cluster analysis to do so.


18. In principal axis factor analysis, an eigenvalue indicates
a. the correlation between a test and a factor b. the correlation between two factors identified by the analysis c. the total amount of variability in a test accounted for by all the factors in the analysis d. the total amount of variability in all the tests accounted for by an unrotated factor.
D. In factor analysis, an eigenvalue is a statistic that measures the explanatory power of a factor. More technically, it indicates how much variability is accounted for by an individual factor. Choice A describes a factor loading; choice C describes communality; and there isn't a particular name for what is described by choice B. Don't be thrown by the term "principal axis factor analysis"; on the exam, you won't have to distinguish between diff types of factor analysis.


The central limit theorem predicts that a sampling distribution of means will increasingly approach a normal shape:
a. regardless of the shape of the pop distribution as the sample size increases b. regardless of the shape of the pop dist as the number of samples increases c. only when the pop dist does not deviate from the norm d. only when the sample dist do not deviate significantly from the normal
a. regardless of the shape of the pop dist as the sample size increases


The probability of making a Type 1 error is increased by:
a. conducting a single multivariate test rather than several univariate tests b. changing the level of significance from .01 to .05 c. changing beta from .01 to .05 d. conducting a twotailed rather than a onetailed test 
b. changing the level of significance from .01 to .05


An investigator wants to test the hypothesis that the average number of aggressive acts that children exhibit in an unfamiliar situation is related to gender and sociability. He obtains a sample of 30 boys and 30 girls who have been rated as either sociable or shy and then has observers count the number of aggressive acts each child exhibits in an unfamiliar situation during a 30 minute play period. The best stat test to analyze the data the investigator collects in this study is which of the following:
a. t-test for independent samples b. chi-square test c. one-way ANOVA d. two-way ANOVA
d. two-way ANOVA: two independent variables (sociability and gender) and a single dependent variable that is measured on a ratio scale (number of aggressive acts).


Use of the chi-square test to analyze data collected from a study making use of a single-group time-series design:
a. is contraindicated because the test's statistical power will be reduced b. is contraindicated because the test's assumption of independence has been violated c. is contraindicated because the test's assumption of homogeneity has been violated d. 
b. is contraindicated because the test's assumption of independence has been violated


The split-plot ANOVA is appropriate when:
a. a study has two or more IVs but not all subjects receive all combinations of the IVs b. a study has at least one between-groups IV and one within-subjects IV c. one of the IVs included in a study is an extraneous variable
b. a study has at least one between-groups IV and one within-subjects IV


Trend analysis
a. cannot be used if the IV is measured on an interval or ratio scale b. cannot be used when intervals between adjacent points are unequal c. can be used whether the relationship between variables is linear or nonlinear d. is useful when the IVs are highly correlated 
c. can be used whether the relationship between variables is linear or nonlinear: trend analysis is a type of inferential statistic that enables a researcher to determine if a quantitative independent variable has a linear, quadratic, cubic, or quartic effect on the dependent variable


The appropriate correlation coefficient when the variables of interest are both measured on a continuous scale and have a U-shaped relationship is:
a. eta b. biserial c. phi d. Spearman 
a. eta: the appropriate correlation coefficient when the relationship between two variables is nonlinear
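
A small sketch of why eta is preferred here, using invented U-shaped data. It assumes x takes a few repeated values so that y can be grouped by x; with truly continuous x the values would have to be binned first. The helper names pearson_r and eta are illustrative.

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson product-moment correlation (captures linear association only)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def eta(xs, ys):
    """Correlation ratio of y on x: sqrt(SS_between / SS_total), grouping y by x."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(x, []).append(y)
    grand = mean(ys)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_total = sum((y - grand) ** 2 for y in ys)
    return (ss_between / ss_total) ** 0.5

# Hypothetical U-shaped data: five x levels, two observations per level.
x = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
y = [9, 10, 4, 5, 1, 2, 4, 5, 9, 10]

r = pearson_r(x, y)
e = eta(x, y)
print(round(r, 3), round(e, 3))
```

On these data Pearson r is essentially zero because the rising and falling halves cancel, while eta is close to 1 because the group means account for nearly all the variance in y.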


A linear structural relations analysis (LISREL) is useful for:
a. developing a causal model for the relationships among a set of variables b. testing the veracity of a causal model for the relationships among a set of observed variables c. testing the veracity of a causal model for the relationships among a set of observed and latent variables d. identifying the linear combination of two sets of variables that produces the highest correlation coefficient 
c. testing the veracity of a causal model for the relationships among a set of observed and latent variables: LISREL, like path analysis, is a structural equation modeling technique that is used to test causal hypotheses or models about the relationships among a set of variables. In contrast to path analysis, LISREL takes into account both observed (measured) variables and the constructs (latent variables) they are believed to measure


Counterbalancing is useful for controlling
a. demand characteristics b. selection biases c. history effects d. order effects 
d. order effects: counterbalancing is used in repeated measures designs in which each subject will receive all levels of the independent variable. It helps control order effects (multiple tx interference) by administering the levels in different orders for different subjects
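
As an illustration of complete counterbalancing, the sketch below enumerates every order of three hypothetical treatment levels and cycles subjects through them. The labels A, B, C, the 12 subject IDs, and the helper assign_orders are assumptions made for the example.

```python
from collections import Counter
from itertools import permutations

conditions = ["A", "B", "C"]  # hypothetical levels of the IV

# Complete counterbalancing: every possible order of the levels.
orders = list(permutations(conditions))

def assign_orders(subject_ids, orders):
    """Cycle through the orders so each one is used about equally often."""
    return {sid: orders[i % len(orders)] for i, sid in enumerate(subject_ids)}

schedule = assign_orders(range(1, 13), orders)

# Check that each level occupies each serial position equally often.
position_counts = [
    Counter(order[pos] for order in schedule.values())
    for pos in range(len(conditions))
]
print(position_counts)
```

Because every level appears equally often in every serial position, any practice or fatigue effect is spread evenly across levels rather than confounded with one of them. With many levels the number of orders grows factorially, which is why partial schemes such as Latin squares are often used instead.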


When using a single-group pretest/posttest design, reducing the interval of time between administration of the pretest and posttest will most likely reduce which of the following threats to internal validity:
a. hx b. selection c. maturation d. testing 
c. maturation: maturation refers to changes that occur within subjects simply as the result of the passage of time. Reducing the length of time between pretesting and posttesting helps reduce its effects


When using an ABAB design, you are:
a. administering two diff tx at two diff times b. administering one treatment at two diff times c. administering one tx to two diff groups d. administering one tx to two diff tx 
b. administering one tx at two diff times. The ABAB design has two no-treatment (A) phases and two tx (B) phases. The same treatment is administered during both B phases.


You are designing a study to investigate high-risk sexual behavior among urban adolescents. When selecting your sample, you make sure it includes proportions of whites, Hispanics, and AA comparable to their population proportions. This will help ensure that your study has adequate:
a. incremental validity b. discriminant validity c. internal validity d. external validity 
d. external validity


School consultation would most likely be targeted at
a. the school superintendent b. students c. teachers d. psychologists working with children 
c. teachers: consultation is more effective when targeted at individuals who work with students rather than at the students themselves; this includes parents too.


The case of Larry P. v. Riles dealt with
a. the use of aptitude tests in the classroom placement of minority children b. the rights of parents to access school records c. the rights of handicapped children to have an individual education plan (IEP) constructed d. the privacy rights of students 
a. parents claimed racial bias; the California court banned the use of IQ tests as a criterion for placement of children in EMR classes


In many schools, children are placed into classes with other children with a similar ability level. Research suggests that this practice
a. is associated with positive effects on learning because it allows children to proceed at their own pace. b. is associated with positive psych effects on children's learning because it increases their sense of identity c. is, for low-achieving children, associated with negative effects on academic achievement as well as negative effects on self-esteem and motivation 
d. this practice, known as ability tracking, has negative psych and academic effects on low-achieving children, as well as on moderate-achieving children


While males and females are generally equal in terms of intelligence, one area in which they have consistently been found to differ is that
a. male verbal skills are better b. female verbal skills are better c. female verbal SAT scores are much higher d. male mathematical skills are better 
b. female verbal skills are better


Research investigating the relationship between the race of the examiner and AA children's scores on the WISC has shown that
a. AA children do better if the examiner is also AA b. AA children do better if the examiner is Caucasian c. AA children do better if the examiner is also AA d. the race of the examiner is not related to the child's performance 
d. the race is not related to the performance


It would be least appropriate to administer only the performance subtests of the WISC-III to
a. reading-disabled children b. suburban middle class children c. immigrant, non-English-speaking children 
b. suburban middle class children


Which of the following WAIS subtests is the best measure of shortterm memory?
a. picture arrangement b. picture completion c. block design d. digit span 
d. digit span


If you needed an approximation of the level of cognitive functioning of a 12-year-old girl who was hearing impaired, an appropriate test to use for the initial screening would be the
a. Stanford-Binet b. Peabody Picture Vocabulary Test c. Peabody Individual Achievement Test d. Leiter International Performance Scale 
d. Leiter International Performance Scale


The correlation between a child's and her parent's IQ is approx:
a. .20 b. .30 c. .50 d. .70 
c. .50


You wish to assess the general cognitive reasoning ability of a newly arrived immigrant who has no English lang skills. The most appropriate test to use for this person would be the
a. Otis-Lennon Mental Ability Test b. Slosson Intelligence Test c. Merrill-Palmer Scale d. Raven's Progressive Matrices 
d. Raven's Progressive Matrices


Terman is best known for
a. developing culture-fair intelligence tests b. adapting the Binet scales for use with English-speaking children c. deriving stats to measure individual differences 
b. adapting the Binet intelligence scale for use with English-speaking children


Research with elementary school teachers suggests that, on the average,
a. they tend to pay more attention to boys than girls b. they tend to pay more attention to girls than boys c. male teachers tend to pay more attention to boys, while female teachers tend to pay more attention to girls d. they pay an equal amt of attention to girls and boys 
a. they tend to pay more attention to boys than girls


Research investigating the effectiveness of cooperative learning programs suggests that such programs are
a. effective for low achievers but not high achievers b. effective only in racially homogeneous classrooms c. effective for white but not AA children d. effective 
d. effective, regardless of the characteristics of classrooms or students


Which of the following is true of rapists?
a. Rapists typically achieve multiple orgasms during the rape b. rapists are typically sexually dysfunctional during the rape c. rapists use aggressive methods to satisfy sexual needs d. rape victims often report that rapists are strange and peculiar looking 
b. rapists are typically sexually dysfunctional during the rape


Which of the following statements is most true of long-term unemployment?
a. it is assoc with a variety of physical and psychological probs b. it is assoc with a variety of physical and psycho probs for men but not for women d. it has been shown to improve psycho functioning in some cases, because economic status is actually negatively corr with happiness and well-being 
a. it is assoc with a variety of physical and psycho probs


A client who has been depressed for a long time appears to be improving. He mentions to you that he has thought about suicide. You should
a. have him involuntarily hospitalized b. make a no suicide contract with him c. find out whether he has a specific suicide plan 
c. find out whether he has a specific plan


Which of the following is the greatest indicator of suicide risk?
a. depression b. hopelessness c. ambivalence d. psychosis 
b. hopelessness


You develop a program for preschool children who are at risk for learning disabilities. This is an example of
a. primary prevention b. secondary prevention c. tertiary prevention d. advocacy consultation 
a. primary prevention: administered before the onset of a prob and designed to prevent its development.


You set up a program for inmates who are leaving prison, in order to lower their recidivism rates. This is an example of
a. primary prevention b. secondary prevention c. tertiary prevention d. advocacy consultation 
c. tertiary prevention: occurs after a problem has run its course, to prevent relapse


A teenage girl has made some half-hearted suicide attempts. Of the following, which intervention is most indicated for her?
a. psychotropic med b. discussing healthier ways to get attention c. interpretation of her wish to die as aggression turned inward d. involuntary hospitalization 
b. discussing healthier ways for her to get attention


Which of the following is the best example of consultee-centered case consultation?
a. a therapist seeks consultation bec he needs some advice regarding what specific techniques will work best with a severely depressed patient he is treating b. a therapist seeks consultation to help prevent his own depression from impairing his effectiveness with patients c. a psych seeks consultation for advice on how to obtain funding d. the parents of a group of learning disabled children seek consultation for some help obtaining government support.... 
b. a therapist seeks consultation to help prevent his own depression from impairing his effectiveness with patients


Which of the following is the best example of advocacy consultation?
a. a therapist seeks consultation bec he needs some advice regarding tx b. a therapist seeks consultation to help prevent his own depression from impairing... c. a psych seeks consultation for advice on how to obtain funding d. the parents of a group of learning disabled children seek consultation for some help obtaining government support for special services for learning disabled children in the school. 
d.


The phenomenon of "theme interference" in an organization is most analogous to the phenomenon of _____ resistance
b. catharsis c. premature termination d. transference 
d. transference: Gerald Caplan uses the term to refer to a type of transference that may be a focus of consultation in organizations. It occurs when a consultee's unresolved conflict, related to life experience or fantasy, affects his or her perception or handling of a work-related problem


When they seek psychotherapy, many physically abused women report that they do not want to leave their husbands. A common reason for this is that
a. the woman has an adequate support system b. the woman and her husband are in a honeymoon stage in which the husband apologizes and promises to change d. the woman has a self-defeating personality disorder 
b. honeymoon stage


In most states, the legal criteria used in determining whether or not a person can be involuntarily hospitalized have to do with
a. whether or not the person has a mental disorder b. the person's ability to regulate his or her own life d. danger to self or others 
d. danger to self or others


The duration of posttraumatic amnesia:
a. is unrelated to the severity of the injury b. is useful as an indicator of severity only when combined with the degree of retrograde amnesia c. is less accurate as an indicator of severity than the degree of retrograde amnesia d. is more accurate as an indicator of severity than the degree of retrograde amnesia 
d. is more accurate as an indicator of severity than the degree of retrograde amnesia


A disturbance in the ability to respond to stimulation on one side of the body is suggestive of damage to the
a. right parietal lobe b. right frontal lobe c. anterior cerebellum d. posterior pons 
a. right parietal lobe


A psychologist wants to determine the relationship between number of years as a substance abuser and number of weeks in an inpatient tx facility. The psych will use which of the following:
a. MANOVA b. ANOVA c. Pearson Product Moment Correlation Coefficient d. Kendall's Coefficient of Concordance 
c. Pearson Product Moment Correlation Coefficient


The psychologist in the above study plans to use the info she has collected to predict the number of aftercare sessions a patient will require. The appropriate technique in this situation is:
a. regression analysis b. multiple regression analysis c. discriminant analysis d. canonical correlation 
b. multiple regression analysis


In primary school, ability tracking:
a. has beneficial effects on the achievement of high, mod, and low achieving students b. has no effect c. has detrimental effects d. has different effects on achievement for students of different ability levels 
d. has diff effects on achievement for students of diff ability levels


Research investigating father-child attachment suggests that it depends most on

play activities


Characteristics such as level of emotionality, activity, and sociability:
a. are evident in newborns and remain relatively stable in later years b. are evident in newborns but are not predictive of future bx c. begin to appear as traits after the first year of life d. do not become stable until the preschool years 
a. are evident in newborns and remain relatively stable in later years


Hans Eysenck's article about the effectiveness of psychotherapy reported little benefit of psychotherapy beyond what would be expected from spontaneous remission. Subsequent research has shown:
a. a different pattern: there are benefits to psychotherapy 
a. a different pattern: there are benefits to psychotherapy


A depressed patient is concerned that taking an antidepressant will produce sedation, interfere with his ability to perform his job, and cause him to put on unwanted weight. What to do?

SSRI


According to Piaget, the source of motivation for cognitive development is:
a. social acceptance b. parental influence c. equilibration d. the collective unconscious 
c. equilibration


A therapist in NY decides to use cuento therapy in her work with elementary school children whose parents are Puerto Rican immigrants. She collects several Puerto Rican folktales and modifies them to make them apply to the inner city environ... this practice is:
a. suspect because b. a form of American ethnocentrism c. effective based on the research 
c. effective based on the research


Damage to the hippocampus is most likely to interfere with the ability to:
a. recall remote events b. recall events stored in recent long-term memory c. transfer short-term memory to long-term memory d. retrieve implicit memories 
c. transfer short-term memory to long-term memory


The left hemisphere of the cerebral cortex is dominant for speech and language functions:
a. for most left-handers but few right-handers b. for nearly all left-handers and many right-handers c. for nearly all right-handers and the majority of left-handers d. for nearly all right-handers but a small minority of left-handers 
c. for nearly all right-handers and the majority of left-handers


The symptoms of numbness, weakness, tremor, and ataxia that characterize multiple sclerosis are due to
a. lesions in the basal ganglia b. demyelination c. loss of ACh receptors d. cerebellar atrophy 
b. demyelination


In recent years, psychologists have attempted to become more sensitive to the uniqueness of each culture. This is most related to
a. emic approach b. etic approach c. etic-emic synthesis d. neither 
a. emic approach


Cleo and Cleopatra obtain percentile ranks of 48 and 92, respectively, on a math test. If four points are subtracted from each of their raw scores (due to scoring error) but not from the scores of the other examinees, you would expect:
a. Cleo's percentile rank will decrease more than Cleopatra's. b. Cleo's percentile rank will decrease less than Cleopatra's c. Cleo and Cleopatra's percentile ranks will decrease by the same amount d. Cleo and Cleopatra's percentile ranks will not change 
a. Cleo's percentile rank will decrease more than Cleopatra's
In a percentile rank distribution, scores are evenly distributed throughout the distribution. Consequently, when converting raw scores to percentile ranks, small differences in the middle of the raw score distribution look larger in terms of percentile ranks than the same differences at the extremes of the distribution.
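
A rough numerical sketch of this point, assuming (purely for illustration) that raw scores are normally distributed with mean 50 and SD 10; the item itself gives no distribution, and the helper rank_after_deduction is hypothetical.

```python
from statistics import NormalDist

# Assumed raw-score distribution; mean and SD are invented for the example.
dist = NormalDist(mu=50, sigma=10)

def rank_after_deduction(percentile_rank, points):
    """Percentile rank after subtracting `points` from the examinee's raw score."""
    raw = dist.inv_cdf(percentile_rank / 100)   # raw score at the original rank
    return 100 * dist.cdf(raw - points)         # rank of the lowered raw score

cleo_drop = 48 - rank_after_deduction(48, 4)
cleopatra_drop = 92 - rank_after_deduction(92, 4)
print(round(cleo_drop, 1), round(cleopatra_drop, 1))
```

Under these assumptions Cleo loses roughly 15 percentile points and Cleopatra roughly 8, because more examinees are packed into each raw-score point near the middle of the distribution than in the tail.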

A test is likely to have adequate "floor" when it:
a. provides only two response options b. provides four response options c. contains a sufficient number of easy items d. contains a sufficient number of difficult items 
c. contains a sufficient number of easy items: the test needs to discriminate between people at the low end of the distribution.


Nathan obtained a grade equivalent score of 7.2 on Test A and a grade equivalent score of 6.2 on Test B. If Nathan's percentile rank on Test A is 65, you can conclude that his percentile rank on Test B:
a. is equal to 55 b. is less than 65 c. is less than 55 d. cannot be determined 
d. cannot be determined


A psychologist is hired by a large organization to conduct a survey to determine people's feelings about its products, proposed changes to some of its products, and possible new marketing strategies. The psychologist decides to begin by mailing a survey to a random sample of consumers. What is the biggest threat to the internal validity of the study's results?

Internal validity refers to the ability to derive valid, or accurate, information from a study. Common sense suggests that the biggest problem would be knowing whether or not the people who respond to a survey differ in any important way from people who don't respond.
