Research Design & Statistics, Test Construction

by hollybailey7, Nov. 2005

Subjects: construction design research statistics test

322 Cards in this Set

Front
Back

	A researcher is interested in determining the effects of a new behavior modification program and a new drug in increasing the vocabulary of mentally retarded children. The various groups of subjects will receive one of four dosages of the drug (placebo, 10 mg, 20 mg, and 30 mg) and either the program or attention without the program. Afterwards, the WISC-III vocabulary subtest will be administered to subjects. Independent variable(s) and levels: Dependent variable(s):	IVs: Behavior modification (levels = program and attention only) and drug (levels = placebo, 10 mg, 20 mg, and 30 mg). DV: Scores on the WISC-III vocabulary subtest.
	A researcher wants to know if high school and college students differ in their attitudes toward affirmative action. He divides both groups of students into African-American and Caucasian groups and administers the measure to each of the four groups. Independent variable(s) and levels: Dependent variable(s):	IVs: School level (high school and college) and race (Caucasian and African-American). DV: Attitudes toward affirmative action.
	A researcher wants to know if elderly individuals diagnosed with Alzheimer's disease differ from non-Alzheimer's elderly individuals in terms of a variety of physiological measures, including pulse rate, blood pressure, EEG patterns, knee reflex, and white blood cell count. Independent variable(s) and levels: Dependent variable(s):	IV: Diagnosis (Alzheimer's and no Alzheimer's). DVs: Pulse rate, blood pressure, EEG patterns, knee reflex, and white blood cell count.
	A therapist devises a new form of psychotherapy. She obtains separate samples of depressed, anxious, psychotic, and personality-disordered clients. She then randomly assigns subjects into groups that will receive either her new form of therapy, traditional psychoanalysis, cognitive therapy, or humanistic therapy. Before and after therapy is completed, she obtains the subjects' WAIS-III full-scale IQ scores, as well as their scores on the BDI, the BAI, and the MMPI's Paranoia and Schizophrenia scales. She also believes that outcome depends on economic status, so she further divides subjects into high, medium, and low income groups. Independent variable(s) and levels: Dependent variable(s):	IVs: Pathology (depressed, anxious, psychotic, personality-disordered), therapy (new, psychoanalysis, cognitive, humanistic), and income (high, medium, and low). DVs: WAIS-III full-scale IQ scores and BDI, BAI, and MMPI Paranoia and Schizophrenia scale scores.
	What is the threat to internal validity that is most salient? In a research study testing the effects of a new strategy for increasing short-term memory, subjects have to wait three hours in a classroom before the study begins. As a result, they become very tired and cannot concentrate on learning the strategy.	Maturation
	What is the threat to internal validity that is most salient? A drug designed to improve emotional and psychosocial functioning is administered to severely depressed individuals.	Statistical regression (severely depressed individuals are extreme scorers, so their scores are likely to move toward the mean on retesting even without treatment)
	A film designed to increase the racial awareness of white college students is shown the day after a leading civil rights activist spoke at the college.	history
	A new diet designed to help obese subjects lose weight is studied. Subjects are to be weighed before and after they go on the diet. Before they are weighed for the second time, the scale breaks.	instrumentation
	A study is conducted to test the effectiveness of Academic Review's workshops. Most of the subjects in the study have taken the psychology licensing exam before.	testing
	In a study conducted to test the effectiveness of a new reading strategy on reading comprehension, the first 20 subjects who sign up are assigned to the experimental group and the second 20 subjects who sign up are assigned to the control group.	selection
	Cues in the experimental setting that allow subjects to guess the research hypothesis.	demand characteristics
	The effect that an experimenter's expectancy has on the results of a research study.	Rosenthal Effect
	A procedure designed to ensure that all subjects in the population of interest have an equal chance of being chosen to participate in a research study.	random sampling
	The tendency of subjects' behavior to change due to the attention received in a research setting.	Hawthorne Effect
	A procedure designed to ensure that all subjects in a research study have an equal probability of ending up in each of the treatment groups.	Random assignment
	What is a statistical method of controlling for the effects of an extraneous variable?	ANCOVA
	What is a procedure that involves grouping subjects who are similar in terms of their status on an extraneous variable and then assigning the members of each group to different treatment groups?	matching
	What is a procedure in which a population is divided into "sub-populations" and all members of each sub-population have an equal probability of being chosen to participate in the research study?	stratified random sampling
	What is a study of the effects of aging on psychosocial adjustment that involves comparing older, middle-aged, and younger subjects at one point in time?	cross-sectional research
	What is a study of the effects of aging on psychosocial adjustment that involves comparing older, middle-aged, and younger subjects at different points in time?	cross-sequential research
	What is a single subject design in which the treatment is withdrawn to determine whether dependent variable scores revert to baseline levels?	reversal design
	What is an in-depth study of a single individual, institution, group, or phenomenon?	case study
	What involves manipulated variable(s) and random assignment?	true experimental research
	What involves manipulated variables and non-random assignment?	quasi-experimental research
	What is a study that involves obtaining dependent variable scores from a group of subjects on multiple occasions at regular intervals?	time-series design
	What is a study of a single conduct-disturbed adolescent in which the effectiveness of a new behavior modification program is assessed first at school, then at home, and then in the community?	Multiple baseline design
	What is a study assessing the association between SAT scores and college GPA?	correlational research
	What is a study of the effects of aging on IQ scores in which one group of subjects is examined for 30 years?	longitudinal research
	What is the greatest threat to validity when a mail survey is conducted? a. selection b. maturation c. randomization d. instrumentation	A. Selection - in a mail survey, subjects self-select themselves into the study; i.e., in deciding whether or not to mail the survey back, they decide who the study's participants will be. Thus, selection poses a threat to any mail survey's validity.
	Which of the following statements regarding the different types of developmental research (longitudinal, cross-sectional, cross-sequential) is most true? a. Cross-sequential and longitudinal studies are particularly vulnerable to "cohort" effects. b. Cross-sequential studies are more costly in terms of time and money to conduct than longitudinal studies. c. Of the three types of developmental studies, cross-sectional studies offer the greatest internal validity. d. By combining the methodology of cross-sectional and longitudinal studies, cross-sequential studies reduce many of the problems associated with both.	D. A cross-sequential study, like a cross-sectional study, involves studying groups of subjects that are divided on the basis of age. And, like a longitudinal study, it involves examining the subjects for a period of time, though this period is shorter in cross-sequential studies. Cross-sequential studies, because they involve studying subjects over time, control for "cohort" effects, which often confound cross-sectional studies. And, because they are shorter than longitudinal studies, there is less cost in terms of time and money, and less subject drop-out.
	The defining feature of true experimental designs is a. random selection of subjects from the population b. random assignment of subjects into experimental groups. c. the use of manipulated variables d. the use of non-manipulated variables.	B. In a true experiment, variables are manipulated and subjects are randomly assigned to treatment groups. Choice C is incorrect because other types of designs (e.g., quasi-experiments) also use manipulated variables.
	Francis Galton's concept of regression to the mean is best expressed by which of the following statements? a. Individual variation within a species is unlimited. b. Short fathers have tall sons. c. Short fathers have taller sons. d. Fathers and sons end up having similar heights.	C. Regression to the mean refers to the tendency of extreme observations to be less extreme upon re-testing or re-observation. Francis Galton applied this concept to heredity. He concluded that due to regression to the mean, individual variation in the species is limited (the opposite of choice A). That is, since extreme individuals will likely have less extreme offspring (e.g., short fathers are likely to have taller sons), the characteristics of a species can only vary within a limited range. Note that you did not need to know anything about Galton to answer this question. You just needed to apply what you know about regression to the mean to a new situation.
	All of the following are true of multiple baseline designs, except a. a treatment is sequentially applied. b. they may serve as a substitute when the ABAB design is unethical. c. They may involve studying the same treatment for different behaviors, in different settings, or with different subjects. d. they involve the administration and then the withdrawal of a treatment.	D. A multiple baseline study is a single-subject study that involves the sequential application of a treatment across different baselines (i.e., behaviors, settings, or individual subjects). Unlike a reversal design, a multiple baseline design does not entail withdrawal of the treatment.
	The major threat to the internal validity of a one-group time-series design is a. maturation b. regression to the mean c. history d. testing	C. A one-group time-series design involves administering multiple pretests and posttests to one group of subjects before and after a treatment is administered. The design controls for many threats to internal validity, such as maturation, testing, and statistical regression. The major threat to its internal validity is history, or an external event that occurs at about the same time the treatment is administered.
	A major advantage of case studies is that they a. can be used to identify variables for future research. b. involve the study of only one individual c. allow one to draw conclusions about the causal relationship between two or more variables d. permit the generalization of results to other cases	A. Although case studies cannot tenably be used to identify causal relationships between variables, they are often useful as pilot studies to identify variables and hypotheses for further investigation.
	A major disadvantage of case studies is that they a. can never be used to identify variables for future research b. involve the study of only one individual c. do not permit conclusions to be made about causal relationships between variables d. are frowned upon by journal editors, college professors, and other "scientifically correct" individuals.	C. Case studies do not permit conclusions to be made about causal relationships between variables.
	Data collected in research studies can be classified into four types:	nominal, ordinal, interval, ratio
	A(n) _________ scale of measurement contains unordered categories; examples include gender, DSM diagnosis, and hair color.	nominal
	______ data are quantified into ordered categories; however, with such data, it is impossible to determine the distance between data points. Examples include ranks and points on an attitude scale.	ordinal
	____ data are continuous data in which the distance between successive data points is equal across the scale. However, there is no absolute zero point; as a result, multiplication and division with such data are not possible. Examples include IQ scores and degrees Fahrenheit.	interval
	Finally, _____ data are the same as interval data except that they include an absolute zero point, so multiplication and division can be performed.	ratio
	In a ____ distribution, most observations (i.e., scores) fall in the middle of the distribution, with fewer and fewer cases as one moves farther away from the middle.	normal
	In a _________ distribution, most scores fall at the high end of the distribution, with a few extreme scores falling at the low end.	negatively skewed
	In a _______ distribution most scores are low and a few extreme scores are high.	positively skewed
	In a _______ distribution, the mode is higher than the median, which is higher than the mean.	negatively skewed
	In a _____ distribution, the mean is higher than the median, which is higher than the mode.	positively skewed
	In a _____ distribution, the mean, the median, and the mode are all equal.	normal
	The variance is a measure of the ______ of a distribution.	variability (or dispersion, or spread)
	The standard deviation, a measure of the same property, is obtained by taking the ________ of the variance.	square root
	A z-score is an individual score expressed in terms of standard deviation units above or below the mean. For example, in a distribution with a mean of 80 and a standard deviation of 2, a score of 76 would be equivalent to a z-score of ____, and a z-score of +3.0 would be equivalent to a raw score of ____.	-2.0; 86
	The formula for a z-score is ________, where X = ______, M = ________, and s.d. = _______.	z = (X - M)/s.d.; X = raw score; M = mean; s.d. = standard deviation
	A percentile rank is a transformed score that reflects the percentage of scores falling ______ the corresponding raw score. For example, a score with a PR of 80 is higher than ____% of the other scores in the distribution; it could also be said to be in the top ____%.	below; 80%; 20%
	By definition, percentile ranks have a _____ distribution; for example, in any distribution, the number of scores falling between the percentile ranks of 10 and 20 is equal to the number of scores falling between the percentile ranks of 80 and 90.	flat (or rectangular)
	Therefore, almost all transformations of raw scores to percentile ranks would be termed ______ since they would involve a change of the original distribution's shape.	nonlinear
	In a normal distribution, approximately ____% of scores fall between the z-scores of +1.0 and -1.0.	68%
	About ___% of scores fall between the z-scores of -2.0 and +2.0.	95%
	Say that 1,000 people take the WAIS-III, on which the mean IQ is 100 and the standard deviation is 15. About 680, or ___%, will obtain z-scores between ____ and ____; i.e., they will obtain IQs between ____ and ____. And about 950, or ____%, will obtain z-scores between ____ and ____; i.e., they will obtain IQs between ____ and ____.	68%; -1.0 and +1.0; 85 and 115; 95%; -2.0 and +2.0; 70 and 130
	In a normal distribution, it is possible to determine the z-score equivalents of given percentile rank points. For example, a z-score of +1.0 is equivalent to a percentile rank of about _____, and a percentile rank of 98 is approximately equivalent to a z-score of _____.	84; +2.0
	If you had a test with a mean of 25 and a standard deviation of 5, you would set the cutoff score at ____ if you wanted to select the top 16% of examinees and at ____ if you wanted to select the top 2% of examinees.	30; 35
	A person receives a score of 90 on a test with a mean of 100 and a standard deviation of 5. The corresponding z-score is ____, the corresponding T-score is _____, and the corresponding stanine score is approximately ____.	-2.0; 30; 1
	In addition, if the distribution is normal, we would know that the corresponding percentile rank is around ____.	2
	If the score were converted to a WAIS-III IQ score (mean = 100, s.d. = 15), the new transformed score would be ____. And if the score were converted to an ETS score (i.e., SAT and GRE score, mean = 500 and s.d. = 100), the new transformed score would be ____. (See p. 60, Vol. 6, Research Design & Statistics.)	70; 300
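The score conversions on the preceding cards all follow the same two steps: express the raw score as a z-score, then rescale that z-score to the target mean and standard deviation. A minimal Python sketch of those steps follows; the helper names `z_score` and `rescale` are illustrative and do not come from the card set.

```python
def z_score(raw, mean, sd):
    """Express a raw score in standard deviation units from the mean."""
    return (raw - mean) / sd


def rescale(z, new_mean, new_sd):
    """Convert a z-score to a scale with the given mean and standard deviation."""
    return new_mean + z * new_sd


# Card example: raw score of 90 on a test with mean 100 and s.d. 5.
z = z_score(90, 100, 5)
t_score = rescale(z, 50, 10)      # T-scores: mean 50, s.d. 10
iq_score = rescale(z, 100, 15)    # WAIS-III IQ: mean 100, s.d. 15
ets_score = rescale(z, 500, 100)  # ETS scale: mean 500, s.d. 100

print(z, t_score, iq_score, ets_score)  # -2.0 30.0 70.0 300.0
```

Because each step only subtracts, divides, multiplies, or adds a constant, these are linear transformations and leave the shape of the distribution unchanged.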
	If you convert raw scores to z-scores, you would be conducting a a. linear transformation because the shape of the distribution changes b. linear transformation because the shape of the distribution does not change	b. When raw scores are converted to z-scores, the shape of the distribution does not change. For instance, if the distribution of raw scores is normal, the distribution of the corresponding z-scores will also be normal. When transformed scores retain the same shape as the original distribution, the transformation is said to be "linear."
	Eight students take a math test and obtain the following scores: 80, 53, 39, 32, 45, 72, 28, 49. The median score of this distribution is:	To answer this question, first arrange the numbers in numerical order: 28, 32, 39, 45, 49, 53, 72, 80. To obtain the median, you must take the mean of the two middle scores (45 and 49), which is 47.
	One thousand people take a job selection test that has a mean of 60 and a standard deviation of 5. An industrial psychologist wants to select the top 150 scorers. Assuming a normal distribution of scores, she would set the cutoff score at approximately:	First, you have to recognize that the "top 150" is equivalent to the top 15% (150/1,000 = 15/100 = 15%). Then, you have to remember that, in a normal distribution, 16% of all scores will fall at or above a z-score of +1.0. Finally, you have to convert the z-score of +1.0 to a raw score. In this case, a score of 65 is one standard deviation above the mean and therefore is equivalent to a z-score of +1.0. You might have been thrown by the fact that you were looking for the top 15% even though the standard deviation curve only allows you to identify the cutoff score for the top 16%. If so, you might remind yourself at this point to work on the exam with rounded-off numbers (actually, you'll have no choice, since no calculators will be allowed). Fifteen percent is close enough to 16% for you to use the standard deviation curve.
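The rounded "top 16%" rule used in the card above can be compared with the exact normal quantile. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

# Job selection test from the card: mean 60, s.d. 5, normally distributed.
scores = NormalDist(mu=60, sigma=5)

# Rule of thumb: the top 16% lies at or above z = +1.0.
rule_of_thumb_cutoff = 60 + 1.0 * 5

# Exact cutoff for the top 150 of 1,000 (the 85th percentile).
exact_cutoff = scores.inv_cdf(1 - 150 / 1000)

print(rule_of_thumb_cutoff, round(exact_cutoff, 2))
```

The exact cutoff comes out only a fraction of a point above 65, which is why working with the rounded figure is acceptable on the exam.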
	Judy and Johnny are students in a school district that is administered a standardized mathematics test. Judy scores in the 48th percentile on the test, while Johnny scores in the 93rd percentile. Scores on the test are normally distributed. A few weeks after the scores are reported, a scoring error is discovered, and as a result, three points are added to both Judy's and Johnny's raw scores. No changes are made to the scores of any other students in the district. Given these facts, which of the following statements is true? a. Judy's and Johnny's percentile ranks will increase by the same amount. b. Judy's percentile rank will increase more than Johnny's. c. Johnny's percentile rank will increase more than Judy's. d. Neither Johnny's nor Judy's percentile rank will change.	B. The answer to this question is related to the fact that, in a normal distribution, there are more scores in the middle of the distribution than at either extreme. As a result, a given raw-score interval in the middle of the distribution spans many more percentile ranks than the same interval at either end. Thus, any change to a raw score in the middle of the distribution results in a greater percentile rank change than the same raw score change at the distribution's extremes. In this case, Judy originally scored at the 48th percentile, which is near the middle of the distribution, while Johnny scored at the 93rd percentile, or at the high end of the distribution. Therefore, adding three points to their raw scores will result in a greater increase in Judy's percentile rank than in Johnny's. Due to the change, Judy will "jump over" a greater percentage of other students than will Johnny.
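The reasoning in the card above can be illustrated numerically. The card does not give the test's mean or standard deviation, so the values below (mean 50, s.d. 10) are assumptions chosen only for illustration, as is the helper name `percentile_gain`.

```python
from statistics import NormalDist

# Hypothetical test scale: the card does not state the mean or s.d.
test = NormalDist(mu=50, sigma=10)


def percentile_gain(start_percentile, points_added, dist):
    """Percentile-rank increase after adding points to one raw score."""
    raw = dist.inv_cdf(start_percentile / 100)       # original raw score
    new_percentile = 100 * dist.cdf(raw + points_added)
    return new_percentile - start_percentile


judy_gain = percentile_gain(48, 3, test)    # starts near the middle
johnny_gain = percentile_gain(93, 3, test)  # starts in the upper tail

print(round(judy_gain, 1), round(johnny_gain, 1))
```

Under these assumed values the same three raw-score points move Judy several times as many percentile ranks as Johnny, which matches answer B.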
	The deviation of a sample statistic from a parameter of the population from which the sample was drawn?	sampling error
	The probability of rejecting a true null hypothesis?	alpha
	The probability of retaining a false null hypothesis?	beta
	The probability of rejecting a false null hypothesis?	power
	A researcher hypothesizes that students who sleep with their textbooks under their pillow score higher on the GRE than students who don't. He obtains a sample of 20 students and assigns 10 to the "books under pillow" group and 10 to the "no books under pillow" group. He concludes, on the basis of a statistical test, that his hypothesis was correct. In the population, however, there is no difference on the GRE between groups. What kind of error did he make?	Type I error
	A researcher hypothesizes that cognitive therapy is superior to other forms of therapy in the treatment of anxiety. She fails to find any evidence that cognitive therapy is superior. However, in reality, cognitive therapy is the superior treatment. What error did she make?	Type II error (she retained a false null hypothesis)
	In statistical hypothesis testing, because we cannot study the entire population, sample values are used to estimate population values (a value obtained from a sample is referred to as a(n) _____, while a value obtained from a population is referred to as a(n) _____).	statistic; parameter
	The discrepancy between a sample value and the corresponding population value is referred to as ______.	sampling error
	The mean is one example of a population value that is estimated on the basis of sample data. The expected discrepancy between a sample mean and a population mean is referred to as the _____.	standard error of the mean
	The formula for the standard error of the mean is s.d./square root of N, where s.d. equals ____ and N equals ____.	standard deviation; sample size
	The ______ hypothesis of most research studies posits that there is no relationship between the independent variable(s) and the dependent variable(s).	null
	The null hypothesis is usually stated in terms of population ____; an example would be "the mean of population A is equal to the mean of population B."	parameters
	The ______ hypothesis usually posits that there is a relationship between the independent variables and the dependent variables. This hypothesis can be either _______ (e.g., one population mean is different from the other) or ______ (e.g., one population mean is greater than the other).	alternative; nondirectional; directional
	In statistical decision-making, four outcomes are possible: two are correct decisions, and two are errors. One type of correct decision would be to ____ a true null hypothesis. A second type of correct decision, the goal of research, would be to ___ a false null hypothesis. The probability of making the latter correct decision is referred to as ____.	retain; reject; power
	One of the incorrect decisions would be to retain a ______ null hypothesis. This is referred to as a(n) ____ error, and the probability of making it is known as ______.	false; Type II; beta
	The other incorrect decision would be to reject a(n) _____ null hypothesis. This is referred to as a(n) _____ error, and the probability of making it is known as ____.	true; Type I; alpha
	_____________ statistical tests are used to test statistical hypotheses when the dependent variable is measured on an interval or ratio scale. Such tests make two assumptions: 1) _____ and 2) ______________.	parametric; normal distribution of data; homogeneity of variance
	Methods designed to test statistical hypotheses when the dependent variable is measured on a nominal or ordinal scale are referred to as ________. These tests don't make the same assumptions as _____ tests. However, tests in both categories do assume that the sample is _____ of the _____ from which it was obtained.	nonparametric; parametric; representative; population
	In a research study with 400 subjects, the standard deviation of scores on the dependent variable is 20. In this case, the standard error of the mean is:	The standard error of the mean is equal to the standard deviation divided by the square root of the sample size. The square root of 400 is 20. Thus the standard error of the mean in this case is 20/20, or 1.
	The standard error of the mean is a. directly proportional to the standard deviation and inversely proportional to the sample size. b. directly proportional to the standard deviation and directly proportional to the sample size. c. inversely proportional to the standard deviation and directly proportional to the sample size. d. inversely proportional to the standard deviation and inversely proportional to the sample size.	A. As the population standard deviation increases, the standard error of the mean increases; in other words, the standard error of the mean is directly related (i.e., directly proportional) to the standard deviation. And as the sample size increases, the standard error of the mean decreases; in other words, sample size and the standard error of the mean are inversely related (strictly speaking, the standard error is inversely proportional to the square root of the sample size).
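The two cards above can be checked with a few lines of arithmetic. This sketch applies the formula s.d./square root of N from the cards; the function name `standard_error_of_mean` is illustrative.

```python
import math


def standard_error_of_mean(sd, n):
    """Standard error of the mean: s.d. divided by the square root of N."""
    return sd / math.sqrt(n)


sem_card = standard_error_of_mean(20, 400)       # the card's example
sem_larger_n = standard_error_of_mean(20, 1600)  # four times the sample size
sem_larger_sd = standard_error_of_mean(40, 400)  # twice the standard deviation

print(sem_card, sem_larger_n, sem_larger_sd)  # 1.0 0.5 2.0
```

Quadrupling the sample size only halves the standard error, while doubling the standard deviation doubles it.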
	Which of the following assumptions is shared by both parametric and nonparametric tests? a. normal distribution of data b. homogeneity of variance c. random assignment of subjects to experimental groups d. random selection of subjects from the population	d. Both parametric and nonparametric tests are inferential statistical methods. This means that they are used to draw conclusions about a population on the basis of information derived from a sample. For these conclusions to be unbiased and accurate, a sample must be representative of the population from which it is drawn. The best way to ensure that a sample is representative is to randomly select subjects from the population of interest.
	When a statistical test lacks power, this means that a. the probability of making a Type I error will be high. b. the probability of making a Type II error will be low. c. the probability of obtaining statistical significance will be low. d. the probability of getting one's research....	C. When a statistical test lacks power, this means that there is a high probability of a Type II error, or that a false null hypothesis will be retained; i.e., the test will be unable to detect a true effect of an independent variable on a dependent variable. Put another way, the test will not yield statistical significance (a finding of an effect) when it should.
	Alpha can be defined as a. the probability of rejecting the null hypothesis when the null hypothesis is true. b. the probability of retaining the null hypothesis when the null hypothesis is true. c. the probability of rejecting the null hypothesis when the null hypothesis is false. d. the probability of retaining the null hypothesis when the null hypothesis is false.	A. Alpha is the probability of making a Type I error, which is defined by choice A. In plain English, this means that alpha is the probability that a statistical test will falsely tell you that your independent variable has an effect when, in the population, it does not.
	Which of the following would have the least meaning? a. retaining the null hypothesis when power is low. b. rejecting the null hypothesis when power is low. c. retaining the null hypothesis when power is high. d. rejecting the null hypothesis when power is high.	A. When power is low, a statistical test is unlikely to detect an effect of an independent variable, even when one is present in the population. In other words, the null hypothesis (the hypothesis of no effect) is likely to be retained. In such cases, when you retain the null, it does not necessarily mean that you have done so correctly; it could just be that the test lacked the power to correctly reject the null (i.e., to detect a true effect). So retaining the null with low power doesn't really tell you anything.
	Subjects take the BDI before and after a six-week trial period on the drug. Statistical test?	t-test for correlated samples
	Instead of taking the BDI, subjects are classified by raters as either "treatment successes" or "treatment failures." Statistical test?	chi-square
	For control and experimental subjects, scores on the MMPI's depression scale are obtained in addition to those from the BDI. Statistical test?	MANOVA (there are two dependent variables, both measured on an interval scale)
	Subjects are randomly assigned to either the control (no-drug) or the experimental (drug) group. Statistical test?	t-test for independent samples
	The researcher is interested in determining if the effects of the drug are different at different levels of symptom severity (highly depressed, moderately depressed, and not depressed). Statistical test?	factorial ANOVA
	Subjects are randomly assigned to either the control or the experimental group, and scores on the BDI are converted to ranks. Statistical test?	Mann-Whitney U
	The mean score of subjects who take the drug is compared to the population mean for depressives on the BDI. Statistical test?	t-test for single sample
	Subjects are assigned to one of four groups: high dosage, moderate dosage, low dosage, and control. Statistical test?	one-way ANOVA
	Scores on the BDI are adjusted so that variability accounted for by the subjects' scores on a test of self-esteem is removed. Statistical test?	ANCOVA
	The magnitude of the F-ratio for a one-way ANOVA depends on the ratio between 2 sources of variance in a set of dependent variable scores. If _____ variance significantly exceeds ______ variance, then the F ratio will be high and the null hypothesis will be (rejected/retained).	between group within group rejected
	If the ______ variance equals or exceeds _____ variance, then the F ratio will be low, and the null hypothesis will be (rejected/retained). The F-ratio is a fraction with _____, a measure of ________ variance in the numerator. And this fraction has _____, a measure of _____ variance, in the denominator.	within group between group retained MSB between group MSW within group
	In studies with more than one independent variable, a(n) _____ effect occurs when the effects of one independent variable do not generalize to all the _____ of one of the other independent variables. A _____ ANOVA provides an indication of the strength of this effect. If this effect is present, ____ effects must be interpreted with caution.	interaction levels factorial main
	The nonparametric alternative to a t-test for independent samples is the a. Kruskal-Wallis b. Wilcoxon matched pairs c. Mann-Whitney U d. t-test for correlated samples	C. Mann-Whitney U - When a study involves a comparison of two independent groups and interval or ratio data, the t-test for independent samples would be used to compare the means of the two groups. If the assumptions of a parametric test are violated, the data would be converted to ranks and the Mann-Whitney would be used. Mann-Whitney U is the nonparametric alternative to the t-test for independent samples.
	An advantage of using a MANOVA instead of multiple one-way ANOVAs is that a. a MANOVA is computationally simpler b. Multiple ANOVAs cannot be used when a study involves more than one dependent variable c. the probability of making a Type II error is reduced d. the probability of making a Type I error is reduced.	D. Running a separate one-way ANOVA for each dependent variable inflates the experimentwise Type I error rate. A MANOVA analyzes all of the dependent variables in a single test, which keeps the probability of a Type I error down.
	The use of which of the following post-hoc tests results in the greatest probability of making a Type II error? a. Tukey b. Scheffe c. Fisher's LSD d. Newman-Keuls	B. Scheffe - Of all the posthoc tests, the Scheffe is the most conservative, which means that it provides the greatest protection against a Type I error. However, since there is a trade-off between Type I and Type II errors, this also means that its use results in the greatest probability of making a Type II error (i.e., missing an effect).
	When a factorial ANOVA yields a significant main effect and a significant interaction effect, a. the main effect should be ignored b. the main effect should b interpreted in light of the interaction effect c. the interaction effect should be ignored d. the interaction effect should be interpreted with caution	B. Whenever both a main effect and an interaction effect exist, the main effect must be interpreted in light of the interaction effect. This is because the interaction means that the main effect does not hold true in all cases (i.e., at all levels of another independent variable).
	A researcher is interested in the correlation between gender and homeownership - stat corr?	phi coefficient
	A researcher is interested in the correlation between gender and scores on the BDI?	point-biserial coefficient
	a researcher is interested in the correlation between scores on the BDI and IQ scores on the WAIS. Scores of 20 or above on the Beck are reported as "depressed," whereas scores below 20 are reported as "not depressed."	biserial coefficient
	A researcher is interested in the correlation between DSM diagnostic category and political party.	contingency coefficient
	A researcher is interested in the correlation between motivation and scores on a prof licensing exam. She wishes to statistically remove the effects of IQ on this relationship.	partial correlation
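A minimal sketch of the first-order partial correlation for the card above, using the standard formula. The three input correlations are invented for illustration, and partial_r is a hypothetical helper name.

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z statistically removed from both."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical values: motivation-exam .50, motivation-IQ .40, exam-IQ .60
r_motivation_exam_given_iq = partial_r(0.50, 0.40, 0.60)
print(round(r_motivation_exam_given_iq, 3))
```

With these made-up inputs the motivation-exam correlation shrinks once IQ is partialed out, because part of the original relationship was shared with IQ.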
	A procedure designed to assess the causal interrelationships among three or more variables.	path analysis
	A researcher is interested in using annual income in dollars to predict scores on a measure of happiness in the elderly.	simple regression
	A researcher is interested in using income, an index of support system adequacy, and an index of overall health to predict scores on a measure of happiness in the elderly.	multiple regression
	A researcher is interested in the degree to which the combination of income, scores on an index of support system adequacy, and scores on an index of overall health is related to the combination of scores on three measures of happiness.	canonical correlation
	A personnel department will reject all applicants who do not demonstrate a minimum level of proficiency on five tests of aptitude.	multiple cutoff
	A gambler is interested in the correlation between racehorses' finishes in their first and second races.	Spearman's rho
	The term "least squares criterion" describes the principle that underlies a. calculating a Pearson r correlation coefficient. b. constructing a regression line. c. determining whether multicollinearity in a multiple regression equation is significant d. conducting statistical hypothesis testing with the Pearson r.	B. The regression line is placed at a location in the scattergram that results in the lowest possible sum of squared deviations of points from the line. This principle is known as the "least squares criterion."
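The least squares idea can be sketched in a few lines: fit the line from the usual slope and intercept formulas, then check that nudging the line only increases the sum of squared deviations. The data points and function names here are assumptions made for the example.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def sse(xs, ys, slope, intercept):
    """Sum of squared vertical deviations of the points from a line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical scattergram
xs, ys = [1, 2, 3, 4], [2, 4, 5, 7]
slope, intercept = fit_line(xs, ys)
best = sse(xs, ys, slope, intercept)
nudged = sse(xs, ys, slope - 0.1, intercept)  # any other line does worse
print(slope, intercept, best, nudged)
```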
	A researcher is interested in the correlation between scores on a standardized intelligence test and elementary school grades. For her research, she has access to students in a local elementary school. To obtain the highest possible correlation coefficient, she would be best advised to use a. only high scorers on the intelligence test b. only students who score in the middle ranges on the intelligence test c. a random sample of students d. only students who are highly motivated to do their best on the IQ test	C. A correlation coefficient will be lowered if one uses only a restricted range of scores on any of the variables involved. In other words, it is best to utilize the full range of scores, which can be obtained from a random sample of students.
	When using multiple regression, a researcher would be best advised to choose predictors that a. have a high correlation with each other and a high correlation with the criterion. b. have a low correlation with each other and a low correlation with the criterion. c. have a high correlation with each other and a low correlation with the criterion. d. have a low correlation with each other and a high correlation with the criterion.	D. In a multiple regression equation, a high correlation between the predictors and the criterion is necessary; otherwise, it would be impossible to use the predictors to estimate scores on the criterion. And low intercorrelations among predictors are desirable, so that the predictors are not providing redundant information.
	A researcher is interested in the relationship between three predictors and a criterion. One of the predictors has a correlation of .55 with the criterion. Which of the following statements is true of the multiple correlation coefficient (multiple R) for the relationship between the three predictors and the criterion? a. The multiple R cannot be lower than .55. b. Due to multicollinearity, it is possible that the multiple R is lower than .55. c. The multiple R will be higher if all the predictors are highly correlated with each other. d. The multiple R is totally unrelated to the correlation between individual predictors and the criterion; thus, it is impossible to have any idea what the multiple R will be.	A. A multiple correlation coefficient can be no lower than any of the individual correlations between a predictor in the equation and the criterion.
	The correlation between psychosis and IQ scores would best be assessed using which of the following correlation coefficients?	B. To measure the correlation between an artificial dichotomy and a variable measured with interval or ratio data, one would use the biserial correlation coefficient.
	Test A has a correlation of .60 with Test B and a correlation of .30 with Test C. Test A accounts for ____ as much variability in Test B as it does in Test C. a. twice b. three times c. four times d. eight times	C. To determine how much variability in one measure is explained by variability in another, one squares the correlation coefficient. The square of .60 is .36, and .30 squared is .09. C is therefore correct because .36 is four times as large as .09.
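The arithmetic in this card can be checked directly: square each correlation to get the proportion of shared variability, then take the ratio. Variable names are chosen only for this sketch.

```python
r_ab = 0.60  # correlation of Test A with Test B
r_ac = 0.30  # correlation of Test A with Test C

shared_ab = r_ab ** 2          # proportion of variability shared with Test B
shared_ac = r_ac ** 2          # proportion of variability shared with Test C
ratio = shared_ab / shared_ac  # how many times as much variability

print(round(shared_ab, 2), round(shared_ac, 2), round(ratio, 2))
```

Doubling a correlation therefore quadruples the variability accounted for, which is why "twice" is the tempting wrong answer.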
	Which of the following describes a correlation of 0.0 between "X" and "Y"? a. The variability of Y scores at each X value is lower than the total variability of Y. b. The variability of Y scores is diff at diff levels of X. c. The variability of Y scores at each X value is equal to the total variability of Y scores d. The variability of Y scores at each X value is approximately the same.	C. To answer this question, you have to look closely at the wording of each choice and translate each into everyday English. What C is saying is that the range of "Y" at every individual "X" score will be equal to the entire range of "Y." For example, let's say that scores on both "X" and "Y" can range from 1 to 10. Say that people who get a score of 1 on "X" score anywhere from 1 to 10 on "Y." And those who get a score of 2 on "X" score anywhere from 1 to 10 on "Y." And so on, for all values of "X." One's score on "X" doesn't provide any info about Y, which means that the correlation is 0. If you go through the other choices and try to make sense out of them, Choice A is the converse of choice C and therefore describes a correlation that is greater than 0. Choice B describes heteroscedasticity, and choice D describes homoscedasticity.
	According to the central limit theorem, a. As sample size inc, the shape of the sampling dist of means will approach a normal shape only if the underlying pop dist is normal b. as sample size inc, the shape of a sampling dist of means will assume a normal shape, regardless of the shape of the underlying pop dist c. there will be more variability in a sampling dist of means than there will be in the underlying pop dist d. the mean of the sampling dist of means will always be an underestimate of the actual pop mean.	B. According to the central limit theorem, the shape of a sampling distribution of means will approach normality as sample size increases. This is true regardless of the shape of the dist of the value in the underlying pop.
	The standard deviation of the sampling dist of means is also known as a. the standard error of estimate b. the standard error of measurement c. the standard error of the mean d. the standard error of the day	c. This is the definition of the standard error of the mean.
	A difference between meta-analysis and a literature review is that a. meta-analysis involves calculation of an "effect size." b. a lit review is likely to include fewer studies than a meta-analysis c. a lit review has less ecological validity than a meta-analysis d. a meta-analysis involves a review of many diff studies in one topic area	A. Unlike a traditional lit review, a meta-analysis involves calculation of an effect size. This allows one to estimate the overall effects, across many studies, of a particular tx or independent var.
	A one-way ANOVA would be most robust if a. the shape of the underlying pop data is skewed b. there are many levels of the independent variable c. sample size is small d. sample size is large	D. A stat test is said to be robust when its results tend to be accurate even in the face of mod violations of its assumptions about the pop data. The larger the sample size, the more robust stat tests tend to be, especially with regard to the normal dist of data assumption.
	A researcher conducts a study using a time-series design, consisting of a pretest phase, in which the same test is administered five times; a treatment; and a posttest phase, in which the test is admin five more times. The researcher analyzes his results by conducting a t-test comparing the combined means of the pretest and posttest phases. The main prob with this design is a. t-tests cannot be used to compare two means b. the t-test will not be powerful enough c. autocorrelation d. there is no good reason to administer the same test five times in each phase of measurement.	C. Due to autocorrelation, standard parametric tests such as the t-test cannot be used in the analysis of time series data. Instead, one must use special techniques designed for the purpose of time-series analysis.
	In a normal dist of scores, a T-score of 60 is approx equal to a percentile rank of a. 60 b. 68 c. 84 d. 95	C. T is a standard score with a mean of 50 and a standard deviation of 10. Thus, a T-score of 60 is equal to 1 s.d. above the mean. In a normal dist, this is equivalent to the 84th percentile.
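A short sketch of the conversion described above: turn the T-score into a z-score, then look up the area below it in the standard normal curve. It assumes Python 3.8+ for statistics.NormalDist; the function name is hypothetical.

```python
from statistics import NormalDist

def t_score_percentile(t, mean=50.0, sd=10.0):
    """Percentile rank of a T-score in a normal distribution."""
    z = (t - mean) / sd                # T of 60 is z = +1
    return 100 * NormalDist().cdf(z)   # percent of cases below z

print(round(t_score_percentile(60)))  # one s.d. above the mean
print(round(t_score_percentile(70)))  # two s.d. above the mean
```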
	The results of an experiment indicated no significant differences at the .05 level. This means that a. the null hypothesis is not rejected b. the null hypothesis is rejected c. the alternative hypothesis is accepted d. there was an error in calculations	A. When the results are not significant, you do not reject the null hypothesis. That is, you cannot conclude that the IV had an effect.
	3. Assuming a norm dist, how many people would score between 400 and 600 on a standardized test with a mean of 500 and a standard deviation of 100 (N=1000)?	B. First convert the scores to standard deviation units (ie, Z scores). A score of 400 is equivalent here to -1z, and a score of 600 equals +1z. Then, remember that 68% of cases fall between -1z and +1z in a norm dist. Finally, take 68% of 1,000, which is 680.
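The same steps can be sketched with the normal curve itself rather than the 68% rule of thumb. This assumes Python 3.8+ for statistics.NormalDist; the exact area is slightly above 68%, so the count comes out a little higher than the card's rounded 680.

```python
from statistics import NormalDist

n_people = 1000
test = NormalDist(mu=500, sigma=100)

# Proportion of cases between 400 (-1z) and 600 (+1z)
proportion = test.cdf(600) - test.cdf(400)
expected_count = proportion * n_people

print(round(proportion, 4), round(expected_count))
```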
	4. Which of the following correlations is the highest? a. -.50 b. .05 c. .41 d. .23	A. When determining which correlation is larger, you ignore the sign and just look for the bigger number
	5. If two variables are positively correlated, this means that a. as one goes up the other goes down. b. as one goes up the other stays the same c. as one goes up the other goes up d. their means are equal	C. A positive correlation between two variables means that both move in the same direction
	5. In the F ratio, within-group variance, as measured by MSW, reflects a. variance accounted for by random and irrelevant factors b. the difference between the sample and the population means c. variance due to the effect of the independent variable d. effect of the tx on the pop means	A. In the F ratio, within-group variance is error variance (in fact, MSW, the index of within-group variance, is sometimes referred to as the "error term"). This means that it measures variability due to irrelevant random factors such as pre-existing individual differences between subjects.
	7. For a given pop, which of the following score distributions will likely have the least variability? a. the pop dist b. a dist of a sample of means from the pop c. a dist of a sample of 10 scores from the pop d. a dist of a sample of 20 scores from the pop	B. A sample of means from a population always has less variability than the pop or any one individual sample does. This is illustrated pictorially in the Appendix on Advanced Statistics.
	7. For a given pop, which of the following score dist will likely have the least variability? a. the pop dist b. a dist of a sample of means from the pop c. a distribution of a sample of 10 scores from the pop d. a dist of a sample of 20 scores from the pop	B. A sample of means from a pop always has less variability than the pop or any one individual sample.
	8. The statement most true of nonparametric tests is that they a. require data scaled on an interval or ratio basis b. are more powerful than parametric tests c. rely on pop parameters to draw conclusions about sample stats d. are used when one is not sure of the shape of the dist	D. Unlike parametric tests, the use of nonparametric tests does not require any assumptions about the shape of the pop dist
	10. In a study in which a one-way ANOVA is used, the null hypothesis would be that a. sample variances are equal b. pop variances are equal c. sample means are equal d. pop means are equal	D. An ANOVA is designed to test the hypothesis that group means were drawn from the sa pop; i.e., that means are equal in the pop.
	An experimenter is testing the hypothesis that there is no diff between teaching methods in regards to the grades obtained by the students on an arithmetic test. His design calls for two groups - trad teaching method vs programmed self-instruction. He uses a t-test to analyze the data. The results are: Group 1 mean = 86; Group 2 mean = 73. The t value exceeds the tabled critical value at the .01 level for a 2-tailed test. He should a. accept the null and conclude the alternative hypothesis is false. b. reject the null and conclude the alternative hypothesis is true. c. retain the null and conclude that the alternative hypothesis is true. d. not make any interpretation, since the researcher should have used a one-tailed test.	B. If the results are significant at the .01 level, then you reject the null and conclude that the alternative is true.
	12. If a sample of 400 is taken from a pop, and you find that the mean of this sample on some standard test is 50 and the standard deviation is 10, the standard error of the mean would be a. 20.0 b. 10.0 c. .50 d. 5.0	C. To get this one correct, you'll need to know the formula for the standard error of the mean. The standard error of the mean equals the standard deviation of the sample divided by the square root of sample size. In this case you'd take 10 (the standard dev) and divide it by 20, and you'd get the answer of .50.
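The formula in this card is small enough to check in code. The function name is hypothetical; the numbers are the ones given in the question.

```python
from math import sqrt

def standard_error_of_mean(sd, n):
    """Standard deviation of the sample divided by the square root of n."""
    return sd / sqrt(n)

sem_mean = standard_error_of_mean(sd=10, n=400)  # 10 divided by 20
print(sem_mean)
```

Note that the sample mean of 50 plays no role in the calculation; only the standard deviation and the sample size matter.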
	13. All of the following are true of path analysis, except a. an a priori path is drawn connecting two or more variables in a causative direction. b. the magnitude of the relationship between variables is determined by their correlation coefficients c. multiple causation can be considered. d. variables are manipulated in order to confirm the direction of the path of causation.	D. Path analysis is a method designed to determine or confirm causative relationships among variables via correlations. Hence, you wouldn't actually manipulate variables; you'd only measure their degree of relationship.
	14. In a normal dist of scores, the number of cases falling between a percentile rank of 11 and 20 will be _____ the number of cases falling between a percentile rank of 41 and 50.	A. The distribution of percentile ranks is, by definition, flat. This means that the sa number of scores will fall between equal intervals. In this case, 10% of scores will fall within the ranges identified.
	15. The phenomenon whereby an experimenter's expectancies influence subjects' responses on a dependent variable in the direction predicted is known as a. the hawthorne effect b. demand char c. the carryover effect d. the Rosenthal effect	D. It's called the Rosenthal effect bec it was first reported by Robert Rosenthal.
	16. In a study that includes one group that is tested on an intervally-scaled dependent variable before and after it receives tx, what stat test would be used to compare the obtained means? a. t-test for single sample b. t-test for correlated samples c. t-test for independent samples d. two-way ANOVA	B. To compare two means obtained by correlated samples (e.g., the same group) one would use the t-test for correlated samples.
	If a study such as the one described had 40 subjects, degrees of freedom would be equal to a. 19 b. 38 c. 39 d. 78	C. In the t-test for correlated samples, the degrees of freedom equal N-1, where N is the number of pairs of scores. Since there are 40 subjects, there will be 40 pairs of before-and-after scores, so df = 40 - 1 = 39.
	A one-group pretest/posttest design is susceptible to many threats to internal validity, including... a. hx b. maturation c. statistical regression d. all of the above	D. All of the above: hx, maturation, and statistical regression
	19. Why might the use of a factorial ANOVA be preferred over the use of separate one-way ANOVAs? a. The use of a factorial ANOVA reduces the prob of making a TYPE II error. B. A factorial ANOVA allows one to assess for interaction effects c. A factorial ANOVA can be used when the data is interval or ratio. d. A factorial ANOVA can be used when the study involves more than two independent variables.	B. If you have multiple independent variables, you can use either mult one-way ANOVAs or one factorial ANOVA. An advantage of the latter is that it allows you to measure interaction effects.
	20. A mall owner is interested in determining whether shoppers are equally likely to use the east, north, south, and west entrances to the mall. Which of the following stat tests would be most helpful? a. chi-square b. one-way ANOVA c. factorial ANOVA d. Kruskal-Wallis	A. In this case, the data will consist of frequency of observations within categories. The chi-square test is used to analyze frequency data of this kind.
	21. If the mall owner in the above question sampled 100 customers, the expected frequency in each cell under the null hypothesis would be a. 20 b. 25 c. 50 d. more than 25 but less than 50.	B. If the null hypothesis is true, the four entrances are used with equal freq. Thus, if 100 customers are sampled, 25 would be expected to use ea entrance.
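To make the expected-frequency idea concrete, here is a minimal chi-square goodness-of-fit sketch for the mall example. The observed counts are invented for illustration; only the sample of 100 and the four entrances come from the cards.

```python
# Hypothetical observed counts for the east, north, south, and west entrances
observed = [30, 20, 28, 22]

n = sum(observed)                        # 100 customers sampled
k = len(observed)                        # 4 entrances
expected = [n / k] * k                   # equal use under the null: 25 each

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = k - 1                               # degrees of freedom for one variable

print(expected, round(chi_square, 2), df)
```

With 3 degrees of freedom the tabled critical value at the .05 level is 7.815, so these made-up counts would not lead to rejecting the null hypothesis of equal use.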
	A job applicant takes five tests. His performance is considered excellent on four of the tests but slightly inadequate on the fifth. If the procedure known as multiple cutoff were used to make hiring decisions, this company would a. place the app in a training prog b. retest c. hire d. not hire	D. When the mult cutoff procedure is used, an examinee must demonstrate the minimum level of proficiency on all the predictors that are administered. He is not selected.
	A researcher is int in the assoc between IQ and happiness. He uses mult measures of both of these attributes. What stat analysis is the researcher likely to use? a. mult regression b. path analysis c. canonical corr d. partial corr	C. Canonical correlation is the appropriate stat method to correlate multiple predictors with multiple criterion measures.
	24. In a study involving three groups, the variability in scores of the groups differs. The robustness of the parametric stat test used to analyze the data from this study would be enhanced if a. alpha is set at a high level b. the grp with the most variability also has the most subjects c. the three groups have an equal sample size d. the researcher transforms data to equalize the variability of scores	C. A stat test is said to be robust when its results tend to be accurate even in the face of moderate violations of its assumptions about the pop data. In this case, the homogeneity of variance assumption is violated. When violated, the stat test tends to remain robust as long as the groups' sample sizes are equal.
	25. All of the following statements are true of forward stepwise multiple regression analysis, except: a. the technique is useful in dealing with the prob of redundancy in a set of predictors b. the technique allows a researcher to add predictors one at a time until the ideal set of predictors are determined. c. the use of the procedure involves "ordering" predictors based on how much each predictor increases the multiple correlation coefficient d. the predictor with the lowest correlation with the criterion is the first one retained for the final mult regression equation.	D. Forward stepwise regression is a technique that allows a researcher to choose a smaller set of predictors out of a larger set. When the technique is used, the predictor with the highest correlation with the criterion is the first one retained for the final equation. Choices A, B, and C are true statements about forward stepwise regression.
	26. Bayes' theorem is associated with a. sample size and inferential stats b. conditional prob and base rates c. the normality assumption in the central limit theorem d. meta-analysis and effect sizes	Bayes' theorem is used to revise conditional probabilities based on base rates - B
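A small sketch of how Bayes' theorem revises a conditional probability in light of a base rate. The base rate, sensitivity, and specificity below are hypothetical numbers picked for the example, not values from the source.

```python
def posterior_given_positive(base_rate, sensitivity, specificity):
    """P(condition | positive test) via Bayes' theorem."""
    true_pos = sensitivity * base_rate                # P(+ | cond) * P(cond)
    false_pos = (1 - specificity) * (1 - base_rate)   # P(+ | no cond) * P(no cond)
    return true_pos / (true_pos + false_pos)

# Hypothetical test: 90% sensitive, 90% specific, condition present in 10%
p = posterior_given_positive(base_rate=0.10, sensitivity=0.90, specificity=0.90)
print(round(p, 2))
```

Even with a fairly accurate test, the low base rate means that only about half of the positive results in this made-up case are true positives.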
	27. A t-score of 70 corresponds to a. the 70th percentile b. the 90th percentile c. the 98th percentile d. 3 standard deviations above the mean	c. A T-score of 70 is two standard deviations above the mean (the mean of a T-score distribution is 50; standard deviation is 10). When any score is two standard deviations above the mean, 98 percent of the dist is below that score. In this case, 98 percent of the scores is below a T-score of 70, in other words, the 98th percentile.
	1. A psychological test can be defined as a(n) _____ and _____ measure of behavior.	objective standardized
	The process of _____ involves ensuring uniformity of administration and scoring of the test. This process includes obtaining ______, which represent the score of a larger representative sample of the pop for which the test is intended.	standardization; norms
	Interpreting a test score by comparing it to _____ allows us to determine how a given score compares to the scores of others of the same pop who have taken the test.	norms
	A good test will be ______, which means that it will provide repeatable, consistent results. It will also be _____, which means that it will measure what it purports to measure.	reliable; valid
	A(n) _____ test is one in which the examinee's response rate is assessed. A(n) _____ test is one that assesses the level of difficulty an examinee can attain.	speed; power
	A(n) _____ test uses the examinee himself as the frame of reference in score interpretation. It indicates which attributes are weakest and strongest within the individual.	ipsative
	A(n) ____ effect occurs when a test is unusually difficult, and many test-takers score at or near the bottom of the scale.	floor
	The defining characteristic of an objective test is a. the existence of norms b. a standardized set of scoring and administration procedures c. examiner discretion in scoring and interpreting items. d. reliability and validity	B. An objective test is one that is independent of the subjective judgment of the particular examiner. This means that administration and scoring procedures are uniform, or the same for all examiners.
	A test developer administers an intelligence test to a group of examinees on Oct. 1st and then administers the same test to the same group of examinees on Nov. 1st. Most likely the examiner is interested in a. assessing the test's reliability. b. assessing the test's validity. c. determining whether or not the test is vulnerable to the effects of response sets d. double billing his funding source.	A. A test is reliable if it provides repeatable, consistent results. Giving the sa test to the sa group of examinees at diff points in time is one way to assess a test's reliability.
	A drawback of norm-referenced interpretation is that a. a person's performance is compared to the performance of other examinees b. it does not permit comparisons of individual examinees' score on diff tests c. it does not indicate where the examinee stands in relation to others of the sa pop d. it does not provide absolute standards of performance.	D. Norm-referenced interpretation involves comparing an examinee's score to the scores of others who have taken the same test. A drawback of this type of interpretation is that it does not provide absolute standards of good or poor performance - the examinee's score must be interpreted in light of the performance of the norm group as a whole.
	According to classical test theory, an examinee's obtained test score consists of two components: ______, or the portion of variability among examinees that is due to whatever attribute is being measured by the test, and ____, or the portion of variance due to factors that are irrelevant to whatever is being measured.	truth (or true score variance); error (or measurement error, or error variance)
	______ by definition, is _________, which means that it is due to factors that affect different examinees in different ways.	error; random
	If a test is ______, it will be free from ______ and yield information about examinees' _____.	reliable error true scores
	2. The reliability coefficient, unlike other correlation coefficients, is interpreted ______. This means, e.g., that for a test with a reliability coefficient of .70, _____% of observed score variance is true variance. In other words, unlike as with other correlation coefficients, you never ______ the reliability coefficient in order to interpret it.	directly 70 square
	3. Obtaining a(n) ____ reliability coefficient involves administering the same test to the same group of people, and then correlating scores on the first and second administrations.	test-retest
	The sources of measurement error for this type of reliability include factors related to _____.	the passage of time
	This coefficient is not appropriate to use for tests that measure ______ and those on which scores are affected by ______.	unstable attributes; repeated administration
	4. Obtaining a(n) ______ reliability coefficient involves administering two forms of a test to the same group of examinees, and then obtaining the correlation between the two sets of scores. Sources of measurement error for this reliability coefficient usually include factors related to both _____ and different _____ on the two forms.	alternate forms; the passage of time; content
	5. There are a number of measures of _____ reliability, all of which indicate the magnitude of correlation among individual items.	internal consistency
	For instance, obtaining a(n) ____ reliability coefficient involves dividing a test in two and obtaining a correlation between the halves as if they were two shorter tests. When this coefficient is used, the ____ is usually used to correct for the effects of shortening the test on the reliability coefficient. There are also two measures of the average degree of inter-item consistency: the ________, which is used when items are dichotomously scored, and ______, which is used when items are not dichotomously scored. All three of these coefficients are inappropriate for _____.	split-half; Spearman-Brown formula; Kuder-Richardson Formula 20; coefficient alpha; speed tests
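The Spearman-Brown correction mentioned in this card can be sketched as follows. The half-test correlation of .60 is a made-up value, and spearman_brown is a hypothetical helper name.

```python
def spearman_brown(r, length_factor=2.0):
    """Estimated reliability of a test lengthened by length_factor.

    With length_factor=2 this is the usual split-half correction:
    2r / (1 + r).
    """
    return (length_factor * r) / (1 + (length_factor - 1) * r)

r_halves = 0.60                       # hypothetical correlation between halves
r_full = spearman_brown(r_halves)     # estimated reliability of the full test
print(round(r_full, 2))
```

The corrected value is higher than the half-test correlation, which reflects the general point that longer tests tend to be more reliable.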
	The _____ is used to construct confidence intervals that indicate the range in which an examinee's _____ test score is likely to fall, given his _____.	standard error of measurement true obtained test score
	For example, there is a _____% probability that the examinee's _____ score lies within one ______ of the _______ score, and a ____% probability that the examinee's _____ score lies within approx two _____ of the _____ score.	68; true; standard error of measurement; obtained; 95; true; standard errors of measurement; obtained
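A minimal sketch of the confidence bands described above, built around the obtained score. The obtained score of 100 and standard error of measurement of 3 are hypothetical, as is the function name.

```python
def true_score_band(obtained, sem, n_sems=1):
    """Range of obtained score plus or minus n_sems standard errors."""
    return obtained - n_sems * sem, obtained + n_sems * sem

obtained_score = 100   # hypothetical obtained score
sem = 3                # hypothetical standard error of measurement

band_68 = true_score_band(obtained_score, sem, n_sems=1)  # about 68% band
band_95 = true_score_band(obtained_score, sem, n_sems=2)  # approx 95% band
print(band_68, band_95)
```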
	All other things being equal, a short test will have a(n) ____ reliability coefficient than a longer test, a fill-in-the-blank test will have a(n) _____ reliability coefficient than a true/false test, and a very easy test will have a(n) ______ reliability coefficient than a moderately difficult test.	lower; higher; lower
	1. You would not use the Kuder-Richardson Formula 20 to assess the reliability of a a. test that is dichotomously scored b. test that measures an unstable attribute c. speed test d. psychological test	C. Internal consistency reliability coefficients (e.g., KR-20, coefficient alpha, split-half) should not be used to assess the reliability of speed tests. This is because on a speed test, all attempted items are expected to be answered correctly; thus, any coefficient of internal consistency will yield a spuriously high estimate of the test's reliability.
	One way to improve the inter-rater reliability of a bx observation scale would be to use a. mutually exclusive rating categories b. non-exhaustive rating categories c. highly valid rating categories d. empirically derived rating categories	A. Inter-rater reliability is strengthened when mutually exclusive and exhaustive rating categories are used. This means that categories are clearly enough defined so that no bx will belong under overlapping categories (mutually exclusive), and that all observed behaviors can be placed into a category (exhaustive).
	The standard error of measurement is a. inversely related to the reliability coefficient and inversely related to the stand deviation of test scores. b. positively related to the reliability coefficient and positively related to the standard deviation of test scores c. positively related to the reliability coefficient and inversely related to the standard deviation of test scores d. inversely related to the reliability coefficient and positively related to the standard deviation of test scores	D. This means that the standard error of measurement increases as reliability decreases and the standard deviation increases. This can be seen from the formula for the standard error of measurement.
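The formula the answer refers to is the standard one: the standard deviation of test scores times the square root of one minus the reliability coefficient. A short sketch with hypothetical values shows both relationships.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * sqrt(1 - reliability)

# Hypothetical test with SD = 15
sem_high_rel = standard_error_of_measurement(15, 0.84)  # about 6.0
sem_low_rel = standard_error_of_measurement(15, 0.75)   # lower reliability
sem_big_sd = standard_error_of_measurement(20, 0.84)    # larger SD

print(round(sem_high_rel, 2), round(sem_low_rel, 2), round(sem_big_sd, 2))
```

Lowering the reliability or raising the standard deviation both increase the result, which matches choice D.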
	When practical, it is most advisable to use a(n) a. alternate-forms reliability coefficient b. test-retest reliability coefficient c. internal consistency reliability coefficient d. interscorer reliability coefficient	A. Although this opinion is not universally shared, it is what many experts believe. The words "when practical" are a good clue, since it is often very impractical to obtain an alternate-forms reliability coefficient.
	According to classical test theory, an observed test score reflects a. true score variance plus systematic error variance b. true score variance plus random error variance c. true score variance plus random and systematic error variance d. true score variance only	B. According to classical test theory, a given test score reflects both "truth" (whatever is being measured by the test) and measurement error (factors that are irrelevant to whatever the test is measuring). Measurement error, which occurs because no test is perfectly reliable, is random by definition.
	Which of the following methods of recording bx is most useful when the target bx has no fixed beginning or end? a. interval b. continuous c. frequency d. duration	A. In interval recording, a rater records whether or not an individual is engaging in a target bx during a given interval. During this interval, the rater only has to decide if the behavior is occurring, not when it begins or when it ends. This is why interval recording is the best method of recording behaviors that have no fixed beginning or end.
	A test has content validity if it __________.	adequately samples the content domain it is supposed to measure knowledge of
	Content validity is a concern when ___________ tests are being developed.	educational (or achievement, or work sample)
	To determine if a test has content validity, we rely primarily on ________.	expert judgment
	If a test has criterion-related validity, there would be a high ______ between the _____ and the _____.	correlation; predictor; criterion.
	A(n) ______ measure is a direct and independent measure of that which the predictor test is designed to predict; it can be thought of as that which is being predicted. For example, if an industrial psychologist were interested in using scores on an aptitude test to predict job performance, the aptitude test would be the _______ and a measure of job performance would be the _______.	criterion; predictor; criterion
	When _________ validation procedures are used to validate a predictor test, predictor and criterion data are collected at or about the same time.	Concurrent
	When _______ validation procedures are used, predictor data are collected first, and criterion data are collected at a future point.	predictive
	The former type of validation is more appropriate for predictors that measure _____; the latter type is more appropriate for tests designed to measure _______.	current status on a criterion; future status on a criterion
	Since ______ validation is less costly than _______ validation, the former is often used as a substitute for the latter.	concurrent; predictive
	The ______ is a statistic used to construct a range in which an examinee's ______ criterion score is likely to fall, given his or her ________ criterion score.	standard error of estimate; actual (or true); predicted
	Say a person takes a short aptitude test that is being used as a predictor of IQ score. Say that on the basis of his score on the aptitude test, his IQ score is predicted to be 100. If the ____ were equal to 5, there would be a 68% probability that his ____ intelligence is between ____ and ____.	standard error of estimate; actual; 95; 105
	And there would be about a 95% probability that his _____ intelligence is between _____ and _____.	actual; 90; 110
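The two intervals in the cards above follow from adding and subtracting one or two standard errors of estimate around the predicted score. A minimal sketch, assuming the usual rule of thumb that about 68% of cases fall within 1 standard error and about 95% within 2 (the function name is hypothetical):

```python
def prediction_band(predicted: float, see: float, n_se: float) -> tuple:
    """Return (low, high) = predicted -/+ n_se standard errors of estimate."""
    return (predicted - n_se * see, predicted + n_se * see)

# Values from the card: predicted IQ of 100, standard error of estimate of 5.
band_68 = prediction_band(predicted=100, see=5, n_se=1)
band_95 = prediction_band(predicted=100, see=5, n_se=2)  # 2 approximates 1.96

print(band_68)  # (95, 105)
print(band_95)  # (90, 110)
```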
	Often, a predictor is used for classification purposes, i.e., to predict to which of two _____ groups a person belongs. When this is the case, the predictor is administered to examinees, and those scoring above the predictor ____ would be expected to score above the _____ on the criterion.	criterion; cutoff; cutoff
	For example, a job selection test might be used to predict whether or not a person will be successful at a particular occupation. Individuals who are predicted to be successful by the test and in fact do turn out to be successful would be called ______.	true positives
	And those whom the predictor correctly identified as unsuccessful would be called ______.	true negatives
	Those who are classified by the test into the unsuccessful group but turn out to be successful on the job would be called ______.	false negatives
	And finally, those whom the test predicts to be successful but in fact turn out to be unsuccessful would be called _____.	false positives
	A validity coefficient would be lowered if there was a(n) ____ range of scores on either the ____ or the ______.	restricted; predictor; criterion
	7. After constructing and validating a test, a test developer will likely want to re-validate it using a second sample of individuals. This process is referred to as ______. In such cases, the validity coefficient obtained on the second sample is likely to be ______ than the one obtained from the first sample. This phenomenon is known as _____.	cross-validation; lower; shrinkage
	8. A test has ______ when its validity coefficient for one subgroup is higher than its coefficient for another subgroup. For ex, an IQ test may be a valid predictor of job performance for whites, but a completely invalid predictor of performance for blacks. In this ex, race would be said to be acting as a(n)________.	differential validity; moderator variable
	Construct validity is a concern in developing tests that measure ______.	hypothetical constructs or traits
	Two types of construct validity are the following: 1) ________ validity, which is present when a test has a(n) _______ correlation with another test that measures the same trait, and 2) _____ validity, which is present when a test has a(n) ______ correlation with another test that measures a different trait.	convergent; high; discriminant (or divergent); low
	A(n) ______ matrix provides a method of assessing the construct validity of two or more tests. On this matrix, if the ______ coefficient (the correlation between two tests that measure the same construct using different methods) is ______, evidence of ______ validity is provided.	multitrait-multimethod; monotrait-heteromethod; high; convergent
	And if the ______ coefficient (the correlation between two tests using the same method to measure different constructs) is _____, evidence of _____ validity is provided.	heterotrait-monomethod; low; discriminant (or divergent)
	______ is a procedure designed to determine the degree to which a large set of variables or tests are measuring the same underlying construct or constructs.	Factor analysis
	The procedure yields a(n) _____, which indicates each test's correlation with each factor identified in the analysis (a correlation between a test and a factor is referred to as a(n) _____).	factor matrix; factor loading
	To facilitate interpretation of a factor analysis, a(n) ______ is usually performed, and there are two types: _____ and ______.	rotation; orthogonal; oblique
	When a(n) ______ is conducted, uncorrelated factors are derived, and when a(n) _____ is conducted, correlated factors are derived.	orthogonal rotation; oblique rotation
	If, in a factor analysis, factors are ______, the ______ of a test can be obtained by squaring and summing the ______.	orthogonal; communality; factor loadings
	For example, imagine a factor analysis of six tests which yielded two significant factors. Imagine that Test A has a .60 correlation with Factor I and a .20 correlation with Factor II. By squaring and summing these ______ (assuming the rotation is ______), we can determine that the communality of Test A is ____. This means that _____% of the variability in Test A is explained by _______.	factor loadings; orthogonal; .40; 40; the two factors
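The communality arithmetic in the card above can be written out directly. A minimal sketch using the card's loadings of .60 and .20 and assuming an orthogonal rotation (the function name is hypothetical):

```python
def communality(loadings):
    """Sum of squared factor loadings; valid only for orthogonal factors."""
    return sum(loading ** 2 for loading in loadings)

# Test A: loading of .60 on Factor I and .20 on Factor II (from the card).
test_a = communality([0.60, 0.20])

# .36 + .04 is about .40, i.e. roughly 40% of Test A's variability
# is explained by the two factors.
print(round(test_a, 2))
```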
	If a test is highly reliable it (will be/may be/will not be) valid. If a test is very valid, it (will be/may be/will not be) reliable.	may be; will be
	In other words, reliability is a(n) ______ but not a(n) ______ condition for validity to be present. The ____ formula would be used to determine how ____ a test would be if it had _____ reliability.	necessary; sufficient; correction for attenuation; valid; perfect
	1. Which of the following is the lowest validity coefficient? a. .80 b. .50 c. .10 d. -.15	C. Like any other correlation coefficient, the magnitude of the validity coefficient is determined by its absolute value rather than its direction (i.e., positive or negative). To answer this question, look at the numbers and ignore any negative signs. Since .10 is the lowest number, it is, of the choices listed, the lowest validity coefficient.
	2. If an industrial psychologist were concerned about reducing the number of false positives yielded by a job selection test, he could a. raise the predictor cutoff score and/or raise the criterion cutoff score. b. raise the predictor cutoff score and/or lower the criterion cutoff score. c. lower the predictor cutoff score and/or lower the criterion cutoff score. d. lower the predictor cutoff score and/or raise the criterion cutoff score.	B. False positives can be reduced by raising the predictor cutoff score. If the selection test becomes more difficult to succeed on, there will be fewer individuals who "pass," and those who do pass are less likely to be unqualified. Lowering the criterion cutoff score will also result in fewer false positives. Lowering the criterion cutoff is equivalent to relaxing the definition of acceptable performance. This means that it will be easier to be considered adequate; therefore, those who do "pass" the selection test will be more likely to be able to meet this easier criterion standard.
	5. Some would argue that, in conducting a factor analysis, an oblique rotation is usually preferable to an orthogonal rotation because a. few factors are uncorrelated b. most factor analyses identify distinct and unrelated traits c. oblique rotations involve simpler calculational procedures d. oblique rotations are more fun	A. By definition, an oblique rotation produces correlated factors. In other words, if you believe that the traits represented by the factors are correlated, it makes theoretical sense to use an oblique rotation. And if you believe that few factors or traits are ever uncorrelated, you might argue that oblique rotations should always be used.
	If a test has a reliability coefficient of .90, we can conclude that a. the highest validity coefficient the test could have is .81. b. its validity coefficient is equal to the square root of .90. c. the test is probably very valid. d. the test may or may not be valid.	D. Knowing that the test's reliability coefficient is .90 tells us that the upper limit of the validity coefficient is the square root of .90 (not .81, which is the square of .90). This means that the test's validity is lower than or equal to the square root of .90. The test may be highly valid, moderately valid, or completely invalid.
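The ceiling described in the card above is simply the square root of the reliability coefficient. A minimal sketch of that check (the function name is hypothetical):

```python
import math

def max_validity(reliability: float) -> float:
    """Upper limit of the validity coefficient: sqrt of the reliability."""
    return math.sqrt(reliability)

ceiling = max_validity(0.90)

# Roughly .95, which is higher than .81 (the square of .90).
print(round(ceiling, 3))
```

The ceiling only bounds validity from above; the actual validity coefficient can be anywhere from 0 up to that value, which is why D is the keyed answer.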
	If a test's validity coefficient were -1.0, the standard error of estimate would be equal to a. 0.0 b. 1.0 c. the standard deviation of criterion scores d. cannot be determined	A. This makes sense. If a test has perfect validity (a validity coefficient of 1.0 or -1.0), there is no error of estimate, or no error when the test is used to predict scores on a criterion measure. The answer can also be derived through the formula for the standard error of estimate. Using this formula, you can see that a validity coefficient of -1.0 will always result in a standard error of estimate of 0.
	9. Criterion contamination has the effect of a. increasing the validity coefficient b. decreasing the validity coefficient c. increasing examinees' criterion scores d. decreasing examinees' criterion scores	A. Criterion contamination occurs when raters assigning criterion scores have knowledge of the ratees' predictor scores, and their knowledge affects scores on the criterion. If a supervisor knows that an employee got a low score on a predictor, he might rate the employee lower on the criterion than he normally would have. This results in an artificially high consistency between predictor and criterion scores and inflates the validity coefficient.
	10. Following a principal components analysis of a set of variables, four eigenvectors, symbolized in order as v1, v2, v3, and v4, are derived. Which of the following statements is true? a. v1 will account for more variance in the variables than any of the other eigenvectors. b. v1 will account for more variance in the variables than all the other eigenvectors combined c. v1, v2, v3, and v4 will each account for the same amount of variance in the variables. d. v4 will account for more variance in the variables than any other eigenvector	A. In a principal components analysis, eigenvectors (which are also called factors or principal components) represent underlying traits or constructs that are being measured by some or all of the variables being analyzed. In principal components analysis, the first factor accounts for a higher percentage of variance than any of the other factors. This just means that the variables in the analysis measure the first factor more than they measure any of the other factors.
	By definition, an oblique rotation produces ________ factors.	Correlated
	If you believe that the traits represented by the factors are correlated, it makes sense to use an ________.	oblique rotation
	If you believe that few factors or traits are ever uncorrelated, you might argue that _______ ______ should always be used.	oblique rotations
	A ________ _______ is an examinee who is identified by a predictor as not meeting a criterion but, in reality, does meet it.	false negative
	Usually, when a test is developed, a large pool of _____ is written, and _____ is used to determine which items will be retained for the final version of the test.	items; item analysis
	A test item's difficulty level (p) is equal to the ________.	percentage of examinees who answer the item correctly.
	On most tests, the optimal average difficulty level is ______; this level is associated with maximum ____ and ______.	.50; reliability; differentiation or discriminability (score variability would also be correct)
	However, the optimal difficulty level depends on _______. For example, if the test is designed to select only a few highly qualified individuals, one should set the average p value at a relatively (high/low) level. It's important to remember that the higher the p value, the (more difficult/less difficult) the item.	purpose of testing (or the probability that items can be guessed); low; less difficult
	A test item's discrimination refers to the degree to which the item ______.	differentiates among examinees in terms of what the test measures
	One way to assess an item's discrimination is to correlate each item with either ______ or _______.	the total test score; an external criterion
	An item's discrimination index (D) is equal to the percentage of _____ (U) minus the percentage of ________ (L); a value of ______ represents maximum discriminability.	examinees in the high-scoring group; examinees in the low-scoring group; 100 or -100
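The discrimination index in the card above is a plain subtraction. A minimal sketch, assuming U and L are the percentages of the high-scoring and low-scoring groups who answer the item correctly; the function name and the sample percentages are hypothetical:

```python
def discrimination_index(pct_upper_correct: float, pct_lower_correct: float) -> float:
    """D = U - L, with both inputs expressed as percentages (0 to 100)."""
    return pct_upper_correct - pct_lower_correct

# Hypothetical items, for illustration only.
d_typical = discrimination_index(80, 30)   # high scorers do much better
d_none = discrimination_index(60, 60)      # item does not discriminate
d_max = discrimination_index(100, 0)       # largest possible positive value

print(d_typical, d_none, d_max)  # 50 0 100
```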
	Higher levels of discrimination are associated with _____ levels of difficulty.	moderate
	The item difficulty level associated with the maximum level of differentiation among examinees is a. .10 b. .50 c. .75 d. 1.0	B. An item is most likely to differentiate among examinees (e.g., between high and low scorers) when half the examinees answer it correctly and half answer it incorrectly. This corresponds to an item difficulty level (p) of .50.
	The optimal average item difficulty level for a true-false test would be a. .10 b. .50 c. .75 d. 1.0	C. For most tests, the optimal item difficulty level is .50. However, the optimal difficulty level is affected by the probability that examinees can select the correct answer by chance alone. When considering the effects of chance, the rule of thumb is that the average difficulty level of test items should be about halfway between 1.0 and the level of success expected by chance alone. On a true-false test, the probability of getting an item correct by chance alone is 50%. Therefore the optimal item difficulty level would be midway between 50% and 100%, or 75%.
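The rule of thumb in the card above (halfway between chance success and 1.0) can be expressed as a one-line function. A minimal sketch; the function name is hypothetical, and the four-option multiple-choice case is an added illustration, not from the flashcards:

```python
def optimal_difficulty(chance_success: float) -> float:
    """Midpoint between the chance success rate and 1.0 (rule of thumb)."""
    return (1.0 + chance_success) / 2.0

true_false = optimal_difficulty(0.50)    # chance of guessing right is 50%
four_option = optimal_difficulty(0.25)   # illustrative: 4-choice item
no_guessing = optimal_difficulty(0.0)    # e.g. an item that cannot be guessed

print(true_false, four_option, no_guessing)  # 0.75 0.625 0.5
```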
	A test item's difficulty level is most affected by a. the test's length b. the test's validity c. the nature of the testing process d. the characteristics of the individuals taking the test.	D. A test item's difficulty is measured in terms of the percentage of examinees who answer the item correctly. Therefore, the characteristics of the individuals taking the test will influence the observed difficulty level. For example, if all examinees taking an intelligence test are highly gifted, the difficulty index (p) for items will be inflated. That is, test items will be estimated to be easier than they actually are.
	Which of the following statements is least true of item response theory? a. It is based on the notion that items analyzed measure a latent trait such as cognitive ability. b. It allows for the ability levels of different groups of people to be compared, even if the groups are tested using a different set of items. c. It applies best to large samples d. It is based on the notion that the characteristics of an item will be different depending on the characteristics of the sample of individuals tested.	D. One assumption of item response theory is that item parameters (characteristics of items such as difficulty level and discrimination) will be the same regardless of the sample of individuals taking the test. The other statements are true of item response theory.
	An examinee's _______ test score is not that meaningful unless a frame of reference is provided for score interpretation.	raw
	Two types of scores which provide this frame of reference are _______ scores, which provide a comparison of an examinee's score to that of others who have taken the same test, and ______ scores, which provide a comparison to an external, pre-established standard of performance.	norm-referenced; criterion-referenced
	2. Norm-referenced scores include ______ scores, which indicate how far along the normal path of development an examinee is.	developmental
	A(n) _____ IQ score is an example of such a score.	ratio
	They also include within-group norms such as _______, which indicate the percentage of scores that fall below a given raw score, and _______, which indicate where a given score stands, in standard deviation units, in relation to the mean.	percentile ranks; standard scores
	There are a number of different types of ______, including z-scores, _______, _______, and ______, all of which provide essentially the same information.	standard scores; T-scores; stanines; deviation IQ scores
	1. Percentile ranks and T-scores have which of the following in common? a. They are both standard scores b. They are both norm-referenced scores. c. They are both developmental scores. d. They are both criterion-referenced scores.	B. Percentile ranks and T-scores are both norm-referenced scores; that is, they are both interpreted in terms of a comparison to the scores of those in a normative group. A T-score (but not a percentile rank) is also a standard score, which is a norm-referenced score that is interpreted in terms of distance, in standard deviation units, from the mean of a normative group.
	You work as an assistant for a psychology professor at a university and have administered and scored a mid-term exam for him. You report students' scores on the exam as z-scores. The prof tells you that "This makes no sense; I'm used to the MMPI and I can only understand T-scores." In this case, you should explain to the prof that a. The test will have to be re-administered so that T-scores can be obtained. b. converting T-scores to z-scores will require complex calculational procedures c. z-scores and T-scores yield essentially the same information d. Anyone who can't understand z-scores......	C. T-scores and z-scores are both standard scores, which means they are both interpreted in terms of distance from the mean in standard deviation units. For example, a z-score of 1.0 and a T-score of 60 are equivalent - they both indicate that the score is one standard deviation unit above the mean.
	3. The formula "(X - M)/s.d." is the formula for a a. standard score b. percentile rank c. criterion-referenced score d. T-score	A. This is the formula for a z-score, which is a type of standard score.
	4. The advantage of a deviation IQ score, as compared to a ratio IQ score, is that it a. provides an index of an examinee's absolute level of intelligence b. indicates an examinee's mental age c. allows scores of individuals who are the same age to be compared d. allows score comparisons to be made across age levels	D. A deviation IQ score is a standard score, which means that it tells you how many standard deviation units an examinee's score falls above or below the mean. An advantage of a standard score is that scores of individuals from different populations (and on different tests) can be compared. A 9-year-old's deviation IQ score can be meaningfully compared to that of a 30-year-old.
	1. Decreasing a test's inter-item consistency makes a test a. less valid b. less reliable c. more valid d. more reliable	B. One measure of reliability of a test is how homogeneous or internally consistent items are (as measured by coefficient alpha or the Kuder-Richardson Formula). Therefore, decreasing inter-item consistency makes a test less reliable.
	4. In the validation of a selection test for graduate school entrance, the highest validity would be shown if scores on the test were correlated with actual school grades of a. the lowest scorers b. only the middle range of scorers c. all those admitted d. only the highest scorers	C. Any correlation coefficient, including a validity coefficient, will be lowered when there is a restriction in the range of scores in one or both variables. Choice C would provide the widest range of scores and give the highest validity coefficient.
	5. Which of the following statements is true about the relationship between reliability and validity? a. A valid test will always be reliable b. A reliable test will always be valid c. The validity coefficient sets a ceiling on the reliability coefficient d. The validity coefficient is equal to the square root of the reliability coefficient	A. Only a is true - for a test to be valid, it must be reliable; however the opposite is not true - a test can be reliable without being valid. The reliability coefficient sets a ceiling on the validity coefficient. The upper limit of the validity coefficient is equal to the square root of the reliability coefficient.
	6. If you want to determine the degree to which an obtained test score is likely to deviate from the true test score, you would use a. the standard error of estimate b. the standard error of measurement c. the standard error of the mean d. the standard error of judgment	B. According to classical test theory, any given score on a test reflects both true score variance (the actual characteristic being measured by the test) and random error. In other words, an obtained test score is likely to differ from the true test score to a degree that depends on how much error the test contains. The standard error of measurement is used to construct a range in which an examinee's true test score is likely to fall, given her obtained test score.
	7. If a validity coefficient is equal to 0.0, the standard error of estimate will be equal to a. 0.0 b. the validity coefficient c. the standard deviation of predictor test scores d. the standard deviation of criterion scores	D. You can answer this question by using the formula for the standard error of estimate. Using this formula, you can see that if the validity coefficient is 0, the standard error of estimate comes out to be equal to the standard deviation of criterion scores.
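The card above points to the formula for the standard error of estimate without writing it down. A minimal sketch, assuming the standard formula SEE = SD_y * sqrt(1 - r_xy^2), where SD_y is the standard deviation of criterion scores; the function name and the SD of 10 are illustrative:

```python
import math

def standard_error_of_estimate(sd_criterion: float, validity: float) -> float:
    """SEE = SD_y * sqrt(1 - r_xy**2) (assumed standard formula)."""
    if not -1.0 <= validity <= 1.0:
        raise ValueError("validity coefficient must be between -1 and 1")
    return sd_criterion * math.sqrt(1.0 - validity ** 2)

# Hypothetical criterion SD of 10.
see_no_validity = standard_error_of_estimate(10, 0.0)    # equals SD_y
see_perfect_neg = standard_error_of_estimate(10, -1.0)   # no prediction error

print(see_no_validity, see_perfect_neg)  # 10.0 0.0
```

Because the coefficient is squared, a validity of -1.0 behaves exactly like +1.0 and gives a standard error of estimate of 0.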
	8. Which of the following is the highest score? a. z-score = 2.0 b. T-score = 75 c. WAIS score = 120 d. stanine score = 7	B. To answer this question, you need to convert each choice to a common metric. Since all choices are standard scores, they can easily be converted to z-scores, which directly indicate how many standard deviation units above or below the mean the score falls. Since T-scores have a mean of 50 and a standard deviation of 10, a T-score of 75 is equivalent to a z-score of 2.5. WAIS scores have a mean of 100 and a standard deviation of 15; therefore a score of 120 is equivalent to a z-score of just over 1.0 (1.33 to be exact). And a stanine has a mean of 5 and a standard deviation of about 2; thus a stanine of 7 is equivalent to a z-score of about 1.0. In other words, a T-score of 75 represents the highest score.
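The conversion to a common metric described in the card above can be sketched as follows. The means and standard deviations are the ones given in the card (the stanine SD of 2 is approximate); the function and variable names are hypothetical:

```python
def to_z(score: float, mean: float, sd: float) -> float:
    """Convert a standard score to a z-score: (X - M) / SD."""
    return (score - mean) / sd

# Each answer choice expressed as a z-score.
choices = {
    "z-score = 2.0": to_z(2.0, mean=0, sd=1),
    "T-score = 75": to_z(75, mean=50, sd=10),
    "WAIS score = 120": to_z(120, mean=100, sd=15),
    "stanine = 7": to_z(7, mean=5, sd=2),  # SD of about 2 (approximation)
}

highest = max(choices, key=choices.get)
for label, z in choices.items():
    print(label, round(z, 2))
print("highest:", highest)
```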
	9. If, as a test developer, you wish to maximize a test's reliability coefficient, you would set the test's average item difficulty level at _____. On the other hand, if you were interested in developing a test to select only very highly qualified individuals for a job, the optimal average item difficulty level would be around _____. a. .50; .15 b. .50; .80 c. .25; .80 d. .75; .25	A. If the average item difficulty level is moderate (around .50, which means that, on average, half the examinees answer the items correctly), score variability and therefore reliability will be increased. However, if the goal of testing is to have only highly qualified individuals pass, one would want to make the questions fairly difficult. An item difficulty level of .15 means that only 15% of the examinees answer items correctly. This would be an appropriate average difficulty level if one were attempting to select only very qualified individuals.
	10. A multi-trait-multimethod matrix would most likely be used to assess a. concurrent and predictive validity b. content validity c. face validity d. discriminant and convergent validity	D. Constructing a multitrait-multimethod matrix allows one to assess the convergent and discriminant validity of two or more tests. Convergent and discriminant validity are types of construct validity.
	11. A test developer conducts a thorough job analysis as part of the process of developing a work sample test that will be used as a selection tool. The job analysis reflects the developer's concern with a. content validity b. criterion-related validity c. construct validity d. face validity	A. A job analysis is conducted to determine what tasks a job consists of and what types of skills are needed to perform the job well. Job analyses are sometimes used as part of the process of constructing work sample tests, which must have good content validity, i.e., they must provide a representative sample of the tasks involved on a job.
	The difference between coefficient alpha and the Kuder-Richardson Formula 20 is that a. coefficient alpha is an internal consistency reliability coefficient b. KR-20 provides an index of inter-item consistency c. KR-20 is used when items are scored dichotomously d. coefficient alpha is the probability of making a Type I error	C. Both KR-20 and coefficient alpha are internal consistency reliability coefficients, and both provide an index of a test's average degree of inter-item consistency. KR-20 is used when the test is dichotomously scored (e.g., yes/no, right/wrong), whereas coefficient alpha is used when the test is not dichotomously scored.
	13. The test-retest reliability coefficient would be inappropriate for all of the following, except a. a test of day-to-day fluctuations in mood b. a short math test on which examinees are able to improve through practice c. a test designed to screen for Brief Psychotic Disorder d. a speeded test	D. A test-retest coefficient should not be used to assess the reliability of tests that measure unstable traits, such as mood or Brief Psychotic Disorder. It should also not be used for tests on which scores are affected by practice or repeated administration. It can, however, be used for a speed test.
	14. One hundred job applicants are given a job selection test. Of the 60 people who are selected for the job on the basis of their results on this test, 10 turn out to be unsuccessful on the job. Of the 40 who are not hired on the basis of their test results, 15 would have been successful if they had been hired. In this case, there were how many false positives? a. 10 b. 15 c. 25 d. 50	A. In this situation, a false positive is someone who succeeds on the predictor but does not turn out to be successful on the job. There are 10 such individuals.
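The counts in the card above determine all four cells of the selection outcome table, not just the false positives. A minimal sketch that derives them from the card's numbers (variable names are hypothetical):

```python
# Numbers from the card: 100 applicants, 60 selected, 40 not selected.
selected = 60
not_selected = 40
selected_but_unsuccessful = 10       # passed the test, failed on the job
rejected_but_would_succeed = 15      # failed the test, would have succeeded

false_positives = selected_but_unsuccessful
true_positives = selected - selected_but_unsuccessful
false_negatives = rejected_but_would_succeed
true_negatives = not_selected - rejected_but_would_succeed

print(true_positives, false_positives, true_negatives, false_negatives)
# 50 10 25 15
```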
	15. Upon cross-validation of a selection test, shrinkage usually occurs because a. the test developer does not include enough items in the original item pool b. the chance factors present in the original validation sample are not likely to be present in the cross-validation sample c. the characteristics of individuals in the original validation sample are too heterogeneous d. validity coefficients always shrink in the wash, esp when they are made of cotton	B. When a predictor is validated on a second sample (i.e., upon cross-validation), the validity coefficient obtained will likely be lower than the one originally obtained on the first sample. This is known as shrinkage. Shrinkage occurs because chance factors that increase the validity coefficient in the first sample are not present in the second sample.
	16. To ensure adequate interscorer reliability, it is important to use bx observation scales that a. have overlapping rating categories b. give raters flexibility in scoring behaviors c. are very valid for predicting future behaviors d. have mutually exclusive and exhaustive rating categories	D. To ensure interscorer reliability, it is best to have mutually exclusive and exhaustive rating categories. This means that all behaviors fit into one and only one category.
	The technique that would be most helpful in identifying a typology of substance abusers is a. discriminant function analysis b. factor analysis c. cluster analysis d. cross-validation	C. Cluster analysis is a technique used to develop a taxonomy, or classification system, of objects or people. For instance, if you wanted to identify subtypes of substance abusers, you could use a cluster analysis to do so.
	18. In principal axis factor analysis, an eigenvalue indicates a. the correlation between a test and a factor b. the correlation between two factors identified by the analysis c. the total amount of variability in a test accounted for by all the factors in the analysis d. the total amount of variability in all the tests accounted for by an unrotated factor.	D. In factor analysis, an eigenvalue is a statistic that measures the explanatory power of a factor. More technically, it indicates how much variability is accounted for by an individual factor. Choice A describes a factor loading; choice C describes communality; and there isn't a particular name for what is described by choice B. Don't be thrown by the term "principal axis factor analysis" - on the exam, you won't have to distinguish between different types of factor analysis.
	The central limit theorem predicts that a sampling distribution of means will increasingly approach a normal shape: a. regardless of the shape of the population distribution as the sample size increases b. regardless of the shape of the population distribution as the number of samples increases c. only when the population distribution does not deviate from the normal d. only when the sample distributions do not deviate significantly from the normal	a. regardless of the shape of the population distribution as the sample size increases
	The probability of making a Type 1 error is increased by: a. conducting a single multivariate test rather than several univariate tests b. changing the level of significance from .01 to .05 c. changing beta from .01 to .05 d. conducting a two-tailed rather than a one-tailed test	b. changing the level of significance from .01 to .05
	An investigator wants to test the hypothesis that the average number of aggressive acts that children exhibit in an unfamiliar situation is related to gender and sociability. He obtains a sample of 30 boys and 30 girls who have been rated as either sociable or shy and then has observers count the number of aggressive acts each child exhibits in an unfamiliar situation during a 30 minute play period. The best stat test to analyze the data the investigator collects in this study is which of the following: a. t-test for independent samples b. chi-square test c. one-way ANOVA d. two-way ANOVA	d. two-way ANOVA - two independent variables (sociability and gender) and a single dependent variable that is measured on a ratio scale (number of aggressive acts).
	Use of the chi-square test to analyze data collected from a study making use of a single-group time-series design: a. is contraindicated because the test's statistical power will be reduced b. is contraindicated because the test's assumption of independence has been violated c. is contraindicated because the test's assumption of homogeneity has been violated d.	b. is contraindicated because the test's assumption of independence has been violated
	The split-plot ANOVA is appropriate when: a. a study has two or more IVs but not all subjects receive all combinations of the IVs b. a study has at least one between-groups IV and one within-subjects IV c. one of the IVs included in a study is an extraneous variable	b. a study has at least one between-groups IV and one within-subjects IV
	Trend analysis a. cannot be used if the IV is measured on an interval or ratio scale b. cannot be used when intervals between adjacent points are unequal c. can be used whether the relationship between variables is linear or nonlinear d. is useful when the IVs are highly correlated	c. can be used whether the relationship between variables is linear or nonlinear - trend analysis is a type of inferential statistic that enables a researcher to determine if a quantitative independent variable has a linear, quadratic, cubic, or quartic effect on the dependent variable
	The appropriate correlation coefficient when the variables of interest are both measured on a continuous scale and have a U-shaped relationship is: a. eta b. biserial c. phi d. Spearman	a. eta - appropriate correlation coefficient when the relationship between two variables is nonlinear
	A linear structural relations analysis (LISREL) is useful for: a. developing a causal model for the relationships among a set of variables b. testing the veracity of a causal model for the relationships among a set of observed variables c. testing the veracity of a causal model for the relationships among a set of observed and latent variables d. identifying the linear combination of two sets of variables that produces the highest correlation coefficient	c. testing the veracity of a causal model for the relationships among a set of observed and latent variables - LISREL, like path analysis, is a structural equation modeling technique that is used to test causal hypotheses or models about the relationships among a set of variables. In contrast to path analysis, LISREL takes into account both observed (measured) variables and the constructs (latent variables) they are believed to measure
	Counterbalancing is useful for controlling a. demand characteristics b. selection biases c. history effects d. order effects	d. order effects - counterbalancing is used in repeated measures designs in which each subject will receive all levels of the independent variables. It helps control order effects (multiple tx interference) by administering the levels in different orders for different subjects
	When using a single-group pretest/posttest design, reducing the interval of time between administration of the pre- and posttest will most likely reduce which of the following threats to internal validity: a. hx b. selection c. maturation d. testing	c. maturation - maturation refers to changes that occur within subjects simply as the result of the passage of time. Reducing the length of time between pretesting and posttesting helps reduce its effects
	When using an ABAB design, you are: a. administering two diff tx at two diff times b. administering one treatment at two diff times c. administering one tx to two diff groups d. administering one tx to two diff tx	b. administering one tx at two diff times. The ABAB design has two no-treatment (A) phases and two tx (B) phases. The same treatment is administered during the B phases.
	You are designing a study to investigate high-risk sexual behavior among urban adolescents. When selecting your sample, you make sure it includes proportions of whites, Hispanics, and AA comparable to their population proportions. This will help ensure that your study has adequate: a. incremental validity b. discriminant validity c. internal validity d. external validity	d. external validity
	School consultation would most likely be targeted at a. the school superintendent b. students c. teachers d. psychologists working with children	c. teachers - more effective when targeted at individuals who work with students rather than the students themselves.... parents too.
	The case of Larry P. v. Riles dealt with a. the use of aptitude tests in the classroom placement of minority children b. the rights of parents to access school records c. the rights of handicapped children to have an individual education plan (IEP) constructed d. the privacy rights of students	a. parents claimed racial bias... the California court banned the use of IQ tests as a criterion for placement of children in EMR classes
	In many schools, children are placed into classes with other children with a similar ability level. Research suggests that this practice a. is associated with positive effects on learning because it allows children to proceed at their own pace b. is associated with positive psych effects on children's learning because it increases their sense of identity c. is, for low-achieving children, associated with negative effects on academic achievement as well as negative effects on self-esteem and motivation	d. this practice, known as ability tracking, has negative psych and academic effects on low-achieving children, as well as on moderate-achieving children
	While males and females are generally equal in terms of intelligence, one area in which they have consistently been found to differ is that a. male verbal skills are better b. female verbal skills are better c. female verbal SAT scores are much higher d. male mathematical skills are better	b. female verbal skills are better
	Research investigating the relationship between the race of the examiner and AA children's scores on the WISC has shown that a. AA children do better if the examiner is also AA b. AA children do better if the examiner is Caucasian c. AA children do better if the examiner is also AA d. the race of the examiner is not related to the child's performance	d. the race of the examiner is not related to the child's performance
	It would be least appropriate to administer the performance subtests only of the WISC-III to a. reading-disabled children b. suburban middle-class children c. immigrant, non-English-speaking children	b. suburban middle-class children
	Which of the following WAIS subtests is the best measure of short-term memory? a. picture arrangement b. picture completion c. block design d. digit span	d. digit span
	If you needed an approximation of the level of cognitive functioning of a 12yo girl who was hearing impaired, an appropriate test to use for the initial screening would be the a. Stanford-Binet b. Peabody Picture Vocabulary Test c. Peabody Individual Achievement Test d. Leiter International Performance Scale	d. Leiter International Performance Scale
	The correlation between a child's and her parent's IQ is approx: a. .20 b. .30 c. .50 d. .70	c. .50
	You wish to assess the general cognitive reasoning ability of a newly arrived immigrant who has no English lang skills. The most appropriate test to use for this person would be the a. Otis-Lennon Mental Ability Test b. Slosson Intelligence Test c. Merrill-Palmer Scale d. Raven's Progressive Matrices	d. Raven's Progressive Matrices
	Terman is best known for a. developing culture-fair intelligence tests b. adapting the binet scales for use with English-speaking children c. deriving stats to meas individual diff	b. adapting the Binet intelligence scale for use with English-speaking children
	Research with elementary school teachers suggests that, on the average, a. they tend to pay more attention to boys than girls b. they tend to pay more attention to girls than boys c. male teachers tend to pay more attention to boys, while female teachers tend to pay more attention to girls d. they pay an equal amt of attention to girls and boys	a. they tend to pay more attention to boys than girls
	Research investigating the effectiveness of cooperative learning programs suggests that such programs are a. effective for low-achievers but not high achievers b. effective only in racially homogenous classrooms c. effective for white but not AA children d. effective	d. effective, regardless of the characteristics of classrooms or students
	Which of the following is true of rapists? a. Rapists typically achieve multiple orgasms during the rape b. rapists are typically sexually dysfunctional during the rape c. rapists use aggressive methods to satisfy sexual needs d. rape victims often report that rapists are strange and peculiar looking	b. rapists are typically sexually dysfunctional during the rape
	Which of the following statements is most true of long-term unemployment? a. it is assoc with a variety of physical and psychological probs b. it is assoc with a variety of physical and psycho probs for men but not for women d. it has been shown to improve psycho functioning in some cases, because economic status is actually negatively corr with happiness and well being	a. it is assoc with a variety of physical and psycho probs
	A client who has been depressed for a long time appears to be improving. He mentions to you that he has thought about suicide. You should a. have him involuntarily hospitalized b. make a no suicide contract with him c. find out whether he has a specific suicide plan	c. find out whether he has a specific plan
	Which of the following is the greatest indicator of suicide risk? a. depression b. hopelessness c. ambivalence d. psychosis	b. hopelessness
	You develop a program for pre-school children who are at risk for learning disabilities. This is an example of a. primary prevention b. secondary prevention c. tertiary prevention d. advocacy consultation	a. primary prevention - administered before the onset of a prob and designed to prevent development.
	You set up a program for inmates who are leaving prison, in order to lower their recidivism rates. This is an example of a. primary prevention b. secondary prevention c. tertiary prevention d. advocacy consultation	c. tertiary prevention - occurs after a problem has run its course - to prevent relapse
	A teenage girl has made some half-hearted suicide attempts. Of the following, which intervention is most indicated for her? a. psychotropic med b. discussing healthier ways to get attention c. interpretation of her wish to die as aggression turned inward d. involuntary hospitalization	b. discussing healthier ways for her to get attention
	Which of the following is the best example of consultee-centered case consultation? a. a therapist seeks consultation bec he needs some advice regarding what specific techniques will work best with a severely depressed patient he is tx b. a therapist seeks consultation to help prevent his own depression from impairing his effectiveness with patients c. a psych seeks consultation for advice on how to obtain funding d. the parents of a group of learning disabled children seek consultation for some help obtaining govern support....	b. a therapist seeks consultation to help prevent his own depression from impairing his effectiveness with patients
	Which of the following is the best example of advocacy consultation? a. a therapist seeks consultation bec he needs some advice regarding tx b. a therapist seeks consultation to help prevent his own depression from impairing... c. a psych seeks consultation for advice on how to obtain funding d. the parents of a group of learning disabled children seek consultation for some help obtaining govern support for special services for learning disabled children in the school.	d.
	The phenomenon of "theme interference" in an organization is most analogous to the phenomenon of: a. resistance b. catharsis c. premature termination d. transference	d. transference - Gerald Caplan uses the term to refer to a type of transference that may be a focus of consultation in organizations. It occurs when a consultee's unresolved conflict, related to life experience or fantasy, affects his or her perception or handling of a work-related problem
	When they seek psychotherapy, many physically abused women report that they do not want to leave their husbands. A common reason for this is that a. the woman has an adequate support system b. the woman and her husband are in a honeymoon stage in which the husband apologizes and promises to change d. the woman has a self-defeating personality disorder	b. honeymoon stage
	In most states, the legal criteria used in determining whether or not a person can be involuntarily hospitalized have to do with a. whether or not the person has a mental disorder b. the person's ability to regulate his or her own life d. danger to self or others	d. danger to self or others
	The duration of post-traumatic amnesia: a. is unrelated to the severity of the injury b. is useful as an indicator of severity only when combined with the degree of retrograde amnesia c. is less accurate as an indicator of severity than the degree of retrograde amnesia d. is more accurate as an indicator of severity than the degree of retrograde amnesia	d. is more accurate as an indicator of severity than the degree of retrograde amnesia
	Disturbances in the ability to respond to stimulation on one side of the body is suggestive of damage to the a. right parietal lobe b. right frontal lobe c. anterior cerebellum d. posterior pons	a. right parietal lobe
	A psychologist wants to determine the relationship between number of years as a substance abuser and number of weeks in an inpatient tx facility. The psych will use which of the following: a. MANOVA b. ANOVA c. Pearson Product Moment Correlation Coefficient d. Kendall's Coefficient of Concordance	c. Pearson Product Moment Corr Coefficient
	The psychologist in the above study plans to use the info she has collected to predict the number of aftercare sessions patients will require. The appropriate technique in this situation is: a. regression analysis b. multiple regression analysis c. discriminant analysis d. canonical correlation	b. multiple regression analysis
	In primary school, ability tracking: a. has beneficial effects on the achievement of high, mod, and low achieving students b. has no effect c. has detrimental effects d. has different effects on achievement for students of different ability levels	d. has diff effects on achievement for students of diff ability levels
	Research investigating father-child attachment suggests that it depends most on	play activities
	Characteristics such as level of emotionality, activity, and sociability: a. are evident in newborns and remain relatively stable in later years b. are evident in newborns but are not predictive of future bx c. begin to appear as traits after the first year of life d. do not become stable until the preschool years	a. are evident in newborns and remain relatively stable in later years
	Hans Eysenck's article about the effectiveness of psychotherapy reported little benefit of psych beyond what would be expected from spontaneous remission. Subsequent research has shown: a. a different pattern - there are benefits to psychotherapy	a. a different pattern - there are benefits to psychotherapy
	A depressed patient is concerned that taking an antidepressant will produce sedation and interfere with his ability to perform his job and cause him to put on unwanted weight. What to do?	SSRI
	According to Piaget, the source of motivation for cognitive development is: a. social acceptance b. parental influence c. equilibration d. the collective unconscious	c. equilibration
	A therapist in NY decides to use cuento therapy in her work with elementary school children whose parents are Puerto Rican immigrants. She collects several Puerto Rican folktales and modifies them to make them apply to the inner city environ... this practice is: a. suspect because b. a form of American ethnocentrism c. effective based on the research	c. effective based on the research
	Damage to the hippocampus is most likely to interfere with the ability to: a. recall remote events b. recall events stored in recent long term memory c. transfer short-term memory to long-term memory d. retrieve implicit memories	c. transfer short-term memory to long-term memory
	The left hemisphere of the cerebral cortex is dominant for speech and language functions: a. for most left-handers but few right-handers b. for nearly all left-handers and many right handers c. for nearly all right handers and the majority of left handers d. for nearly all right handers but a small minority of left handers	c. for nearly all right handers and the majority of left handers
	The symptoms of numbness, weakness, tremor, and ataxia that characterize multiple sclerosis are due to a. lesions in the basal ganglia b. demyelination c. loss of ACh receptors d. cerebellar atrophy	b. demyelination
	In recent years, psychologists have attempted to become more sensitive to the uniqueness of each culture. This is most related to a. emic approach b. etic approach c. etic-emic synthesis d. neither	a. emic approach
	Cleo and Cleopatra obtain percentile ranks, respectively, of 48 and 92 on a math test. If four points are subtracted from each of their raw scores (due to scoring error) but not from the scores of the other examinees, you would expect: a. Cleo's percentile rank will decrease more than Cleopatra's b. Cleo's percentile rank will decrease less than Cleopatra's c. Cleo and Cleopatra's percentile ranks will decrease by the same amount d. Cleo and Cleopatra's percentile ranks will not change	a. Cleo's percentile rank will decrease more than Cleopatra's - in a percentile rank distribution, scores are evenly distributed throughout the distribution. Consequently, when converting raw scores to percentile ranks, small differences in the middle of the raw score distribution look larger in terms of percentile ranks than the same differences at the extremes of the distribution
	A test is likely to have adequate "floor" when it: a. provides only two response options b. provides four response options c. contains a sufficient number of easy items d. contains a sufficient number of difficult items	c. contains a sufficient number of easy items - needs to discriminate between people at the low end of the distribution.
	Nathan obtained a grade equivalent score of 7.2 on Test A and a grade equivalent score of 6.2 on Test B. If Nathan's percentile rank on Test A is 65, you can conclude that his percentile rank on Test B: a. is equal to 55 b. is less than 65 c. is less than 55 d. cannot be determined	d. cannot be determined
	A psychologist is hired by a large org to conduct a survey to determine people's feelings about its products, proposed changes to some of its products, and possible new marketing strategies. The psychologist decides to begin by mailing a survey to a random sample of consumers. What is the biggest threat to the internal validity of the study's results?	internal validity refers to the ability to derive valid, or accurate, info from a study. Common sense suggests that the biggest prob would be knowing whether or not the people who respond to a survey differ in any imp way from people who don't respond.
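The Cleo and Cleopatra card above (percentile ranks of 48 and 92, four raw points lost) can be checked numerically. This is only a rough sketch: it assumes the raw scores are roughly normal with a mean of 50 and an SD of 10, neither of which is given on the card, and the function name `percentile_rank_drop` is made up for illustration.

```python
from statistics import NormalDist

# Assumption (not from the card): raw scores are roughly normal, mean 50, SD 10.
# The other examinees' scores stay fixed, so the distribution itself does not move.
scores = NormalDist(mu=50, sigma=10)

def percentile_rank_drop(start_rank, points_lost):
    """Return how many percentile-rank points are lost when a raw score falls."""
    raw = scores.inv_cdf(start_rank / 100)          # raw score at the starting rank
    new_rank = scores.cdf(raw - points_lost) * 100  # rank after the correction
    return start_rank - new_rank

cleo_drop = percentile_rank_drop(48, 4)       # starts near the middle
cleopatra_drop = percentile_rank_drop(92, 4)  # starts near the upper tail
print(f"Cleo drops {cleo_drop:.1f} points, Cleopatra drops {cleopatra_drop:.1f} points")
```

Under these assumptions the same four raw points cost more percentile-rank points near the middle of the distribution than near the tail, which matches answer a.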

322 Cards in this Set