74 Cards in this Set

  • Front
  • Back
Internal Validity
Permits conclusion that there is a relationship between IV and DV
Threats to Internal Validity
History (external events)
Previous testing
Regression toward the mean
Experimenter Expectancy
Best way to increase internal validity
Random assignment
External Validity
Permits generalizability of results
Threats to External Validity
Interaction between selection and treatment (rx effect doesn't generalize beyond the selected sample)
Interaction between testing and rx (rx only works if there's a pretest)
Hawthorne Effect (tendency of Ss to beh differently when being observed)
Order effects (repeated measures studies)
Types of Designs
True experimental (Ss randomly assigned to IV)
Quasi-experimental (Ss not randomly assigned)
Correlational (Vs not manipulated and no causal relationship assumed)
Developmental research (assessing Vs as a fx of dev over time, longitudinal)
Types of Designs cont.
Time Series (DV measured several times at regular intervals before and after rx is administered)
Single subject (ABA, ABAB)
Qualitative (descriptive)
Scales of measurement
Nominal (unordered categories; gender)
Ordinal (ordered, rank)
Interval (successive but no absolute 0; IQ)
Ratio (has absolute 0; weight, time)
Parametric Stats Assumptions
Normal distribution, homogeneity of variance, independence of observations
t-test
Parametric stat, comparison of 2 means
One-way ANOVA
Parametric stat, one DV and more than two groups, yields an F value, determining whether population means differed
Post hoc tests
Used in one-way ANOVA, ex. Tukey and Scheffe, pinpoints exact pattern of differences among means b/c F doesn't do this
Scheffe
Most conservative of post-hoc tests, minimizes Type I error but highest Type II error
Type I error
Finding a difference when there isn't one
Type II error
Finding no difference when there is one
Factorial ANOVA
2 or more IV and one DV
Can't interpret main effects when there's an interaction
MANOVA
Multiple DVs and at least one IV
Nonparametric Stats
For nominal or ordinal data, distribution free, less powerful, includes Chi Square, Mann Whitney U, Wilcoxon Matched Pairs, Kruskal Wallis
Chi Square
Used to analyze nominal data (compares observed frequencies of observations within nominal categories to freqs that would be expected under the null)
Cautions: all observations must be independent (no before/after study)
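The comparison of observed and expected frequencies can be sketched as below. This is a minimal illustration of the chi-square statistic only; the function name and the example counts are made up for the sketch, and looking up the p-value is left out.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected across nominal categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts: 60 people choose among 3 categories.
# Under the null, each category is expected to get 20.
observed = [30, 20, 10]
expected = [20, 20, 20]

print(chi_square(observed, expected))  # 10.0
```

A larger value means the observed counts depart further from what the null predicts.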
Mann Whitney
Compare two indep grps on a DV measured with rank ordered data
Alternative to t-test for independent samples if nonparametric data
Wilcoxon Matched Pairs
Compare 2 correlated grps on DV measured with rank ordered data
Alternative to t-test for correlated samples if non-parametric
Kruskal Wallis
Compares 2/more indep grps on a DV with rank ordered data
Alternative to one way ANOVA
Negative skew
Most scores are high (to the right), but a few extreme low scores.

Mean is lower than median, median lower than mode

Means easy test, ceiling effects
Positive skew
Most scores are low (to the left), but a few extreme high scores

Mean is higher than the median, median higher than the mode

Difficult test; floor effects
Variance
Average of sq differences of each observation from the mean
Standard Deviation
Sq Rt of the variance
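A short sketch of the two definitions, assuming the population form (dividing by N). Sample estimates usually divide by N - 1 instead. The function names and scores are illustrative only.

```python
import math

def variance(scores):
    """Average of squared differences of each observation from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores) / len(scores)

def standard_deviation(scores):
    """Square root of the variance."""
    return math.sqrt(variance(scores))

scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(scores))            # 4.0
print(standard_deviation(scores))  # 2.0
```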
Stanines
Divide distribution into 9 equal intervals, with 1 lowest and 9 highest
Standard Error of the Mean
Provides index of expected inaccuracy of sample mean
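One common way to compute this index is the sample SD divided by the square root of N. The sketch below assumes that formula; the function name and data are hypothetical.

```python
import math
from statistics import stdev

def standard_error_of_mean(scores):
    """Sample SD divided by sqrt(N); shrinks as the sample grows."""
    return stdev(scores) / math.sqrt(len(scores))

scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(standard_error_of_mean(scores), 3))
```

Because N is in the denominator, larger samples give a smaller standard error, so the sample mean is expected to be a more accurate estimate.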
Statistical Decision Making
Four Possibilities:
1) True null retained (correct, no difference between IV)
2) True null rejected (incorrect; Type I error; say difference when isn't)
3) False null rejected (correct; find difference that does occur)
4) False null retained (incorrect; Type II error; there is a difference)
One-tailed test
Predict the direction in which means differ
Two Tailed test
Don't predict direction of difference
Power
Probability of rejecting null H when it's false (probability of not making a Type II error)

Increased by larger N, 1-tailed test
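The probability of rejecting a false null (statistical power) can be illustrated with a one-sample z test. This is a simplified sketch: it assumes a known population SD, a standardized effect size in the predicted (positive) direction, and a hypothetical function name.

```python
import math
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05, tails=2):
    """Power of a one-sample z test for a standardized effect size."""
    std_normal = NormalDist()
    shift = effect_size * math.sqrt(n)  # how far the true mean sits from the null, in SE units
    if tails == 1:
        critical = std_normal.inv_cdf(1 - alpha)
        return 1 - std_normal.cdf(critical - shift)
    critical = std_normal.inv_cdf(1 - alpha / 2)
    # Two-tailed: rejection can happen in either tail
    return (1 - std_normal.cdf(critical - shift)) + std_normal.cdf(-critical - shift)

print(z_test_power(0.5, 25, tails=2))  # two-tailed, N = 25
print(z_test_power(0.5, 25, tails=1))  # one-tailed: higher power
print(z_test_power(0.5, 50, tails=2))  # larger N: higher power
```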
Pearson r
correlation between two continuous variables
Square Pearson r
Percentage of variability in one measure that is accounted for by variability in other measure (coefficient of determination)
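A minimal sketch of Pearson r and its square, written out by hand so each step is visible. The function name and the paired scores are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two continuous variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 2))       # about 0.77
print(round(r ** 2, 2))  # about 0.60: roughly 60% of variability accounted for
```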
Point-biserial coefficient
correlates one continuous V with one dichotomous V
Phi coefficient
correlates 2 dichotomous Vs
Spearman's rho
correlates 2 rank ordered Vs
Regression
When 2 Vs correlated, constructs equation to est the value of a criterion (outcome) V on the basis of scores on a predictor (input) V (in multiple regression, 2 or more Vs used to predict scores on one criterion)

Results in Multiple R
Multiple R
can be squared, called coefficient of multiple determination (proportion of variance in criterion V accounted for by combination of predictor Vs)
Stepwise regression
Goal is to come up with smallest set of predictors that maximizes predictive power
Canonical Correlation
Used to calculate relationship btwn 2/more predictors and 2/more criterion Vs
Discriminant Function Analysis
Used when goal is to classify individuals into groups based on their scores on multiple predictors
Partial Correlation
Used to assess relationship btwn 2 Vs with the effects of another V partialed out
Zero-order correlation
Correlation btwn 2 Vs determined without regard for any other Vs, converse of Partial Correlation
Structural Equation Modeling
Calculating pairwise correlations btwn multiple Vs, purpose is causal modeling, uses path analysis, LISREL
Normal curve
Could also be referred to as probability distribution
Z score
Obtained by subtracting sample mean score from an obtained score and dividing the result by the sample SD
Relationship of percentiles and z-scores
Constant difference between raw or z scores will be associated with variable differences in percentile scores, as a function of the distance of the two scores from the mean.

Closer to the mean, the same difference in z- or raw scores corresponds to a larger difference in percentiles
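The point can be checked numerically, assuming a normal distribution. The sketch compares the same half-SD gap at two places on the curve; the helper name is hypothetical.

```python
from statistics import NormalDist

def percentile(z):
    """Percentile rank of a z score under the standard normal curve."""
    return NormalDist().cdf(z) * 100

near_mean = percentile(0.5) - percentile(0.0)  # half an SD, near the mean
in_tail = percentile(2.5) - percentile(2.0)    # half an SD, far from the mean

print(round(near_mean, 1))  # about 19.1 percentile points
print(round(in_tail, 1))    # about 1.7 percentile points
```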
T scores
Mean = 50, SD = 10
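Converting a raw score to a z score and then to a T score can be sketched as follows. The mean of 100 and SD of 15 in the example are assumed values for illustration.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in SD units."""
    return (raw - mean) / sd

def t_score(z):
    """Rescale a z score to a distribution with mean 50 and SD 10."""
    return 50 + 10 * z

z = z_score(115, mean=100, sd=15)
print(z)           # 1.0
print(t_score(z))  # 60.0
```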
Extreme scores
Take care in interpreting: percentile is an extrapolation

Use estimated prevalence value to determine whether interpretation of extreme scores may be appropriate
Normalizing test scores
Best to add test content, rather than statistically transforming non-normal scores into a normal distribution
Reliability
Consistency of measurement of a given test; includes:
Internal consistency
Test-Retest reliability
Alternate form reliability
Interrater reliability

A reliability coefficient is interpreted directly as the percent of variance accounted for (don't square it)
Kuder-Richardson reliability coefficient
Measure of internal reliability of a test; for items with yes/no answers or heterogeneous tests where split half methods must be used
Cronbach's Alpha coefficient
Measure of internal consistency (average intercorrelation between test items), used for tests with items that yield more than two response types
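A rough sketch of alpha using the standard variance-based formula, k/(k-1) times (1 minus the sum of item variances over the variance of total scores). The data layout (one row per examinee) and the function name are assumptions for the example.

```python
from statistics import pvariance

def cronbach_alpha(rows):
    """rows: one list of item scores per examinee."""
    k = len(rows[0])                       # number of items
    items = list(zip(*rows))               # regroup scores by item
    item_variance_sum = sum(pvariance(item) for item in items)
    total_variance = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical 3-item scale; items move together, so alpha is high
rows = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5], [5, 5, 5]]
print(round(cronbach_alpha(rows), 2))
```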
Adequacy of reliability coefficients
.90+ very high
.80-.89 High
.70-.79 Adequate
.60-.69 Marginal
<.60 Low
Spearman Brown
Can be used to estimate the effects of lengthening or shortening a test on its reliability coefficient
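The Spearman-Brown prophecy formula is r_new = n * r / (1 + (n - 1) * r), where n is the factor by which test length changes. A small sketch, with a hypothetical function name and starting reliability:

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

print(round(spearman_brown(0.70, 2), 3))    # doubling the test: about 0.824
print(round(spearman_brown(0.70, 0.5), 3))  # halving the test: about 0.538
```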
Standard Error of Measurement
Index of error in measurement
Estimation of confidence interval around obtained score
Lower standard deviation and higher reliability, the lower the SEM
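The usual formula is SEM = SD * sqrt(1 - reliability). The sketch below applies it and builds an approximate 95% interval around an obtained score. The SD, reliability, and score are assumed values, and the function names are hypothetical.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM shrinks with a lower SD and a higher reliability."""
    return sd * math.sqrt(1 - reliability)

def confidence_interval(obtained, sd, reliability, z=1.96):
    """Approximate interval around an obtained score (z = 1.96 for 95%)."""
    margin = z * standard_error_of_measurement(sd, reliability)
    return obtained - margin, obtained + margin

print(round(standard_error_of_measurement(15, 0.91), 2))  # about 4.5
low, high = confidence_interval(100, 15, 0.91)
print(round(low, 1), round(high, 1))                       # about 91.2 and 108.8
```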
Standard Error of Estimate
Estimation of confidence interval around estimated true scores
Validity
Does test measure what it was intended to measure
Face validity
Extent to which test appears to measure what it is supposed to measure; could affect test taker motivation
Content related validity
Systematic evaluation of test by experts - relevance, representativeness
Convergent and Discriminant Validity (Construct)
Correlate with tests of similar and dissimilar constructs
Concurrent and Predictive Validity (Criterion)
Concurrent used to identify existing diagnoses or conditions
Predictive used to determine whether a test predicts future outcomes
Size of validity coefficient
Rarely exceeds .3 or .4
Sensitivity
Proportion of test-takers with positive attribute who are correctly identified by the test
Specificity
Proportion of test-takers with negative attribute who are correctly identified by the test
Positive Likelihood Ratio
Combines sensitivity and specificity into a single index of overall test accuracy indicating the odds that a positive test result has come from a positive examinee
Positive Predictive Power (PPP)
probability that an individual with a positive test result has the condition of interest
Negative Predictive Power (NPP)
Probability that an individual with a negative test result does not have the condition of interest
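These indices all come from the same 2x2 table of test result by true status. A minimal sketch, with made-up counts and a hypothetical function name:

```python
def diagnostic_indices(tp, fp, fn, tn):
    """tp/fn: examinees with the condition; fp/tn: examinees without it."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "positive_likelihood_ratio": sensitivity / (1 - specificity),
        "ppp": tp / (tp + fp),  # positive predictive power
        "npp": tn / (tn + fn),  # negative predictive power
    }

# Hypothetical sample: 50 with the condition, 150 without
for name, value in diagnostic_indices(tp=40, fp=20, fn=10, tn=130).items():
    print(name, round(value, 3))
```

PPP and NPP depend on how common the condition is in the sample, so the same test gives different predictive power at different base rates.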
Bayesian statistics
Application of methods for deriving predictive power and other related indices of confidence in decision making
Reliable Change Index (RCI)
Indicator of the probability that an observed difference between two scores from the same examinee on the same test can be attributed to measurement error (i.e. to imperfect reliability)
Standard Error of Difference
Standard deviation of expected test-retest difference scores about a mean of 0 given an assumption that no actual change has occurred
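One common formulation (Jacobson and Truax) takes the standard error of difference as sqrt(2) times the standard error of measurement, and the RCI as the observed change divided by that value. The sketch assumes this formulation; other variants exist, and the scores, SD, and reliability here are illustrative.

```python
import math

def standard_error_of_difference(sd, reliability):
    """sqrt(2) * SEM, where SEM = SD * sqrt(1 - reliability)."""
    sem = sd * math.sqrt(1 - reliability)
    return math.sqrt(2) * sem

def reliable_change_index(score_1, score_2, sd, reliability):
    """Observed change in units of the standard error of difference."""
    return (score_2 - score_1) / standard_error_of_difference(sd, reliability)

rci = reliable_change_index(50, 65, sd=10, reliability=0.80)
print(round(rci, 2))    # about 2.37
print(abs(rci) > 1.96)  # True: change this large is unlikely to be measurement error alone
```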
Relationship of Reliability and Validity
Reliability places a ceiling on validity
High reliability does not guarantee validity
Reliability is necessary but not sufficient for validity
Limitation of age and grade equivalents
Highly sensitive to minor changes in raw scores
Do not represent equal intervals
Do not uniformly correspond to norm referenced scores