74 Cards in this Set

• Front
• Back
Internal Validity
  Permits the conclusion that there is a relationship between the IV and DV

Threats to Internal Validity
  History (external events)
  Maturation
  Boredom
  Previous testing
  Regression toward the mean
  Experimenter expectancy

Best way to increase internal validity
  Random assignment

External Validity
  Permits generalizability of results

Threats to External Validity
  Interaction between selection and treatment (rx doesn't generalize to other populations)
  Interaction between testing and rx (rx only works if there's a pretest)
  History
  Hawthorne effect (tendency of Ss to behave differently when being observed)
  Order effects (repeated measures studies)

Types of Designs
  True experimental (Ss randomly assigned to IV)
  Quasi-experimental (Ss not randomly assigned)
  Correlational (Vs not manipulated and no causal relationship assumed)
  Developmental research (assessing Vs as a function of development over time; longitudinal)

Types of Designs cont.
  Time series (DV measured several times at regular intervals before and after rx is administered)
  Single subject (ABA, ABAB)
  Qualitative (descriptive)

Scales of measurement
  Nominal (unordered categories; gender)
  Ordinal (ordered, rank)
  Interval (successive but no absolute 0; IQ)
  Ratio (has absolute 0; weight, time)

Parametric Stats Assumptions
  Normal distribution, homogeneity of variance, independence of observations

T-test
  Parametric stat, comparison of 2 means

One-way ANOVA
  Parametric stat, one DV and more than two groups; yields an F value, determining whether population means differ

Post hoc tests
  Used in one-way ANOVA, e.g. Tukey and Scheffe; pinpoint the exact pattern of differences among means because F doesn't do this

Scheffe
  Most conservative of the post hoc tests; minimizes Type I error but has the highest Type II error

Type I error
  Finding a difference when there isn't one

Type II error
  Finding no difference when there is one

Factorial ANOVA
  2 or more IVs and one DV
  Can't interpret main effects when there's an interaction

MANOVA
  Multiple DVs and at least one IV

Nonparametric Stats
  For nominal or ordinal data; distribution free; less powerful; includes Chi Square, Mann Whitney U, Wilcoxon Matched Pairs, Kruskal Wallis

Chi Square
  Used to analyze nominal data (compares observed frequencies of observations within nominal categories to the frequencies that would be expected under the null)
  Caution: all observations must be independent (no before/after study)

Mann Whitney
  Compares two independent groups on a DV measured with rank ordered data
  Alternative to the t-test for independent samples if data are nonparametric

Wilcoxon Matched Pairs
  Compares 2 correlated groups on a DV measured with rank ordered data
  Alternative to the t-test for correlated samples if data are nonparametric

Kruskal Wallis
  Compares 2 or more independent groups on a DV with rank ordered data
  Alternative to one-way ANOVA

Negative skew
  Most scores are high (to the right), with a few extreme low scores
  Mean is lower than the median, median lower than the mode
  Means easy test; ceiling effects

Positive skew
  Most scores are low (to the left), with a few extreme high scores
  Mean is higher than the median, median higher than the mode
  Means difficult test; floor effects

Variance
  Average of the squared differences of each observation from the mean

Standard Deviation
  Square root of the variance

Stanine
  Divides the distribution into 9 equal intervals, with 1 lowest and 9 highest

Standard Error of the Mean
  Provides an index of the expected inaccuracy of the sample mean

Statistical Decision Making
  Four possibilities:
  1) True null retained (correct; no difference across the IV)
  2) True null rejected (incorrect; Type I error; say there is a difference when there isn't)
  3) False null rejected (correct; find a difference that does occur)
  4) False null retained (incorrect; Type II error; there is a difference)

One-tailed test
  Predicts the direction in which the means differ

Two-tailed test
  Doesn't predict the direction of the difference

Power
  Probability of rejecting the null hypothesis when it's false (probability of not making a Type II error)
  Increased by larger N, one-tailed test

Pearson r
  Correlation between two continuous variables

Squared Pearson r
  Percentage of variability in one measure that is accounted for by variability in the other measure (coefficient of determination)

Point-biserial coefficient
  Correlates one continuous V with one dichotomous V

Phi coefficient
  Correlates 2 dichotomized Vs

Spearman's rho
  Correlates 2 rank ordered Vs

Regression
  When 2 Vs are correlated, constructs an equation to estimate the value of a criterion (outcome) V on the basis of scores on a predictor (input) V (2 or more Vs used to predict scores on one criterion)
  Results in Multiple R
  Multiple R can be squared, called the coefficient of multiple determination (proportion of variance in the criterion V accounted for by the combination of predictor Vs)

Stepwise regression
  Goal is to come up with the smallest set of predictors that maximizes predictive power

Canonical Correlation
  Used to calculate the relationship between 2 or more predictors and 2 or more criterion Vs

Discriminant Function Analysis
  Used when the goal is to classify individuals into groups based on their scores on multiple predictors

Partial Correlation
  Used to assess the relationship between 2 Vs with the effects of another V partialed out

Zero-order correlation
  Correlation between 2 Vs determined without regard for any other Vs; converse of partial correlation

Structural Equation Modeling
  Calculates pairwise correlations between multiple Vs; purpose is causal modeling; uses path analysis, LISREL

Normal curve
  Could also be referred to as a probability distribution

Z score
  Obtained by subtracting the sample mean score from an obtained score and dividing the result by the sample SD

Relationship of percentiles and z-scores
  A constant difference between raw or z scores will be associated with variable differences in percentile scores, as a function of the distance of the two scores from the mean
  Closer to the mean, the same difference in z or raw scores corresponds to a larger difference in percentiles

T scores
  Mean = 50, SD = 10

Extreme scores
  Take care in interpreting: the percentile is an extrapolation
  Use the estimated prevalence value to determine whether interpretation of extreme scores may be appropriate

Normalizing test scores
  Best to add test content, rather than statistically transforming non-normal scores into a normal distribution

Reliability
  Consistency of measurement of a given test; includes:
  Internal consistency
  Test-retest reliability
  Alternate form reliability
  Interrater reliability
  A reliability coefficient is interpreted directly as the percent of variance accounted for (don't square it)

Kuder-Richardson reliability coefficient
  Measure of internal reliability of a test; for items with yes/no answers or heterogeneous tests where split-half methods must be used

Cronbach's Alpha coefficient
  Measure of internal consistency (average intercorrelation between test items); used for tests with items that yield more than two response types

Adequacy of reliability coefficients
  .90+ Very high
  .80-.89 High
  .70-.79 Adequate
  .60-.69 Marginal
  Below .60 Low

Spearman Brown
  Can be used to estimate the effects of lengthening or shortening a test on its reliability coefficient

Standard Error of Measurement
  Index of error in measurement
  Estimation of the confidence interval around an obtained score
  The lower the standard deviation and the higher the reliability, the lower the SEM

Standard Error of Estimate
  Estimation of the confidence interval around estimated true scores

Validity
  Does the test measure what it was intended to measure?

Face validity
  Extent to which a test appears to measure what it is supposed to measure; could affect test taker motivation

Content related validity
  Systematic evaluation of the test by experts: relevance, representativeness

Convergent and Discriminant Validity (Construct)
  Correlate with tests of similar and dissimilar constructs

Concurrent and Predictive Validity (Criterion)
  Concurrent: used to identify existing diagnoses or conditions
  Predictive: used to determine whether a test predicts future outcomes

Size of validity coefficient
  Rarely exceeds .3 or .4

Sensitivity
  Proportion of test-takers with the positive attribute who are positively identified by the test

Specificity
  Proportion of test-takers with the negative attribute who are correctly identified by the test

Positive Likelihood Ratio
  Combines sensitivity and specificity into a single index of overall test accuracy, indicating the odds that a positive test result has come from a positive examinee

Positive Predictive Power (PPP)
  Probability that an individual with a positive test result has the condition of interest

Negative Predictive Power (NPP)
  Probability that an individual with a negative test result does not have the condition of interest

Bayesian statistics
  Application of methods for deriving predictive power and other related indices of confidence in decision making

Reliable Change Index (RCI)
  Indicator of the probability that an observed difference between two scores from the same examinee on the same test can be attributed to measurement error (i.e. to imperfect reliability)

Standard Error of Difference
  Standard deviation of expected test-retest difference scores about a mean of 0, given an assumption that no actual change has occurred

Relationship of Reliability and Validity
  Reliability places a ceiling on validity
  High reliability does not guarantee validity
  Reliability is necessary but not sufficient for validity

Limitation of age and grade equivalents
  Highly sensitive to minor changes in raw scores
  Do not represent equal intervals
  Do not uniformly correspond to norm referenced scores
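
Worked example (not part of the original card set): the Variance, Standard Deviation, Standard Error of the Mean, and skew cards can be checked numerically. The sketch below uses Python's standard `statistics` module on two small, made-up score lists. It assumes the population form of the variance (dividing by N), which matches the "average of squared differences" wording on the card, and the common textbook form of the standard error of the mean (sample SD divided by the square root of N), which the cards do not spell out.

```python
import math
import statistics

# Hypothetical scores, chosen only so the arithmetic is easy to follow.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)
# Variance as on the card: average squared difference from the mean.
variance = sum((x - mean) ** 2 for x in scores) / len(scores)
sd = math.sqrt(variance)  # standard deviation = square root of the variance

# Standard error of the mean, assuming the usual form: sample SD / sqrt(N).
se_mean = statistics.stdev(scores) / math.sqrt(len(scores))

# Hypothetical negatively skewed scores: mostly high, one extreme low score.
skewed = [1, 6, 8, 9, 10, 10, 10]
skew_mean = statistics.mean(skewed)
skew_median = statistics.median(skewed)
skew_mode = statistics.mode(skewed)

print(f"variance={variance}, sd={sd}, se_mean={se_mean:.3f}")
print(f"mean={skew_mean:.2f} < median={skew_median} < mode={skew_mode}")
```

In the second list the mean falls below the median and the median below the mode, which is the ordering the Negative skew card describes.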
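
Worked example (not part of the original card set): the cards on z scores, T scores, Standard Error of Measurement, Spearman Brown, Standard Error of Difference, and the Reliable Change Index describe these quantities in words only. The sketch below fills in the standard textbook formulas as an assumption, since the cards give none. All function names and input values are hypothetical, and the standard error of difference assumes the same SEM applies at both testings.

```python
import math

def z_score(score, mean, sd):
    # (obtained score - sample mean) / sample SD, as on the Z score card
    return (score - mean) / sd

def t_score(z):
    # T scores have mean 50 and SD 10
    return 50 + 10 * z

def standard_error_of_measurement(sd, reliability):
    # Assumed textbook form: SD * sqrt(1 - reliability)
    return sd * math.sqrt(1 - reliability)

def spearman_brown(reliability, length_factor):
    # Predicted reliability when test length is multiplied by length_factor
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

def standard_error_of_difference(sem):
    # Assumes equal SEM at test and retest: sqrt(2 * SEM^2)
    return math.sqrt(2) * sem

def reliable_change_index(score_1, score_2, se_diff):
    # Observed change expressed in standard-error-of-difference units
    return (score_2 - score_1) / se_diff

# Hypothetical test: mean 100, SD 15, reliability .91
z = z_score(115, mean=100, sd=15)
sem = standard_error_of_measurement(sd=15, reliability=0.91)
se_diff = standard_error_of_difference(sem)
rci = reliable_change_index(100, 115, se_diff)

print(f"z={z:.2f}, T={t_score(z):.0f}, SEM={sem:.2f}")
print(f"SE of difference={se_diff:.2f}, RCI={rci:.2f}")
print(f"Doubling a test with r=.70 gives r={spearman_brown(0.70, 2):.3f}")
```

Lowering the SD or raising the reliability in the call to `standard_error_of_measurement` shrinks the SEM, as the card states. An RCI beyond roughly ±1.96 is conventionally read as a change unlikely to be due to measurement error alone.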
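
Worked example (not part of the original card set): the Sensitivity, Specificity, Positive Likelihood Ratio, PPP, and NPP cards all derive from the same 2x2 table of test result by true status. The sketch below computes each index from four cell counts. The function name and the counts are hypothetical, and the likelihood ratio uses the usual form, sensitivity divided by (1 - specificity), which the card does not state explicitly.

```python
def diagnostic_indices(tp, fn, fp, tn):
    """Compute accuracy indices from the four cells of a 2x2 table.

    tp: has condition, tests positive    fn: has condition, tests negative
    fp: no condition, tests positive     tn: no condition, tests negative
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppp": tp / (tp + fp),  # positive predictive power
        "npp": tn / (tn + fn),  # negative predictive power
        "lr_positive": sensitivity / (1 - specificity),
    }

# Hypothetical sample: 50 examinees with the condition, 150 without.
indices = diagnostic_indices(tp=40, fn=10, fp=20, tn=130)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

Because PPP and NPP depend on how common the condition is in the sample, the same sensitivity and specificity yield different predictive power at a different base rate. Changing the ratio of the two groups in the call above shows the effect.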