46 Cards in this Set

  • Front
  • Back
  • 3rd side (hint)
When assessing a test's split-half reliability, what effect will increasing the heterogeneity of the sample (S's) have?
Increase the reliability
What is eta squared?
A measure of effect size describing the amount of variability in the criterion accounted for by the predictor
What is Cohen's d?
Cohen's d is a measure of effect size when comparing the mean difference between two groups
ANOVA/regression : _________ :: t-test : Cohen's d
eta squared
Professor Sharp argues against raising the legal drinking age from 18 to 21 on the grounds that doing so will only encourage 18-, 19-, and 20-year-olds to drink. Apparently, Sharp is familiar with which of the following theories?


individuation
psychological reactance
inoculation
expectancy-value
reactance
How do you calculate a CI for a S's score on a test?
to calculate CI you need
S's score
SEM of the test
level of confidence desired

e.g. for a test w/ SEM of 5, if a S scored 105 and you wanted a 68% CI:
68% CI = within 1 SEM
105 +/- 5
68% CI = 100-110

for a 95 % CI (2 SEM)
105 +/- (2x5)
105 +/- 10
95 % CI = 95-115

99% CI (3 SEM)
105 +/- (3x5)
105 +/- 15
99% CI = 90-120
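The CI arithmetic above is simple enough to sketch in a few lines of Python; the score and SEM mirror the worked example on the card:

```python
# Confidence interval for an observed score: score +/- (n SEMs).
# 1, 2, and 3 SEMs correspond (roughly) to 68%, 95%, and 99% confidence.
def score_ci(score, sem, n_sems):
    half_width = n_sems * sem
    return score - half_width, score + half_width

# Worked example from the card: observed score 105, SEM = 5
print(score_ci(105, 5, 1))  # 68% CI -> (100, 110)
print(score_ci(105, 5, 2))  # 95% CI -> (95, 115)
print(score_ci(105, 5, 3))  # 99% CI -> (90, 120)
```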
What is the formula for SEM (the standard error of measurement)?
SEM = SD[sqrt(1-rxx)]

where SD = SD of test scores and rxx = the test's reliability coefficient

(Not to be confused with the standard error of the mean, sigma/sqrt N)
What formula is

sigma[sqrt(1-r2xy)]

where sigma=sd of criterion (Y) scores
and
rxy= correlation coefficient (an estimate of criterion validity)
Standard Error of Estimate

It is used to create CI's for estimates of performance on a criterion (Y variable) derived from a regression equation involving the predictor (X variable) score

CI = Yhat +/- SEe(# of SEe from mean assoc. w/ desired CI)

68% CI = 1 SEE
95% CI = 2 SEE
99% CI = 3 SEE
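A minimal Python sketch of the two steps above (the SD and validity values are hypothetical, not from the card):

```python
import math

# Standard error of estimate: SEest = SD_y * sqrt(1 - r_xy^2)
def se_estimate(sd_y, r_xy):
    return sd_y * math.sqrt(1 - r_xy ** 2)

# CI for a predicted criterion score: Yhat +/- (n SEest's)
def predicted_ci(y_hat, sd_y, r_xy, n_se):
    se = se_estimate(sd_y, r_xy)
    return y_hat - n_se * se, y_hat + n_se * se

# Hypothetical: SD_y = 10, r_xy = .60 -> SEest = 10 * sqrt(.64) = 8
print(se_estimate(10, 0.6))
print(predicted_ci(100, 10, 0.6, 2))  # 95% CI around Yhat = 100
```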
What is criterion-related validity?
Criterion-Related Validity is the extent to which a predictor (X) reliably estimates performance on a criterion (Y) using a regression equation
What is construct validity

How is it assessed?
Construct Validity refers to the extent to which a predictor measures the concept it is designed to measure.

Assess it using factor analysis or multitrait-multimethod analysis
What are the 2 types of Construct Validity, and how do you assess it?
1. Discriminant validity
2. Convergent validity

Assess it using
1. Factor Analysis
2. Multitrait Multimethod Matrix
Construct Validity = DC FM

Construction of the antenna towers on Wisconsin: DC reception of FM radio stations
What are the 2 types of Criterion Validity?

How do you assess them?
Predictive and concurrent

Predictive validity is assessed by calculating the Standard Error of Estimate (SEest)

SEest=SDy[sqrt(1-r2xy)]

where SDy = SD of criterion scores (Y scores)
and rxy = validity coefficient (a correlation coefficient; r2xy indicates how much of the variance in Y is predicted by variability in X)
Criterion Validity is PC
Reliability of a test = sum of what two things?
Communality + specificity

where communality = % of variance in the DV (Y) explained by all of the IVs (X) together, i.e., the variance a test shares with other measures (Yhat = X put into the regression equation)

and specificity = % of systematic variance unique to the test

so reliability = communality + specificity = 1 - error variance
What is Content Validity?

For what kinds of tests is it important?
Content validity is the extent to which a test's items representatively sample the content domain being measured.

It is important for:
1. Mastery tests (how much does S know about a specific Domain - like the EPPP)
2. Achievement tests (same as mastery?)
3. Developing tests of the KSA needed for a particular job (Use Job Analysis to determine the KSA)

4. job samples

It can be evaluated by asking "experts" to examine the test and see if it seems representative of the knowledge in the domain.

AND by testing to see:

Does it discriminate between novices and experts in the field?
Content Validity =
1. Discrimination between experts and novices
2. Experts' evaluation of the test
3. Correlation w/ established measures

Content Validity = DEC
What are the 3 mnemonics for the 3 types of validity?
Construct DCFM towers
Criterion is PC
Content = ED and content validity is often all that elementary and secondary mastery tests are about
Think about progression of my time at AU

Construct(ion) of DCFM towers on the way in

Criterion = PC (the criterion for being liked on college campuses: are you PC?)

ED content = what was the content of my education at AU?
What is the nature of the relationship between a test's reliability and validity?
A test's validity coefficient cannot exceed the square root of its reliability coefficient
Memorize it!

Write out the formula for this to help figure out awkwardly worded questions!


Hints:
1. reliability (r2xx) is already squared

2. the relationship involves taking the square root of something
In terms of percentile ranks, where do 1sd, 2sd and 3sd above the mean fall?
84
97.7
99.9

(these cumulative percentile ranks are different from the 68-95-99.7 "percent within" figures for the normal distribution)
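The exact percentile ranks can be checked with the standard normal CDF; a short sketch using only the Python standard library:

```python
import math

# Percentile rank of a z-score under the standard normal curve,
# using the CDF expressed via the error function.
def percentile(z):
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

for z in (1, 2, 3):
    print(z, round(percentile(z), 2))
# 1 sd -> 84.13, 2 sd -> 97.72, 3 sd -> 99.87
```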
If the assumptions for a One-Way ANOVA are violated, what test could you use instead
Kruskal-Wallis
What are the 3 assumptions when using a parametric test?
1. NORMAL POPULATION DISTRIBUTION (not too serious, as long as you can compensate by using sample size >= 25-30)

2. HOMOSCEDASTICITY (variances of the two populations are equal; if you violate this assumption, it's usually not too serious)

3. INDEPENDENT OBSERVATIONS*
*very serious - increases likelihood of Type I and Type II error

Independent Observations
What 2 assumptions do Parametric and Non-parametric tests share?
1. Random Selection from Population

2. Independent Observations
If you have ordinal data w/ 2 independent groups, what test do you use to see if there are stat. sig. differences between the groups?
Mann-Whitney U
If you have Nominal Data, what test do you use to compare cell frequencies?
Chi-Square
If the assumption of normality is violated for a one-way ANOVA, what non-parametric test could you use instead?
Kruskal-Wallis
Non-parametric test to compare 2 independent groups?
Mann-Whitney U
U looks like two lines of data if you eliminate the bottom
Non-parametric test to compare 2 dependent (repeated-measures) groups?
Wilcoxon Matched-Pairs Signed-Rank Test
What test would you substitute for an independent measures t-test if the assumption of normality were violated or the data were ordinal?
Mann-Whitney U
What test would you substitute for a repeated measures t-test (or RM ANOVA w/ 1 IV) if the data were Ordinal or if the assumption of normality was violated?
Wilcoxon Matched-Pairs Signed-Rank Test
If you want to use 2 or more continuous predictor variables (IV's) to predict performance on a nominal criterion (e.g. Yes/No or DSM-IV-TR Dx.), what test should you use?
Discriminant Function Analysis
If you want to examine the relationship between 2 or more discrete (not continuous) IV's and a dichotomous/nominal Y variable, what type of test should you use?
Logistic Regression
If you want to test a pre-defined theory about causal relationship(s) between two variables, and the relationship goes one way (X causes Y, but not vice versa), what test (type of modeling) should you use?
Path Analysis
Think of Life Path - it goes only one way
For what is the Spearman-Brown formula used?
correction for split-half reliability calculation
Split half - Spearman-Brown - alliterative;
rhythm of saying them together
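The Spearman-Brown prophecy formula itself is short; a Python sketch (the .60 split-half correlation is a hypothetical value, not from the card):

```python
# Spearman-Brown: predicted reliability when a test is lengthened by
# factor k. k = 2 corrects a split-half correlation to the reliability
# of the full-length test.
def spearman_brown(r, k=2):
    return k * r / (1 + (k - 1) * r)

# Hypothetical: split-half correlation of .60
print(round(spearman_brown(0.60), 2))  # corrected full-test reliability: 0.75
```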
If you are assessing inter-item consistency (reliability), and the variable is continuous, you would use:
Cronbach's alpha
e.g. score on MOCI for MA Thesis
If you are assessing inter-item consistency and the variable is dichotomous (e.g. yes/no), what would you use to correct for the overestimation you would expect using Cronbach's alpha?
Kuder Richardson-20
Kuder-Richardson 20 has a 2 in it and the DV on this kind of test has only 2 possible answers
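The KR-20 computation can be sketched on hypothetical 0/1 item data (subjects as rows, items as columns; the data matrix below is made up for illustration):

```python
# Kuder-Richardson 20 for dichotomously scored (0/1) items:
# KR-20 = (k / (k - 1)) * (1 - sum(p*q) / total-score variance)
def kr20(scores):
    k = len(scores[0])                       # number of items
    n = len(scores)                          # number of subjects
    totals = [sum(row) for row in scores]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n   # population variance
    pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in scores) / n            # proportion passing item j
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var_t)

# Hypothetical data: 4 subjects x 4 items
data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
print(round(kr20(data), 3))  # 0.667
```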
To assess inter-rater reliability, what two things could you use?
1. simple percent agreement
2. Kappa statistic
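Both statistics are easy to sketch in Python; the ratings below are made-up illustrations:

```python
from collections import Counter

# Percent agreement: proportion of cases on which the two raters match.
def percent_agreement(a, b):
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Cohen's kappa: agreement corrected for chance agreement.
def cohens_kappa(a, b):
    n = len(a)
    p_obs = percent_agreement(a, b)
    ca, cb = Counter(a), Counter(b)
    p_chance = sum(ca[k] * cb[k] for k in set(a) | set(b)) / n ** 2
    return (p_obs - p_chance) / (1 - p_chance)

# Hypothetical ratings from two raters on six cases
r1 = ["yes", "yes", "no", "no", "yes", "no"]
r2 = ["yes", "no", "no", "no", "yes", "yes"]
print(round(percent_agreement(r1, r2), 2))  # 0.67
print(round(cohens_kappa(r1, r2), 2))       # 0.33
```

Note how kappa (.33) is much lower than raw agreement (.67) once chance agreement is removed.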
How can you increase a test's reliability?
1. Increase Homogeneity of test items

2. Increase Heterogeneity of Subjects taking test

3. Increase test length
What are the three types of Validity?
CRITERION (use regression, Standard Error of Estimate)
1. Predictive
2. Concurrent

CONTENT (Ask an expert what s/he thinks)
1. Expert's opinion
2. Does it discriminate between experts and novices in the field?


CONSTRUCT
Two Types
1. Convergent
2. Discriminant
Two Ways to Assess
1. Factor Analysis
2. Multi-trait, Multi-method
Three C's w/ two types of validity each




1. Content (experts' opinion)
2. Construct
convergent validity, discriminant validity
factor analysis; multi-trait, multimethod
3. Criterion
Predictive Validity (regression, Standard Error of Estimate)
Concurrent Validity
Validity of a Predictor is limited by its
Reliability

Validity <= (sqrt Reliability)
Think about the formula for calculating the Standard Error of Estimate (for Confidence Intervals when predicting score on Criterion from predictor using a regression equation)

SEest = SDy[sqrt(1-r2xy)]
Validity of Predictor is less than or equal to the sqrt of _________

and is less than or equal to: _________
Validity of Predictor (rxy) is less than or equal to the square root of the reliability of the predictor
rxy <= sqrt(r2xx)

and is less than or equal to the square root of the product of the reliability of the predictor times the reliability of the criterion
rxy <= sqrt(r2xx*r2yy)
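The ceiling on validity can be computed directly from the formula; the reliability values below are hypothetical:

```python
import math

# Validity ceiling: r_xy <= sqrt(r_xx * r_yy). With a perfectly
# reliable criterion (r_yy = 1) this reduces to r_xy <= sqrt(r_xx).
def max_validity(r_xx, r_yy=1.0):
    return math.sqrt(r_xx * r_yy)

# Hypothetical: predictor reliability .81
print(round(max_validity(0.81), 2))        # ceiling is 0.9
# An unreliable criterion (.64) lowers the ceiling further
print(round(max_validity(0.81, 0.64), 2))  # 0.72
```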
If you want to calculate the correlation between two variables which are both continuous, which of the following would you use?

Biserial
Point Biserial
Rho
Pearson r
Pearson r
If you want to calculate the correlation between 2 variables and one is continuous and the other is a true dichotomy, what test would you use?
Point Biserial
If something is pointed, it is specific
If you want to calculate the relationship between two rank-ordered variables, which of the following would you use?

Pearson r
Biserial
Point Biserial
rho
rho
If you want to calculate the relationship between two variables, one of which is continuous and the other of which is an artificial dichotomy, which would you use?
Biserial
In Item Response Theory, when one creates an Item Characteristic Curve (ICC), what is on the X-axis and what is on the Y-axis?
X-axis: S's ability level (low to high)

Y-axis: % of S's who get item right
When using ICC, what does the slope indicate?
Discriminative ability for S's high and low in the ability being assessed (a steep slope discriminates well, a flat slope does not)
When using ICC, how do you know what the chance of guessing correctly is?
Y-intercept
When using ICC, how do you determine how difficult an item is?
The level of ability (location on the X-axis) at which 50% of S's get the item right
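The ICC cards above can be tied together with a small sketch of a three-parameter logistic ICC (a common IRT form; the parameter values are hypothetical). One nuance: with a nonzero guessing parameter c, the curve's midpoint at difficulty b sits at (1 + c)/2 rather than exactly 50%:

```python
import math

# 3-parameter-logistic ICC: P(correct | ability theta)
#   a = discrimination (slope), b = difficulty (location on the X-axis),
#   c = guessing parameter (the curve's lower asymptote)
def icc(theta, a, b, c):
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

a, b, c = 1.5, 0.0, 0.25           # hypothetical item
print(icc(b, a, b, c))             # at theta == b: (1 + c) / 2 = 0.625
print(round(icc(-5, a, b, c), 3))  # far below b: near the guessing floor
print(round(icc(5, a, b, c), 3))   # far above b: near 1.0
```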