What makes psychological research science?
The Scientific Method: Psychologists use the scientific method to conduct research. It is a standardized way of making observations, gathering data, forming theories, testing predictions, and interpreting results.
Describe the “hourglass” notion of doing research.
begin with broad questions
narrow down, focus in
operationalize
observe
analyze data
reach conclusions
generalize back to questions
Principle of Parsimony
Parsimony is an accepted principle, or heuristic, in science that guides our interpretations of data and phenomena of interest. When competing views or interpretations of a phenomenon can be proposed, it directs us to adopt the simplest account of the data among the available alternatives.
Research Methodology for making Inferences
The purpose of research is to draw valid inferences about the relations between variables. Methodology consists of those practices that help to arrange the circumstances so as to minimize ambiguity in reaching these inferences.
Findings vs Conclusions
How we view and interpret findings dictates what conclusions we make. In most cases, findings are not unequivocally clear and research must accumulate for years to elaborate a phenomenon and to clarify the circumstances and contexts in which the effects are evident.
Sources of Ideas for Research
1. simple curiosity
2. extending external validity
3. studying mediators
4. studying special populations
5. case study
Research Questions
1. What is the relationship between (among) the variables of interest?
2. What factors influence the relationship between variables, that is, the direction or magnitude of the relation?
3. How does the phenomenon work, that is, through what relation or mechanism or through what process does A lead to B?
4. Can we control or alter the outcome of interest?
Correlate
Correlate: The two (or more) variables are associated at a given point in time in which there is no direct evidence that one variable precedes the other
Risk Factor
Risk Factor: A characteristic that is an antecedent to and increases the likelihood of an outcome of interest. A “correlate” in which the time sequence is established.
Cause
Cause: one variable influences, either directly or through other variables, the appearance of the outcome. Changing 1 variable is shown to lead to a change in another variable
Moderator
Moderator: a variable that influences the relationship of 2 variables of interest. The relationship between variables (A and B) changes or is different as a function of some other variable (sex, age, ethnicity)
Mediator
Mediator: The process, mechanism, or means through which a variable produces a particular outcome. Beyond knowing that A may cause B, the mechanism elaborates precisely what happens (psychologically or biologically) that explains how B results.
Intervention
Intervention: is there something we can do to decrease the likelihood that an undesired outcome will occur (prevention) or decrease or eliminate an undesired outcome that has already occurred (treatment)?
Relation of theory to research
Theory refers to a conceptualization of the phenomenon of interest; the focus can be broad or narrow. We need theories for four main reasons:
1. Theory can bring order to areas where findings are diffuse or multiple.
2. Theory can explain the basis of change and unite diverse outcomes.
3. Theory can direct our attention to which moderators to study.
4. The best way to advance real-world application is to understand how something operates.
Theories allow researchers to begin their articles with a concept and then hypothesize and make predictions. Research serves to further support or refute theory, so the two are inevitably interrelated.
Operationalization of concepts and settings
Whatever the original idea that provides the impetus for research, it must be described concretely so that it can be tested. Operational definitions refer to defining a concept on the basis of the specific operations used in the experiment. A further problem is inconsistency across studies and settings in how concepts are defined.
What is a model and why are models needed for conducting research?
A model is any device used to represent something other than itself. A model is intended to be not a description of reality but only a representation of the features of reality that are essential for understanding a particular problem. Any model, from the most specific to the most general, is used to aid understanding. It provides a way of looking at the universe that makes it more easily comprehensible.
Proximal and Distal Causation
A proximate cause is an event which is closest to, or immediately responsible for causing, some observed result. This exists in contrast to a higher-level ultimate cause (or distal cause), which is usually thought of as the "real" reason something occurred. Example: Why did the ship sink?
Proximate cause: Because it was holed beneath the waterline, water entered the hull and the ship became denser than the water which supported it, so it couldn't stay afloat.
Ultimate cause: Because the ship hit a rock which tore open the hole in the ship's hull.
Threats to internal validity
• Ambiguous temporal precedence
• Confounding
• Selection bias
• History
• Maturation
• Repeated testing (also referred to as testing effects)
• Instrument change (instrumentality)
• Regression toward the mean
• Mortality/differential attrition
• Selection-maturation interaction
• Diffusion
• Compensatory rivalry/resentful demoralization
• Experimenter bias
Regression to the Mean
This type of error occurs when subjects are selected on the basis of extreme scores (far from the mean) on a test. For example, when children with the worst reading scores are selected to participate in a reading course, improvements at the end of the course might be due to regression toward the mean and not the course's effectiveness. If the children had been tested again before the course started, they would likely have obtained better scores anyway. Likewise, extreme outliers in individual scores tend to be captured on a single testing occasion and to move back toward the mean on repeated testing.
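A minimal simulation can make the selection artifact concrete. The sketch below (illustrative numbers, using NumPy; not from the source) selects the bottom 10% of scorers on one test and shows their mean moving back toward the population mean on a retest with no intervention at all:

```python
# Regression to the mean: select on extreme scores, retest, watch the mean move.
import numpy as np

rng = np.random.default_rng(42)
true_ability = rng.normal(100, 10, size=10_000)        # stable true scores
test1 = true_ability + rng.normal(0, 5, size=10_000)   # score = true + error
test2 = true_ability + rng.normal(0, 5, size=10_000)   # retest, fresh error

worst = test1 < np.percentile(test1, 10)  # "selected for the reading course"
print(test1[worst].mean())  # far below 100 at selection
print(test2[worst].mean())  # closer to 100 on retest, with no intervention
```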
Threats to external validity
o Sample Characteristics
o Stimulus Characteristics and Settings
o Reactivity of Experimental Arrangements
o Multiple-Treatment Interference
o Novelty Effects
o Reactivity of Assessment
o Test Sensitization
o Timing of Measurement
Nomological Network
Nomological network is a representation of the concepts (constructs) of interest in a study, their observable manifestations, and the interrelationships among and between these. It was Cronbach and Meehl's view of construct validity that in order to provide evidence that a measure has construct validity, a nomological network has to be developed for its measure.
Threats to Construct Validity
o Attention and Contact with Clients
o Single Operations and Narrow Stimulus Sampling
o Experimenter Expectancies
o Cues of the Experimental Situation
Threats to Statistical Conclusion Validity
o Low Statistical Power
o Variability in the Procedures
o Subject Heterogeneity
o Unreliability of the Measures
o Multiple Comparisons and Error Rates
Mook, 1983 In Defense of External Validity
• The external-validity question is a special case. It comes to this: Are the sample, the setting, and the manipulation so artificial that the class of "target" real-life situations to which the results can be generalized is likely to be trivially small? If so, the experiment lacks external validity. But that argument still begs the question I wish to raise here: Is such generalization our intent? Is it what we want to do? Not always.
• Harlow's rhesus monkeys: highly artificial laboratory research whose value lay in understanding, not in generalizing a sample result to real-life settings.
GxE
The recognition that both genes (nature) and environments (nurture) are important for understanding the etiology of depression has led to a rapid growth in research exploring gene-environment interactions (GxE).
What are the two disciplines of psychology and why does Cronbach argue that they shouldn't be separate?
Experimental and Correlational psychology: Cronbach argued that psychology continues to this day to be limited by the dedication of its investigators to one or the other method of inquiry rather than to scientific psychology as a whole. He describes the essential features of each approach to asking questions about human nature and strongly hints at the benefits to be gained by unification. Put simply, Cronbach sees this as a puppet show where the experimentalist manipulates the puppets to arrive at a successful outcome, while the correlationist watches the interaction of the puppets as he would people, to see how environment, social elements, and the like affect them. Cronbach proposes bringing these two strands of psychology together to complement each other and arrive at a more complete solution.
Why “race” is not a useful variable for analysis?
Because race lacks precise meaning, various psychologists have long challenged the scientific merit of studying or using race as an explanatory construct in psychological theory, research, and, by implication, practice. Although race itself has no shared conceptual definition, researchers tacitly agree to use factitious racial categories (e.g., Black and White) as independent or predictor variables in their theories and research designs, as if the categories convey whatever conceptual meaning of race the researcher intends (Helms et al., 2005).
What Helms et al. advocate as alternatives for studying group differences?
Four Strategies:
1. Substitute the concepts of ethnicity, ethnic group, or ethnic identity for race or racial group. By concepts, advocates of this approach mean specification of factors such as values, customs, or traditions rather than merely substituting alternative labels for race or racial group.
2. Avoid using racial categories in research designs without a clear conceptual reason for doing so because racial categories encompass such a wide array of unspecified attributes, it is too tempting to “fall into the trap of ‘explaining’ [racial category] differences [on the dependent variable]” by means of racial categories instead of identifying the variables associated
with racial categories (e.g., exposure to discrimination, in-group bias) that relate to or affect the dependent variables in research designs.
3. Replace racial categories as independent variables with independent variables derived from racial categorization (RC) theories.
Elements of informed consent for research participation
• Competence: The individual’s ability to make well-reasoned decisions and give consent meaningfully.
• Knowledge: Understanding the nature of the experiment, the alternatives available and the potential risks and benefits.
• Volition: Agreement to participate on the part of the subject that is provided willingly and free from constraint or duress.
When it’s OK to waive informed consent?
• When archival data is used and the subjects are no longer living or cannot be identified.
• Where research would not reasonably be assumed to create distress or harm and involves (a) the study of normal educational practices, curricula, or classroom management methods conducted in educational settings; (b) only anonymous questionnaires, naturalistic observations, or archival research for which disclosure of responses would not place participants at risk of criminal or civil liability or damage their financial standing, employability, or reputation, and confidentiality is protected; or (c) the study of factors related to job or organization effectiveness conducted in organizational settings for which there is no risk to participants’ employability, and confidentiality is protected.
Definitions of Research and Human Subject
• RESEARCH is a systematic investigation, including research development, testing, and evaluation, designed to develop or contribute to generalizable knowledge.

• HUMAN SUBJECT is a living individual about whom an investigator (whether professional or student) conducting research obtains:
1. data through intervention or interaction with the individual, or
2. identifiable private information
Types of IRB Review
Proposed studies fall into one of three categories of IRB handling:
1. Exemption from IRB review
2. Expedited review
3. Full board review
What sorts of procedural biases are most problematic for making inferences from study results?
Experimenter expectancy effects because the expectancies alone or in combination with the manipulation can be responsible for the pattern of results.
When is non-random sampling most problematic?
When you want to generalize to a population: because the sample was not drawn at random, you do not know how it differs from that population, so generalizations may be biased.
What is the difference between random sampling & random assignment?
Random sampling is randomly selecting participants from a population into a study; random assignment is randomly allocating the participants you already have to conditions, e.g., assigning participants to different treatment groups in an intervention study.
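A toy sketch of the distinction (the sampling frame and sizes are hypothetical):

```python
# Random sampling vs. random assignment, side by side.
import numpy as np

rng = np.random.default_rng(0)
population = np.arange(10_000)  # sampling frame: who *could* be studied

# Random sampling: who gets into the study at all (supports generalization).
sample = rng.choice(population, size=100, replace=False)

# Random assignment: which condition each sampled person receives
# (supports causal inference by equating groups in expectation).
condition = rng.permutation(np.repeat(["treatment", "control"], 50))
```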
What’s the “third variable” problem and what is its relevance for validity?
A type of confounding in which a third variable leads to a mistaken causal relationship between two others. For instance, cities with a greater number of churches have a higher crime rate. However, more churches do not lead to more crime; instead the third variable, population size, leads to both more churches and more crime.
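The churches-and-crime example can be simulated directly. In the sketch below (all values made up), population size drives both variables, producing a strong raw correlation that disappears once population is statistically controlled:

```python
# A spurious correlation induced by a third variable, and its removal.
import numpy as np

rng = np.random.default_rng(1)
population = rng.uniform(1_000, 1_000_000, size=500)        # the third variable
churches = population / 2_000 + rng.normal(0, 20, size=500)
crimes = population / 100 + rng.normal(0, 400, size=500)

print(np.corrcoef(churches, crimes)[0, 1])  # strongly positive, yet not causal

# Partial correlation controlling for population: residualize both variables.
def residuals(y, x):
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

print(np.corrcoef(residuals(churches, population),
                  residuals(crimes, population))[0, 1])     # near zero
```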
Sources of Bias
Rationales, Scripts and Procedures
Experimenter Expectancy Effects
Experimenter Characteristics
Situational and Contextual Cues
Subject Roles
Data Recording and Analysis
Subject-Selection Bias
The Sample: Who is selected for the study?
Attrition: Who remains in the study?
Subject Assignment and Group Formation
Random Assignment
Group Equivalence
Matching
Mismatching
3 components in the classical measurement model
True Score: Classical test theory assumes that each person has a true score, T, that would be obtained if there were no errors in measurement. A person's true score is defined as the expected number-correct score over an infinite number of independent administrations of the test. Test users never observe a person's true score, only an observed score, O. It is assumed that the observed score equals the true score plus some error.
Observed Score: In classical measurement theory, the observed score (O) is composed of the true score (T) and an error component (E). For administration i of the test:
O_i = T + E_i
Residual: what is left over when you subtract the true score (T) from the observed score (O); the residual is the error (E).
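A short simulation (hypothetical parameters) shows the model's implication that observed-score variance decomposes into true-score and error variance, with reliability as the true-score share:

```python
# Classical test theory in miniature: O = T + E.
import numpy as np

rng = np.random.default_rng(7)
T = rng.normal(50, 10, size=100_000)   # true scores
E = rng.normal(0, 5, size=100_000)     # random error, independent of T
O = T + E                              # observed scores

print(O.var())             # ~ Var(T) + Var(E) = 100 + 25
print(T.var() / O.var())   # reliability ~ 100 / 125 = 0.80
```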
What is method bias and what kind of validity is it most pertinent to?
Method Bias: the influence of the research method itself on construct measurements. Common method bias is a bias in the dataset due to something external to the measures: the measured difference is due to the study itself (the way the questions are constructed, the way in which they are asked, the audience to which they are asked, etc.), rather than the actuality of the situation. EX. Using a computer-based test biases the research because it may exclude participants who are not computer literate.
Method bias is most pertinent to construct validity: the scores then reflect the method of measurement as well as (or instead of) the intended construct.
Types of Measures
1. Self Report
2. Projective Techniques
3. Observational
4. Physiological/biological
5. Archival
6. Indirect
Self-report
Strengths –
• Many states and feelings are defined by what the client says or feels and self-report is a direct assessment of this
• Permits assessment of several domains of functioning that are not available with other techniques
• Ease of administration
Weaknesses –
• Responses can be greatly influenced by the wording, format, and order of appearance of the items
• Possibility of bias and distortion (alteration of responses in light of the participant's own motives) on the part of the subjects
Projective Technique
Strengths –
• Reduced vulnerability to response sets and biases that might be evident on self-report.
Weaknesses –
• Rely upon interpretations and inferences of the examining psychologist and these have been shown to be inconsistent across examiners.
Observational
Strengths –
• Provides a unique focus that extends the method of evaluation beyond the more familiar and commonly used self-report.
• In the lab, one can evaluate interobserver agreement to ensure that responses are adequately assessed.
• In the lab one can observe the condition and integrate other measures such as physiological monitoring.
Weaknesses –
• Not necessarily representative samples of what behaviors are like during unsampled times.
• Performance may change when individuals are aware they are being studied.
Physiological / biological
Strengths –
• There have been enormous advancements and there are measures that encompass many different types of functions.
• Eliminates socially desirable responding
Weaknesses –
• Remarkable differences in patterns of responding among subjects.
• Expensive equipment
• Movement and other restrictions such as no movement in the fMRI machine
Archival
Strengths –
• Provide a wealth of information about people
• Can be examined without fear that the experimenter’s hypothesis or actions of the observers may influence the raw data
Weaknesses –
• Changes in criteria for recording certain types of information leading to interpretative problems.
• Selectivity in the information that becomes archival
Indirect
Strengths –
• Can be valuable evidence
Weaknesses –
• Changes over time may occur as a function of the ability of certain traces to be left
• Selective deposits: may not represent the behavior of all the participants of interest.
Selecting Measures for Research: Key Considerations
Construct Validity
Psychometric Characteristics
Reliability
Validity
Sensitivity of the Measure
Construct Validity
Construct Validity: the extent to which the measure actually measures the construct or facet of the construct of interest.
Psychometric Characteristics
Psychometric Characteristics: reliability and validity evidence on behalf of a measure.
Reliability: consistency of a measure
Validity: content and whether the measure assesses the domain of interest
Reliability
consistency of a measure
Validity
content and whether the measure assesses the domain of interest
Sensitivity of the Measure
Sensitivity of the Measure: the capacity of a measure to reflect systematic variation, change or differences in response to an intervention, experimental manipulation or difference group composition.
The Multitrait-Multimethod Matrix
The Multitrait-Multimethod Matrix (MTMM) is an approach to assessing the construct validity of a set of measures in a study. It was developed in 1959 by Campbell and Fiske, who introduced convergent and discriminant validation by the multitrait-multimethod matrix in part as an attempt to provide a practical methodology that researchers could actually use (as opposed to the nomological network idea, which was theoretically useful but did not include a methodology). Along with the MTMM, Campbell and Fiske introduced two new types of validity, convergent and discriminant, as subcategories of construct validity.
Types of reliability
Internal consistency
Inter-rater reliability
Test-retest/alternate forms
Internal consistency
Internal consistency: In statistics and research, internal consistency is typically a measure based on the correlations between different items on the same test (or the same subscale on a larger test). It measures whether several items that propose to measure the same general construct produce similar scores. For example, if a respondent expressed agreement with the statements "I like to ride bicycles" and "I've enjoyed riding bicycles in the past", and disagreement with the statement "I hate bicycles", this would be indicative of good internal consistency of the test.
Inter-rater Reliability
Inter-rater: In statistics, inter-rater reliability, inter-rater agreement, or concordance is the degree of agreement among raters. It gives a score of how much homogeneity, or consensus, there is in the ratings given by judges. It is useful in refining the tools given to human judges, for example by determining if a particular scale is appropriate for measuring a particular variable. If various raters do not agree, either the scale is defective or the raters need to be re-trained.
Test-retest Reliability
Test-retest/alternate forms: Repeatability or test-retest reliability is the variation in measurements taken by a single person or instrument on the same item and under the same conditions. A less-than-perfect test-retest reliability causes test-retest variability. Such variability can be caused by, for example, intra-individual variability and intra-observer variability. A measurement may be said to be repeatable when this variation is smaller than some agreed limit.
Reliability Coefficients
KR20 (internal consistency)
Cronbach's alpha (internal consistency)
Kappa coefficient (inter-rater)
Pearson's r (test-retest)
Correlation Coefficient
Correlation coefficient: This is a measure of the direction (positive or negative) and extent (range of a correlation coefficient is from -1 to +1) of the relationship between two sets of scores. Scores with a positive correlation coefficient go up and down together (as with smoking and cancer). A negative correlation coefficient indicates that as one score increases, the other score decreases (as in the relationship between self-esteem and depression; as self-esteem increases, the rate of depression decreases).
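For instance, the self-esteem/depression relation described in the card can be reproduced with toy numbers:

```python
# Direction and extent of a linear relation (made-up data).
import numpy as np

self_esteem = np.array([10, 20, 30, 40, 50])
depression  = np.array([48, 40, 31, 22, 15])
print(np.corrcoef(self_esteem, depression)[0, 1])  # close to -1: negative relation
```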
Reliability Coefficient
Reliability coefficient: provides an index of the relative influence of true and error scores on attained test scores. In its general form, the reliability coefficient is defined as the ratio of true score variance to the total variance of test scores. Or, equivalently, one minus the ratio of the variation of the error score and the variation of the observed score
KR20
KR20: In statistics, the Kuder–Richardson Formula 20 (KR-20), first published in 1937, is a measure of internal consistency reliability for measures with dichotomous choices. It is analogous to Cronbach's α, except that Cronbach's α is also used for non-dichotomous (continuous) measures. A high KR-20 coefficient (e.g., > 0.90) indicates a homogeneous test.
Cronbach's alpha
Cronbach's alpha: In statistics, Cronbach's alpha (α) is a coefficient of internal consistency. It is commonly used as an estimate of the reliability of a psychometric test for a sample of examinees.
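A sketch of the computation from an items-by-persons score matrix (toy data). With dichotomous 0/1 items, the same formula yields KR-20:

```python
# Cronbach's alpha: alpha = k/(k-1) * (1 - sum(item variances)/total variance).
import numpy as np

items = np.array([[4, 5, 4, 5],   # rows: respondents, columns: items
                  [2, 3, 2, 2],
                  [3, 3, 4, 3],
                  [5, 4, 5, 5],
                  [1, 2, 1, 2]])

k = items.shape[1]
item_vars = items.var(axis=0, ddof=1)
total_var = items.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(alpha)
```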
Kappa coefficient
Kappa coefficient: Cohen's kappa coefficient is a statistical measure of inter-rater agreement or inter-annotator agreement for qualitative (categorical) items. It is generally thought to be a more robust measure than simple percent agreement calculation since κ takes into account the agreement occurring by chance.
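The chance correction is easy to see in code. In the hypothetical ratings below, raw agreement is 80%, but kappa is substantially lower once expected chance agreement is removed:

```python
# Cohen's kappa vs. raw percent agreement.
import numpy as np

rater1 = np.array(["yes", "yes", "no", "no", "yes", "no", "yes", "no", "no", "no"])
rater2 = np.array(["yes", "no",  "no", "no", "yes", "no", "yes", "yes", "no", "no"])

p_o = np.mean(rater1 == rater2)                      # observed agreement (0.80)
categories = np.unique(np.concatenate([rater1, rater2]))
p_e = sum(np.mean(rater1 == c) * np.mean(rater2 == c) for c in categories)
kappa = (p_o - p_e) / (1 - p_e)                      # chance-corrected (~0.58)
print(p_o, kappa)
```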
Pearson's r
Pearson's r: The correlation between scores on the first test and scores on the retest is used to estimate the reliability of the test, using the Pearson product-moment correlation coefficient.
Factor Score
Factor score: Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. A factor score is a numerical value that indicates a person's relative spacing or standing on a latent factor.
Composite Score
Composite score: calculated from data in multiple variables in order to form reliable and valid measures of latent, theoretical constructs. The variables which are combined to form a composite score should be related to one another. This can be tested through factor analysis and reliability analysis.
Composite scores are preferable when there are multiple variables.
Under what conditions would we not expect (or desire) high test-retest reliability?
We would not expect or desire high test-retest reliability when we are administering an intervention. We would want the test-retest scores to be different, showing that our intervention is working.
Measurement Levels
Measurement levels:
o Nominal
o Ordinal
o Equal-interval
o Ratio
Why do Bartko & Carpenter advocate using kappa (rather than % agreement or a correlation coefficient) to index agreement between raters or tests?
Percent agreement ignores agreements expected by chance, whereas kappa (κ) is a chance-corrected, scaled reliability measure.
In what circumstances would it be preferable to use a weighted kappa instead of an unweighted kappa?
Unweighted kappa does not take the degree of disagreement into account: every disagreement is treated as total disagreement. When the categories are ordered, it is therefore preferable to use weighted kappa, assigning weights w_i to subjects for whom the raters differ by i categories, so that partial agreement contributes to the value of kappa.
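A sketch of linear-weighted kappa for three ordered categories (toy severity ratings 0 to 2): disagreeing by one category is penalized less than disagreeing by two:

```python
# Linear-weighted kappa: kappa_w = 1 - sum(w*obs) / sum(w*expected).
import numpy as np

r1 = np.array([0, 1, 2, 2, 1, 0, 2, 1])
r2 = np.array([0, 2, 2, 1, 1, 0, 0, 1])
n_cat = 3

# Observed and chance-expected proportion tables.
obs = np.zeros((n_cat, n_cat))
for a, b in zip(r1, r2):
    obs[a, b] += 1
obs /= len(r1)
exp = np.outer(np.bincount(r1, minlength=n_cat),
               np.bincount(r2, minlength=n_cat)) / len(r1) ** 2

i, j = np.indices((n_cat, n_cat))
w = np.abs(i - j) / (n_cat - 1)          # linear disagreement weights
kappa_w = 1 - (w * obs).sum() / (w * exp).sum()
print(kappa_w)
```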
ICC
Intraclass correlation coefficient, abbreviated ICC, is a descriptive statistic that can be used when quantitative measurements are made on units that are organized into groups. It describes how strongly units in the same group resemble each other. While it is viewed as a type of correlation, unlike most other correlation measures it operates on data structured as groups, rather than data structured as paired observations.
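One common variant, ICC(1), can be computed from one-way ANOVA mean squares. A sketch with hypothetical data (4 targets each rated by 3 raters):

```python
# ICC(1) = (MSB - MSW) / (MSB + (k-1) * MSW), targets as groups.
import numpy as np

ratings = np.array([[9, 8, 9],     # rows: targets (groups), cols: raters
                    [5, 6, 5],
                    [7, 7, 8],
                    [2, 3, 2]])
n, k = ratings.shape
grand = ratings.mean()
ms_between = k * ((ratings.mean(axis=1) - grand) ** 2).sum() / (n - 1)
ms_within = ((ratings - ratings.mean(axis=1, keepdims=True)) ** 2).sum() / (n * (k - 1))
icc1 = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
print(icc1)   # near 1: ratings within a target resemble each other
```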
What type of classification error would you try to minimize in a screening study. Why?
You would try to minimize False Negatives because you want to identify all possible persons with the condition so that they can be treated or included in the study. In the case of screening for suicidality it may cost someone’s life for a case to be classified falsely as a negative.
What is the base rate problem in using a test for classification?
When the base rate (prevalence) of the condition being classified is very low or very high, even a test with good sensitivity and specificity performs poorly as a classifier. With a rare condition, most positive results are false positives, so the test's positive predictive value is low, and simply betting on the base rate can outperform the test. This is the point of Dawes's note below: a test's clinical efficiency cannot be judged apart from the base rates in the group being tested.
Sensitivity
Sensitivity: measures the proportion of actual positives which are correctly identified as such (e.g. the percentage of sick people who are correctly identified as having the condition).
Specificity
Specificity: measures the proportion of negatives which are correctly identified (e.g. the percentage of healthy people who are correctly identified as not having the condition).
What are sensitivity and specificity?
Sensitivity and specificity are statistical measures of the performance of a binary classification test, also known in statistics as classification function.
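Both quantities fall straight out of a 2x2 confusion table. A sketch with hypothetical counts:

```python
# Sensitivity and specificity from confusion counts.
tp, fn = 45, 5      # sick people: correctly flagged vs. missed
tn, fp = 80, 20     # healthy people: correctly cleared vs. falsely flagged

sensitivity = tp / (tp + fn)   # 0.90: proportion of actual positives caught
specificity = tn / (tn + fp)   # 0.80: proportion of actual negatives cleared
print(sensitivity, specificity)
```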
PPV
Positive Predictive Value: In statistics and diagnostic testing, the positive predictive value, or precision rate is the proportion of positive test results that are true positives (such as correct diagnoses). It is a critical measure of the performance of a diagnostic method, as it reflects the probability that a positive test reflects the underlying condition being tested for. Its value does however depend on the prevalence of the outcome of interest, which may be unknown for a particular target population.
NPV
Negative Predictive Value: In statistics and diagnostic testing, the negative predictive value (NPV) is a summary statistic used to describe the performance of a diagnostic testing procedure. It is defined as the proportion of subjects with a negative test result who are correctly diagnosed. A high NPV for a given test means that when the test yields a negative result, it is most likely correct in its assessment. In the familiar context of medical testing, a high NPV means that the test only rarely misclassifies a sick person as being healthy.
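Holding sensitivity and specificity fixed and varying the base rate shows why predictive values, unlike sensitivity and specificity, depend on prevalence (hypothetical values):

```python
# PPV and NPV as a function of prevalence: the base rate problem in numbers.
sens, spec = 0.90, 0.90

for base_rate in (0.50, 0.10, 0.01):
    ppv = (sens * base_rate) / (sens * base_rate + (1 - spec) * (1 - base_rate))
    npv = (spec * (1 - base_rate)) / (spec * (1 - base_rate) + (1 - sens) * base_rate)
    print(f"prevalence={base_rate:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
# At 1% prevalence, PPV drops to ~0.08: most positives are false positives.
```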
ROC Curve
ROC curve: is a graphical plot which illustrates the performance of a binary classifier system as its discrimination threshold is varied. It is created by plotting the fraction of true positives out of the positives (TPR = true positive rate) vs. the fraction of false positives out of the negatives (FPR = false positive rate), at various threshold settings. TPR is also known as sensitivity, and FPR is one minus the specificity or true negative rate.
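A ROC curve can be traced by sweeping the cutting score across hypothetical test scores and recomputing TPR and FPR at each cut:

```python
# Tracing a ROC curve by varying the discrimination threshold.
import numpy as np

scores = np.array([0.9, 0.8, 0.75, 0.6, 0.55, 0.4, 0.3, 0.2])
labels = np.array([1,   1,   0,    1,   0,    0,   1,   0])   # 1 = condition present

for thresh in np.unique(scores)[::-1]:
    pred = scores >= thresh
    tpr = (pred & (labels == 1)).sum() / (labels == 1).sum()  # sensitivity
    fpr = (pred & (labels == 0)).sum() / (labels == 0).sum()  # 1 - specificity
    print(f"cut={thresh:.2f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```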
Cutting Scores
Cutting scores: are points at which a continuum of scores may be divided into groups. This might indicate pass/fail or be used to arrive at diagnostic groups as in the case of the levels of mental retardation linked to scores on intelligence measures and so aid in test interpretation.
Dawes, A Note on Base Rate and Psychometric Efficiency
• the clinical efficiency of a test depends upon the proportions of the test subjects belonging to the categories the test is attempting to differentiate.
• Bayes Theorem equation
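As applied here, Bayes' theorem for a positive test result can be written (notation mine, not from the card): P(condition | positive) = (Se x BR) / (Se x BR + (1 - Sp) x (1 - BR)), where Se is sensitivity, Sp is specificity, and BR is the base rate. Plugging a small BR into this expression reproduces the collapse in predictive value sketched numerically above.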
What are the ways qualitative methods are useful for developing hypotheses?
They yield a breadth of open-ended empirical material from which hypotheses can be formulated, detail that pre-structured quantitative data do not provide.
What kinds of information are collected better through qualitative methods than quantitative?
Open ended responses
What are some advantages of real time assessment compared to traditional self-report measures (e.g., a paper & pencil task asking about past-week behavior)?
Real-time assessment avoids retrospective recall bias, since respondents do not have to reconstruct a week of behavior from memory, and it reduces the socially desirable responding that distorts traditional self-report.
What is Diary Method?
Diary: any data collection strategy which entails getting respondents to provide information linked to a temporal framework. It allows the medium of the record to be chosen so as to best suit the topic and the type of respondent studied.
What is Narrative Method?
Narrative: an individual’s story or account of their experience of events or people from the present or the past. It may have a disjointed temporal frame.
Item Curve
Item Curve: An item characteristic curve (ICC) displays the probability of a correct (or endorsed) response as a function of the respondent's ability, that is, the proportion of examinees at a given ability level expected to get the item right.
Item Response Theory
Item Response Theory: IRT, also known as strong true score theory, is a paradigm for the design, analysis, and scoring of tests, questionnaires, and similar instruments measuring abilities, attitudes, or other variables. Unlike simpler alternatives for creating scales, such as simply summing questionnaire responses, it does not assume that each item is equally difficult. This distinguishes IRT from, for instance, the assumption in Likert scaling that "all items are assumed to be replications of each other, or in other words, items are considered to be parallel instruments."
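The two-parameter logistic (2PL) item response function makes the "items differ in difficulty" point concrete (parameter values below are made up):

```python
# 2PL model: each item has its own difficulty b and discrimination a.
import numpy as np

def p_correct(theta, a, b):
    """Probability of a correct/endorsed response at ability theta."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

theta = np.linspace(-3, 3, 7)
print(p_correct(theta, a=1.5, b=0.0))   # a steep, medium-difficulty item
print(p_correct(theta, a=0.7, b=1.0))   # a flatter, harder item
```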
Factor Analysis
Factor Analysis: Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. In other words, it is possible, for example, that variations in three or four observed variables mainly reflect the variations in fewer unobserved variables. Factor analysis searches for such joint variations in response to unobserved latent variables. The observed variables are modeled as linear combinations of the potential factors, plus "error" terms. The information gained about the interdependencies between observed variables can be used later to reduce the set of variables in a dataset. Computationally this technique is equivalent to low rank approximation of the matrix of observed variables.
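A minimal exploratory sketch using scikit-learn's FactorAnalysis on simulated data: six observed variables are generated from two latent factors, and the estimated loadings recover that structure. The transform step also illustrates the factor scores from the earlier card:

```python
# Exploratory factor analysis on simulated two-factor data.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(3)
factors = rng.normal(size=(300, 2))                   # latent variables
loadings = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.0],
                     [0.0, 0.8], [0.1, 0.9], [0.0, 0.7]])
X = factors @ loadings.T + rng.normal(0, 0.4, size=(300, 6))  # observed = F*L' + error

fa = FactorAnalysis(n_components=2).fit(X)
print(fa.components_.round(2))   # estimated loadings recover the 2-factor pattern
scores = fa.transform(X)         # factor scores for each respondent
```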
Two Types of Factor Analysis
1. Confirmatory
2. Exploratory
How is factor analysis used for scale development (a general idea, not technical details)?
Factor analysis can be useful for validating both unidimensional and multidimensional scales. For unidimensional scales, factor analysis can be used to explore possible subdimensions within the group of items selected. With multidimensional scales, factor analysis can be used to verify that the items empirically form the intended subscales.
Attenuation
Attenuation: reliability places an upper limit on the observed associations among measures.
Correction for attenuation is a statistical procedure, due to Spearman (1904), to "rid a correlation coefficient from the weakening effect of measurement error" (Jensen, 1998), a phenomenon also known as regression dilution. The correlation between two sets of parameters or measurements is estimated in a manner that accounts for measurement error contained within the estimates of those parameters.
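Spearman's correction itself is one line; with illustrative numbers:

```python
# Correction for attenuation: r_true = r_observed / sqrt(rxx * ryy).
observed_r = 0.30      # observed correlation between measures X and Y
rxx, ryy = 0.70, 0.60  # reliabilities of X and Y

corrected_r = observed_r / (rxx * ryy) ** 0.5
print(corrected_r)     # ~0.46: estimated correlation between the true scores
```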
What are 3 ways to improve reliability of measures and how do these work?
1. Include more items: random errors on individual items tend to average out across items, so the total score contains proportionally more true-score variance (the logic of the Spearman-Brown formula).
2. Include more response categories within an item: finer gradations capture more of the true variation among respondents rather than lumping different standings into a single category.
3. Use measurement models statistically: latent-variable models (e.g., factor-analytic or IRT models) separate true-score variance from error variance when estimating scores.
External criterion methods
Requires you to specify one or preferably more external variables that theory or practice says should be substantively related to the scale’s construct.
Item selection is the key question in scale construction by external criterion methods. The three most-used methods of item selection are:
1. the group difference method
2. the item validity method
3. the multiple regression method
Guttman Scaling
Guttman scaling: The purpose of Guttman scaling is to establish a one-dimensional continuum for a concept you wish to measure. Essentially, we would like a set of items or statements such that a respondent who agrees with any specific question in the list will also agree with all previous questions. Put more formally, we would like to be able to predict item responses perfectly knowing only the total score for the respondent, as checked in the sketch below.
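The "predict responses from the total score" property can be checked directly on a toy response matrix whose items are ordered from easiest to hardest:

```python
# A perfect Guttman pattern: a total of s means "endorsed the s easiest items."
import numpy as np

responses = np.array([[1, 1, 1, 0],    # items ordered easiest -> hardest
                      [1, 1, 0, 0],
                      [1, 0, 0, 0],
                      [1, 1, 1, 1]])
totals = responses.sum(axis=1)
predicted = (np.arange(responses.shape[1]) < totals[:, None]).astype(int)
print((predicted == responses).all())  # True: no Guttman errors in this toy set
```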
Q-Sort Methods
Q-sort: a psychological test requiring subjects to sort items relative to one another along a dimension such as ``agree''/``disagree'' for analysis by Q-methodological statistics.
Schwarz (1999): How does context influence responses to measures?
People have preconceived expectations about what kind of answers are wanted in responses. They will answer based on what they think you mean. EX. If asked how many times you were angered in the last week, respondents will give lots of detailed examples of every time they were frustrated or angry. If asked how many times angry in the last year, respondent assumes you mean severe cases and gives limited accounts about severe instances of feeling angry.
What is measurement invariance and why is it important?
Measurement invariance or measurement equivalence is a statistical property of measurement that indicates that the same construct is being measured across some specified groups. For example, measurement invariance can be used to study whether a given measure is interpreted in a conceptually similar manner by respondents representing different genders or cultural backgrounds. Violations of measurement invariance may preclude meaningful interpretation of measurement data. Tests of measurement invariance are increasingly used in fields such as psychology to supplement evaluation of measurement quality rooted in classical test theory.
Types of Measurement Invariance
1. Configural: met when the items exhibit the same basic pattern of salient and nonsalient loadings across the groups studied.
2. Metric: met when, in addition to configural invariance, the loadings are not significantly different across groups.
3. Scalar: met when, in addition to metric invariance, the item intercepts do not differ across groups, so that comparisons of group means are meaningful.
For comparing groups measurement invariance is important so...
so that differences in the outcomes are actually due to differences between the groups and not differences in the measurement.
For studies of development measurement invariance is important because...
because you want to know that progresses or impairments in development account for what is being measured and not variations in the measurement instrument.
How is factor analysis used in testing invariance?
The relationship between observed variables and hypothesized underlying constructs can be modeled using confirmatory factor analysis (CFA). It is used to test whether measures of a construct are consistent with a researcher's understanding of the nature of that construct (or factor). As such, the objective of confirmatory factor analysis is to test whether the data fit a hypothesized measurement model. In confirmatory factor analysis, the researcher first develops a hypothesis about what factors s/he believes are underlying the measures s/he has used (e.g., "Depression" being the factor underlying the Beck Depression Inventory and the Hamilton Rating Scale for Depression) and may impose constraints on the model based on these a priori hypotheses. By imposing these constraints, the researcher is forcing the model to be consistent with his/her theory. For example, if it is posited that there are two factors accounting for the covariance in the measures, and that these factors are unrelated to one another, the researcher can then test whether a model with those constraints adequately fits the observed data.
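A minimal CFA sketch in Python using the semopy package (assumed installed; it accepts lavaan-style model syntax). The data, variable names, and loadings below are simulated stand-ins, not from the source:

```python
# One-factor CFA on simulated data with semopy (assumed available).
import numpy as np
import pandas as pd
from semopy import Model

rng = np.random.default_rng(5)
dep = rng.normal(size=400)                           # latent "Depression"
df = pd.DataFrame({
    "bdi":  0.8 * dep + rng.normal(0, 0.6, 400),     # hypothetical indicators
    "hamd": 0.7 * dep + rng.normal(0, 0.7, 400),
    "phq":  0.9 * dep + rng.normal(0, 0.5, 400),
})

model = Model("Depression =~ bdi + hamd + phq")      # the a priori constraint
model.fit(df)
print(model.inspect())                               # estimated loadings, variances
```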
Advantages & disadvantages of online data collection
The advantages: Easy and available, affordable
The disadvantages: Limited to what is online; samples of internet users may not represent the broader population; little control over testing conditions; not customized.
What are some benefits of using Facebook as a source of research data
1. Plethora of readily available information
2. Studies show that most profiles are accurate representations of a person