337 Cards in this Set


Census VS Survey

A census enumerates the whole population. A survey evaluates samples (subsets) of the population instead of the whole population.

What is a survey?

A system for collecting information to describe, compare, or explain knowledge, attitudes, and behavior.

How do surveys collect this information?

By asking individuals questions to generate statistics on the group(s) that those individuals represent.

What is the main purpose of a survey?

To produce statistics.

What is the main way of collecting survey information?

By asking people questions.

From where do surveys collect information?

A fraction of a population (a sample), rather than the whole population

Areas that use surveys

Political research, sociological research, epidemiologic research, psychological research, marketing research

How many survey activities of a survey system are there? What are they?

1. Setting objectives for information collection


2. Designing the study


3. Preparing a reliable and valid instrument


4. Administering the survey.


5. Managing survey data


6. Analyzing survey data


7. Reporting the results

What is a research question?

Conceived prior to the study. Tend to be based on more extensive background literature and will lead to more refined efforts at data collection

What is a hypothesis?

A suggested explanation of a phenomenon or reasoned proposal suggesting a possible correlation between multiple phenomena.

How do we form research questions?

1. Always begin with a literature review


2. Formulate a hypothesis


3. Can you test the hypothesis with a survey?


4. Is the question relevant, important, timely?

What's the most important characteristic of a research question?

specific

What to include to make a research question specific?

Purpose, specific objective, target population, and survey population

Three common ways of stating hypotheses

1. Positive declaration


2. Negative declaration


3. Implicit question

Positive Declaration

Research hypothesis


-The infant mortality rate is higher in one region than another


Negative Declaration

Null Hypothesis


-There is no difference between the infant mortality rates of two regions

Implicit question

To study the association between infant mortality and geographic region of residence


What are the methods of hypothesis formulation?

1. Method of difference


2. Method of Agreement


3. Method of concomitant variation


Method of difference

Recognizing that the frequency of something is different in two sets of circumstances


Method of Agreement

A single factor is common to a number of circumstances in which disease occurs with high frequency

Method of concomitant variation

Frequency of factor varies in proportion to frequency of disease

What terms must hypotheses be put into?

Operational

How are hypotheses tested?

In the form of experiments and tests.

Are hypotheses completely confirmed or rejected often?

No

Types of study designs

Observational


-Cross sectional


-retrospective (non-concurrent)


-prospective (concurrent)


-Pre/post-test



Experimental/Quasi-experimental

Summary of survey/research questions

1. What are you interested in finding out


2. Whom do you want to study


3. Where are these people or organizations located


4. When do you want to do the survey


5. What do you expect to learn and why?

What consists of the survey check list?

1. Set objectives for information collection


2. Design research


3. Prepare a reliable and valid data collection instrument


4. Analyze data


5. Report the results

National Health Interview Survey (NHIS)

Multipurpose health survey conducted by the NCHS, CDC and is the principal source of information on the health of the civilian, non-institutionalized, household population of the US



Started in 1957

DHS Surveys (demographic and health surveys)

-large sample sizes, representative


-provide data for a wide range of monitoring and impact evaluation indicators in the areas of population, health and nutrition.


-every 5 years


-standard core questionnaire and other questionnaire (women's questionnaire)

Population council

Leader in conducting studies on a range of reproductive health issues


-horizons AIDSQuest

NHANES (National health and nutrition examination survey)

-conducted by the National Center for Health Statistics (NCHS)/CDC


-collect info about the health and diet of people in the US


-home interview and mobile exam center

Why sample?

Time, cost, quality/accuracy

Three general objectives of surveys

-Description (distribution of traits or attributes)


-Comparison and Explanation (multivariate analysis)


-Exploration (raise ideas, not answer research questions, not describe pop)

Units of analysis

1. Individual, often


2. Aggregate (families, cities, nation) sometimes

Can a survey include more than one unit of analysis?

Yes

Ecologic Fallacy

Bias that may occur because an association observed between variables on an aggregate or group level does not necessarily represent the association that exists at an individual level

Element

unit about which data is collected and which provides the basis of analysis

Universe

theoretical and hypothetical aggregation of all elements as defined for a specific survey

Population

theoretically specified aggregation of survey elements (working and operationalized definition, includes careful specification)

survey population

aggregation of elements from which the survey sample is actually selected



universe/population

Sampling unit

elements or set of elements considered for selection



aka: primary sampling unit (PSU)


single stage sampling=elements


multi-stage=different levels of sampling units

sampling frame

actual list of sampling units from which sample is selected

observation unit

element or aggregation of elements from which information is collected

Often unit of analysis=

unit of observation



but not always

variables

set of mutually exclusive characteristics

parameter

summary description of variable in population

statistic

summary description of variable in sample

sampling error

estimate of degree of random error to be expected from given sample design

Can you always ensure that all those in a population have a chance to be selected?

No, hardly ever



Limiting the population helps with this

Probability sampling

-representative


-every member of the target population has a known non-zero probability of being included in the sample


-implies the use of random selection


-evaluate sample by the way it was selected

Key issues of probability sample designs

-sample can only be representative of the population included in the sampling frame


-each person must have a known chance of selection


-no researcher discretion


-size and specific procedures of sampling design will influence precision of sample characteristics and correspondence to population

Quality of sampling directly associated with

quality of sampling frame

T/F Sampling frames cannot occur spontaneously.

False, but they should be documentable

The findings of the sample survey are only representative of

aggregation of elements that comprise the sampling frame

Sampling frames do not truly include all

elements their names imply

Elements should only appear

once

Types of samples

-simple random sampling: simple random sample, systematic samples


-stratified sampling: can be used with both SRS and MRS


-Multistage random sample (MRS): two stage or more, area probability samples, multistage cluster

Simple random sampling

-basic sampling method assumed in stat analysis


-assign each element a unique number, then use random number table or generator to select sample


-each element is sampled from the sampling frame one at a time, independent of one another and without replacement
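
A minimal sketch of the procedure described above, assuming the sampling frame is just a Python list of element IDs (the frame size, sample size, and seed are illustrative):

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n elements from the sampling frame without replacement."""
    rng = random.Random(seed)     # reproducible random number generator
    return rng.sample(frame, n)   # every element has an equal, known chance of selection

# Illustrative frame of 10,000 numbered elements; draw a sample of 100
frame = list(range(1, 10_001))
sample = simple_random_sample(frame, n=100, seed=42)
print(len(sample), sample[:5])
```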

Issues in simple random sampling

1. usually not possible


2. laborious process


3. due to natural variability, may end up with non-representative sample


4. Less precision than desired may result

Systematic Sampling

-commonly used instead of simple random sample if sampling frame is available


-every kth element selected for inclusion


-random start to ensure no bias


sampling interval

standard distance between elements

sampling ratio

proportion of elements in population selected

How to get sampling interval and ratio

Want n=1000


Have N=10,000



1000/10000=1/10


interval (k) = N/n = 10 (round down if not an integer)


ratio=1/10th
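
A small sketch of the same interval calculation and selection, assuming a frame of N = 10,000 and a desired n = 1,000 (variable names are illustrative):

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th element after a random start, as described above."""
    N = len(frame)
    k = N // n                                # sampling interval (rounded down)
    start = random.Random(seed).randrange(k)  # random start to avoid selection bias
    return frame[start::k]

frame = list(range(1, 10_001))                # N = 10,000
sample = systematic_sample(frame, n=1_000, seed=1)
print(len(sample))                            # 1,000 elements; sampling ratio = 1/10
```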

Issues in systematic sampling

-Periodicity


-if there is periodicity in list, need to randomly reorganize before starting


-easier to do than simple random sample


-known probability, no researcher discretion

Stratified random sampling

-Population is divided into subgroups and a sample is then randomly selected from each strata


-Appropriate numbers of elements are drawn from homogenous subsets of the population (heterogeneity between subsets)

Stratified sampling does not occur instead of simple random sample, but

in addition to

Both SRS and SS ensure

a degree of representativeness and permit estimate of error

Stratified sampling permits a greater degree of

representativeness, thus decreasing probable sample error

Stratified increases

representativeness, efficiency

Stratified sampling makes more heterogenous or homogenous sample

homogenous

Stratified sampling decreases

error and variability because it ensures appropriate numbers of elements from homogenous groups

Disproportionate sampling

-oversample stratum with variability to increase precision of estimate


-increases n for subpopulation without increasing total N


-important to weight data accordingly in analysis (or oversampled group could be counted more)
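
A toy sketch of design weighting for an oversampled stratum; the strata, counts, and stratum means below are invented for illustration:

```python
# Invented population and sample counts for two strata; stratum B is oversampled
population = {"A": 9_000, "B": 1_000}
sample_n   = {"A":   450, "B":   550}

# Design weight per stratum = population count / sample count
weights = {s: population[s] / sample_n[s] for s in population}

# Weighted mean of some measured value, using invented stratum means
stratum_mean = {"A": 2.0, "B": 5.0}
weighted_total = sum(weights[s] * sample_n[s] * stratum_mean[s] for s in population)
weighted_mean = weighted_total / sum(population.values())
print(round(weighted_mean, 2))  # 2.3, versus a raw (unweighted) sample mean of 3.65
```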

NHANES oversamples

African Americans, Mexican Americans, low-income white Americans, ages 12–19 years, and ages 60+ years

When are other places/times to weight data

-corrections of errors in selection, non-response, proportionate sampling where initial figures were incorrect

NHANES sample weight

-assigned to each sample person


-is a measure of the number of people in the population represented by that sample person


-reflects unequal probability, nonresponse adjustment, and adjustment to independent population controls

When do you sometimes choose Multistage Sampling

when no sampling frame of elements exists

The two steps of multi-stage sampling

-listing then sampling


-with stratification as desired at either or both levels of sampling

Issues with multistage sampling

-improved efficiency, but decreased accuracy


-subject to two(+) sampling errors


-issue of representativeness at two (or more) levels

A cluster is

a naturally occurring unit

Sampling error is reduced by two factors

-increased sample size


-increase homogeneity of elements being sampled

A sample of clusters will best represent all clusters if

a large number are selected and if all clusters are similar

A sample of elements will best represent all elements in a given cluster if

large number of elements are selected and if all elements are similar

Clusters are less representative than

a simple random sample

Elements within clusters are usually more homogeneous than

all elements of the population

If you increase the number of clusters you must

decrease the number of elements

Area probability sampling

-type of multi-stage sampling strategy


-divide land area into exhaustive mutually exclusive sub-areas


-key: random selection

Random Digit Dialing

-version of multi-stage sampling


-relatively easy


-inexpensive


-large number of unfruitful calls


-excludes those without phones


-may be difficult to ascertain geographic area


-reliance on cell phones as main phone


-may randomly select the four-digit suffixes

For RDD you delineate...



Then you...

geographic boundaries of sampling area



identify all exchanges used in geographic area and distribution of prefixes within area
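
A toy sketch of generating RDD numbers once the area code and exchanges (prefixes) are known; the area code and prefixes here are invented:

```python
import random

def rdd_numbers(area_code, prefixes, count, seed=None):
    """Attach random four-digit suffixes to the known exchanges (prefixes)."""
    rng = random.Random(seed)
    numbers = []
    for _ in range(count):
        prefix = rng.choice(prefixes)   # could instead be stratified by prefix
        suffix = rng.randrange(10_000)  # random four-digit suffix, 0000-9999
        numbers.append(f"{area_code}-{prefix}-{suffix:04d}")
    return numbers

print(rdd_numbers("555", ["231", "388", "766"], count=3, seed=7))
```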

For RDD you may stratify based on

the distribution of prefixes

RDD provides a nonzero

chance of reaching any household within a sampling area that has a telephone line regardless of whether the number is listed


Is the probability of reaching every household equal for RDD?

No, households with more phones have greater probability


-adjust for unequal probability by weighting

Non-probability sampling

sampling in which some members of the eligible target population have a chance of being chosen for participation in the survey and others do not

Interviewer discretion/or respondent characteristics

affect the likelihood of being included in a sample

Accidental NPS

whatever cases happen to be available

Purposive NPS

planned selection of specific types of cases

When to do NPS

1. When probability sampling is too expensive


2. When purposive or judgmental sampling is better, so can create own sample to meet needs of study


3. hard to reach populations


4. pilot studies


5. surveys of specific groups

Common methods of non-probability sampling

1. convenience


2. most similar/dissimilar cases


3. typical cases


4. critical cases


5. snowball


6. quota

Convenience NPS

select cases based on their availability for the study

Most similar/dissimilar cases NPS

select cases that are judged to represent similar conditions or, alternatively, very different conditions

Typical Cases NPS

select cases that are known beforehand to be useful and not to be extreme


Critical Cases NPS

select cases that are key or essential for overall acceptance or assessment

Snowball NPS

group members identify additional members to be included in sample

Quota NPS

interviewers select sample that yields the same proportions as the population proportions on easily identified variables

In quota sampling, the population is divided into... and the numbers of

subgroups



subgroup members in the sample are proportional to the numbers in the larger population

What do you need to know for quota sampling?

Population proportions

Quota sampling is done in effort to

reduce/minimize bias


Quota sampling ensures

proportionate representation of various categories or respondents

In quota sampling, sample elements can be weighted so...

the data provides a reasonable representation of total population

Problems with NPS designs

1. not necessarily representative of population


2. cannot necessarily apply findings to population


3. need to test sample and population characteristics to see how similar or different they are


4. May be useful for what you are doing


5. know limitations

Additional NP techniques

-focus groups: group asked opinions about a product/service/concept


-key informant interviews: qualitative in-depth interview with people who have knowledge or expertise in the field of interest, community, etc.

Advantages of Stratified VS SRS

1. Homogeneity within each stratum – smaller variation within strata produces stratified sampling estimators with smaller variance than SRS estimators – for the same sample size.


2. Separate estimators for population parameters can be obtained for each stratum


3. Cost can be less for STRS than SRS if strata represent locations.

Advantages of Cluster Sampling:

Feasibility: The only frames available are lists of clusters.



Economy: Listing costs and traveling costs tend to be lower for cluster sampling (the field costs are lower).

Single -Stage Cluster Sampling

All listing units in the chosen clusters are selected

Two-Stage Cluster Sampling

First selecting a simple random sample of clusters and then selecting a simple random sample of listing units from each sampled cluster

Disadvantages of Cluster Sampling

Sample estimates based on cluster samples tend to have higher standard error than those based on other sampling plans, for the same sample size

3 factors that influence sample representativeness

sampling procedure


sample size


participation (response)

sampling error

chance variation of sample values around the true population value

Bias

the systematic way in which the people responding to a survey differ from the target population as a whole

SE of a mean equation

SE = √(variance / n) = SD / √n

SE of proportions

SE = √(p(1 − p) / n)
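
A minimal sketch of the two formulas above (the variance, proportion, and n are arbitrary):

```python
import math

def se_mean(variance, n):
    """Standard error of a mean: sqrt(variance / n), i.e., SD / sqrt(n)."""
    return math.sqrt(variance / n)

def se_proportion(p, n):
    """Standard error of a proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

print(se_mean(variance=25, n=400))   # 0.25
print(se_proportion(p=0.30, n=400))  # about 0.023
```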

How is sampling error reduced

homogenous populations, large sample

Criteria for estimating survey sample size

1. identify major study variables


2. determine the types of estimates of study variables such as means or proportions


3. select the population or subgroups of interest


4a. indicate what you expect the population value to be


4b. estimate the SD


5. Decide on level of confidence


6. decide on tolerable range of error in estimate


7. compute sample size based on study assumptions
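
A minimal sketch of step 7 for a single proportion, using the common normal-approximation formula n = z² p(1 − p) / d²; the expected proportion, confidence level, and tolerable error below are assumptions you would supply:

```python
import math

def sample_size_proportion(p, d, z=1.96):
    """Sample size for estimating a proportion p within margin of error d.

    z = 1.96 corresponds to 95% confidence.
    """
    return math.ceil((z ** 2) * p * (1 - p) / (d ** 2))

# Expect p around 0.5 (most conservative); tolerate +/- 5 percentage points at 95% confidence
print(sample_size_proportion(p=0.5, d=0.05))  # 385
```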

Formulas for precision

When sample size increases, what happens to variability and power?

Variability (sampling error) decreases; power increases

Does increasing the sample size reduce systematic error?

No

Non-sampling errors arise from:

poor definitions of the target population and non-response

Non-sampling errors can lead to:

bias, inability to interpret results

Non-sampling errors violate

the point of probability sampling

What is non-response?

-The failure to collect data from subjects selected to be in a sample


-leads to biased estimates (usually unsure of direction)


-potentially one of the most important sources of systematic error (bias) in a study

What are the 4 categories of nonresponders

1. Those whom the data collection procedures do not reach.


2. Those who refuse to provide data


3. Those who are unable to provide data


4. Item non-response

Response rate

(number of respondents / number sampled) × 100
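
For example (illustrative numbers): 600 respondents out of 800 people sampled gives a response rate of (600 / 800) × 100 = 75%.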

Minimum response rate is

75%



-adequate 50%


-good 60%


-very good 75%

If non response does not depend on outcome then is the chance of bias greater or less

less

bias due to non-response is a function of:

the response rate, the extent to which non-responders differ from the sample population

As non response increases, the range of possible bias...

increases

Response rate in rural areas is ...than urban areas

higher

Response rate is _ when there is a designated respondent than when you take any responsible adult

lower

response rates are higher for topics that

interest people

Why are response rates lower in urban areas?

-Higher proportion of single individuals


-more high-rise dwellings (less accessible)


-interviewers may be uncomfortable in urban areas at night

Response rates frequently differ by

data collection method

Who is more likely to respond to mail surveys?

those with an interest in the topic and better educated people

Ways to maximize response rates in mail surveys

-make it attractive and professional


-professional endorsement


-pre-notification


-offer a monetary incentive


-quick, easy to read and complete


-include stamped, addressed return envelopes


-reminders


-follow up non-responders by telephone


-ensure anonymity (where needed)

Confidential

 Keep name/personal information with access limited to specific people (or classes of people)

Anonymous

No personal identification given – completely disconnected from person’s identity

How to max response in phone and personal interviews

1. vary call times, repeat unsuccessful calls


2. have flexible interview times


3. send informational letters ahead of time


4. communicate the survey purpose and importance to participant


5. use effective interviewers-training

How to correct for non-response

1. use proxy respondents


2. use statistical adjustments such as weighting subgroups to match their rate in the sample population


3. survey non-responders

Wave analysis

1. compare those who responded in your first wave to those who responded only after follow-up


– Assume that late responders are similar to nonresponders; test this


2. Calculate the pattern of response in nonresponders needed to reverse the study conclusions; analyze worst/best case scenarios

When might you sample the entire population?

 When your population is very small


 When you have extensive resources


 When you don’t expect a very high response

A reliable survey is

consistent

A valid survey is

accurate

Precision is not always

necessary or desirable

Accuracy is a better reflection of the

real world

Reliability is a statistical measure of how

reproducible a survey instrument's data are

Lack of reliability may arise from divergences between

observers or instruments of measurement or instability of the attribute being measured

Measurement error

How well or poorly a particular instrument performs in a given population

How to maximize reliability

 Ask people only questions they’re likely to know answers to


 Ask about things relevant to them


 Be clear in what you’re asking

Reliability commonly assessed in three forms

– Test-retest reliability


– Alternate-form reliability


– Internal consistency reliability

Test-retest reliability

• Most common form in surveys


• Measured by having the same respondents complete a survey at two different points in time to see how stable the responses are


• Usually quantified with a correlation coefficient (r value)


• In general, r values are considered good if r ≥ 0.70 (varies by field)
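
A minimal sketch of quantifying test-retest reliability with a correlation coefficient, assuming NumPy is available and using invented paired scores from two administrations:

```python
import numpy as np

# Invented scores from the same respondents at time 1 and time 2
time1 = np.array([4, 3, 5, 2, 4, 1, 5, 3])
time2 = np.array([4, 3, 4, 2, 5, 2, 5, 3])

r = np.corrcoef(time1, time2)[0, 1]  # Pearson correlation between the two administrations
print(round(r, 2))                   # values of roughly 0.70 or higher are considered good
```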

Major source of poor reliability

poor questions

To test-retest observer reliability

have the same observer make two separate measurements; the comparison between the two measurements is intraobserver reliability

For test-retest, make sure it's over _

a short period of time with items or scales that measure variables that are likely to change quickly

Potential problem with test-retest is the



and what does it do

practice effect – Individuals become familiar with the items and simply answer based on their memory of the last answer



can inflate reliability estimates

Alternate-form reliability

Questions or responses are reworded or their order is changed to produce two items that are similar but not identical



It is common to simply change the order of the response alternatives



use same level of difficulty


split-halves method

You can measure alternate-form reliability at the same timepoint or separate timepoints



If you have a large enough sample, you can split it in half and administer one item to each half and then compare the two halves

Internal consistency reliability

Applied not to one item, but to groups of items that are thought to measure different aspects of the same concept



• If internal consistency is low you can add more items or re-examine existing items for clarity

Cronbach’s coefficient alpha

– Measures internal consistency reliability among a group of items combined to form a single scale


– It is a reflection of how well the different items complement each other in their measurement of different aspects of the same variable or quality


– Interpret like a correlation coefficient (0.70 is good)
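
A small sketch of computing Cronbach's alpha from a respondents-by-items matrix, using the standard formula α = (k / (k − 1)) · (1 − Σ item variances / variance of the total score); the responses are invented:

```python
import numpy as np

def cronbach_alpha(items):
    """items: 2-D array, rows = respondents, columns = scale items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                         # number of items in the scale
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Invented responses: 6 respondents, 4 items on a 1-5 scale
data = [[4, 5, 4, 4],
        [2, 2, 3, 2],
        [5, 4, 5, 5],
        [3, 3, 3, 4],
        [1, 2, 1, 2],
        [4, 4, 5, 4]]
print(round(cronbach_alpha(data), 2))  # interpret roughly like a correlation coefficient
```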

Kappa

-evaluates how much better then random chance is the agreement between two tests/observers


-Incorporates random error – chance


-Proportion of observed agreement not due to chance in relation to the maximum non-chance agreement


-Ranges from –1 to 1

Guidelines for Kappa

> 0.75: excellent agreement


0.40–0.75: intermediate to good agreement


< 0.40: poor agreement
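
A minimal sketch of Cohen's kappa for two raters making yes/no judgments, computed as (observed agreement − chance agreement) / (1 − chance agreement); the counts are invented:

```python
# Invented agreement counts for two raters making yes/no judgments
yes_yes, yes_no = 40, 10   # rater 1 said yes; rater 2 said yes / no
no_yes,  no_no  = 5,  45   # rater 1 said no;  rater 2 said yes / no

n = yes_yes + yes_no + no_yes + no_no
observed = (yes_yes + no_no) / n  # proportion of observed agreement

# Chance agreement expected from each rater's marginal "yes" proportion
p1_yes, p2_yes = (yes_yes + yes_no) / n, (yes_yes + no_yes) / n
chance = p1_yes * p2_yes + (1 - p1_yes) * (1 - p2_yes)

kappa = (observed - chance) / (1 - chance)
print(round(kappa, 2))  # compare against the guideline ranges above
```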

4 forms of validity

– Face validity


– Content validity


– Criterion validity


– Construct validity

Face validity

Cursory review of survey items by untrained judges

Content validity

Subjective measure of how appropriate the items seem to a set of reviewers who have some knowledge of the subject matter


– Usually consists of an organized review of the survey’s contents to ensure that it contains everything it should and doesn’t include anything that it shouldn’t

Criterion validity

Measure of how well one instrument measures up against another instrument or predictor


– Assess with correlation coefficient or measure of agreement


– Sensitivity/specificity; discriminant analysis

Concurrent criterion validity

assess your instrument against a “gold standard”

Predictive criterion validity

assess the ability of your instrument to forecast future events, behavior, attitudes, or outcomes

Construct validity

Most valuable and most difficult measure of validity


• It is a measure of how meaningful the scale or instrument is when it is in practical use


• Based on way measure relates to other variables in a system of relationships

Convergent:

Implies that several different methods for obtaining the same information about a given trait or concept produce similar results – similar to alternate-form reliability


– But it is more theoretical and requires a great deal of work, usually by multiple investigators with different approaches

Divergent:

The ability of a measure to estimate the underlying truth in a given area; must be shown not to correlate too closely with similar but distinct concepts or traits

What are methods of data collection?

 Telephone


 Personal Interview


 Self-administered


 Computer


 Group administration


 Mail


 Internet

The way the questionnaire is presented and the mode of administration affect many aspects

 Cost


 Time


 Response rates


 Response styles


 Scale responses:


-more use of extreme categories in self-administered

Issues with methods of data collection

-sampling


-type of population


-question form


-question content


-response rates


-cost


-time

How will designating a respondent influence mode of data collection?

1. If sampling frame is list of individuals – any procedure is feasible


2. If specific respondent is necessary, Q mailed to household may not be best. Interviewer is needed

Ease of contact influences

interviewer admin or self-admin

Question format depends on

closed/open, self admin/interviewer

For telephone surveys you should

 Limit response scales


- 4 maximum (recommended)


 Long list must be read slowly and read again


 Order affects answers


 Break down several-category question into two phases

What cannot be adapted to the telephone?

 Complex descriptions of situations/events


 Questions requiring pictures or visual cues

Sensitive Topics: Self-administered procedures

Respondents do not have to admit socially undesirable behaviors directly to an interviewer

Sensitive Topics: telephone procedures

Impersonality helps people report negative events or behaviors

Sensitive Topics: Personal Interviewers

 Build trust and rapport


 Social desirability bias

Telephone Response

 With random digit dialing, response lower than with personal interview


 Combined with rate of people without phones – nonresponse can be substantial

CASI, CATI, CAPI

-Computer Assisted Self-administered Interviews


– Computer Assisted Telephone Interviewing (e.g., CDC/La-PRAMS)


- Computer Assisted Personal Interviewing

Advantages to Computer Assisted Formats

 Machine readable form


 Ease of complex skip patterns


 Questions adaptable based on previous answers


 Inconsistencies easily identified


 Reduction in missing data


 Instructions given as needed


 Speed of data entry

Disadvantages to Computer Assisted Formats

No control over data entry


 Requires time to make and test


 Making corrections difficult


 Computer failures


 Need for fixed response questions

CASI reduces

Reduces social desirability bias

Personal Interview Advantages

 Face-to-Face


 Effective way to enlist cooperation


 Ability to answer respondent’s questions


 Probe


 Complex instructions/sequences


 For long questionnaires, best method

Personal Interview Disadvantages

 Costly


 Well-trained staff near sample is key


 Longer period for data collection


 Accessibility of sample

Telephone Surveys Advantages

 Lower cost


 Better access to some populations


 Shorter time frame


 Smaller staff possible, location not important


 Response rate better than mail

Telephone Surveys Disadvantages

 Sampling limitations with some populations (no phones)


 Non-response with RDD higher than personal interview


 Questionnaire constraints (see next slide)


 Sensitive questions??

Self-Administered Survey Advantages

 Question style


- Long questions possible


- Several similar questions possible


- Visual aids possible


 Sensitive questions

Self-Administered Survey Disadvantages

 Careful questionnaire design needed


 Question style


-Open ended not useful


 Literacy required


 No quality control


-Questions


-Respondent: who answers

Internet Surveys Advantages

 Low cost


 High speed of returns


 Advantages of self-administered


 Advantages of computer-assisted formats


 Respondent can take time to answer

Internet Surveys Disadvantages

 Sampling limitations


 Challenge in getting cooperation


 Limited open-ended question usefulness

What does the interviewer do?

• Locates and enlists the cooperation of respondents— recruitment


• Assesses eligibility


• Motivates respondents


• Asks questions, records answers, and probes incomplete answers


• QA/QC/TQM

Interviewers can be a source of error when:

 When they do not read questions as worded


 When they probe directively


 When they bias answers by the way they relate to respondents


 When they record answers inaccurately

Conservative estimates based on several observational studies suggest that interviewers change question wording at a rate of

20–40%

Probing

-Required when the initial reading of the question does not provide a satisfactory answer


 Usually begin by repeating the question


 Follow with non-directive probes:


-How do you mean that?


-Tell me more about that


-Anything else?


-Just choose the best answer

Probing: If a definition is provided with the question, interviewer should

reread the definition

Probing: If no definition is provided

the respondent should answer the question with the interpretation that seems best to them

Probes should be included in the

training and SOP Manual

Recording answers allows

No interviewer judgment


 No interviewer summaries


 No interviewer effects


 Open-ended answers should be recorded verbatim

How to detect interviewer-related error

 Direct observation of interviewers


 Validating survey answers


 Associating interviewers with the answers they obtain

There will always be an effect of the interviewer in a

face to face interview

Protocol/plan:

A guide for a study

Research proposal

a plan written to seek approval for research from a supervisor or organization

Protocols Contain

Description of study: abstract, questions or hypotheses, goals and objectives, methods (survey design, sampling, sample size, instruments, reliability and validity, data collection methodology, nonresponse, etc.):


 What you plan to do


 Who will do it


 To whom it will be done


 When it will be done


 Where it will be done


 What you hope to learn


 Procedures, guidelines, responsibilities


 Serves as training guide, instructions, reference

Survey Instrument for interviews

A script for interviewers, including introductions, instructions, and questions

Survey instrument for self-admin surveys

questionnaire (Includes all questions for subjects, instructions, and response sets)

Essential steps in developing a survey instrument

-Statement of purposes


 What do you want to accomplish with the survey? Research questions, objectives


-List of the variables to be measured


 Group into logical categories


-Draft analysis plan


 What are your dependent, independent variables? What are potential confounders?

Preliminary Question Design Steps

1. Interdisciplinary research group


a. What are the research questions/ objectives/hypotheses?


2. Focus groups


3. Draft questions


4. Cognitive laboratory interviews


5. Formal pre-testing


6. Pilot

Focus group objectives

To compare the reality about which respondents will be answering questions with the abstract concepts embedded in the study objectives

Cognitive Laboratory Interviews: what are they and what is the goal?

 Respondents are brought into a laboratory setting


-May be videotaped


-Interviews conducted by cognitive psychologist or experienced investigator



Goal: To get information about how the respondent understood the questions and about the way they answered them

What kind of people do we choose for cognitive lab interviews?

Choose people who represent range of people to be interviewed in full-scale survey

Pre-Testing Goal



Field pre-testing

Find out how well the data collection protocols and survey instruments work under realistic conditions

Quantitative methods for pre-testing

1. Ask interviewers to fill out a rating for each question


2. Taping and behavior coding—systematic, reproducible

Pre-testing Self-administered format

Have respondents (who are similar to your survey population) fill out the survey and then discuss

Pilot test

Walk-through of entire study design


 Should differ from final survey only in scale


 Use representative sample of target population

Interview Format

Should have everything scripted, including introductions, instructions, transitions, definitions, and explanations

Self-Administered Format

 Questionnaire should be self-explanatory (minimal instructions needed)


 Limit to closed questions


 Use short questions with consistent formats


 Minimize skips; make them very clear

Open-ended questions, types and issues

 Short, specific


-What is your current age?


 Long, narrative


-Why did you choose to come to this clinic?


 Problems:


-Illegible handwriting


-Inappropriate detail


-Usually avoid in quantitative surveys

Open-ended Questions

 Advantages


-Permit unanticipated answers


-May better describe view of respondent


-Respondents can use their own words

Closed-ended Questions advantages

 Generally preferable


-Respondents can perform more reliably when response options are given


 Researcher can interpret more reliably


 Increase likelihood of finding something meaningful

Close-ended question types

 Yes/No Questions


 Checklist Questions


-From a list of alternatives, check those that apply


-Problematic because you can’t distinguish a “No” from a skip


-Yes/No may be better because it forces thought

Multiple-choice questions

 Response alternatives should be mutually exclusive and exhaustive

Semantic differential questions

Two opposite adjectives at the ends

Ranking questions

Present alternatives and ask respondents to rank them

Rating Scales

Present a respondent with a question or statement and a range of responses


-Distance between alternatives should be equal


-Usually 3 to 7 are recommended

Unipolar response alternatives

Range from “nothing” to “a great deal”

Bipolar alternatives

Range from “large negative” through “zero” to “large positive”

Balanced scales

 Should have equal numbers on either side of neutral


 Unbalanced scales will lead to bias

For very complex, emotional issues, you may want to have 2

middle points on rating scale

Behaviorally anchored scales

 Objective, quantitative

Subjective Scales

often, sometimes, never etc

Behaviors to identify problems with instrument

 Question not read as worded


 Whether respondent asks for clarification


 Whether respondent gives response requiring probing

Presence of 1 Q can affect

answers to subsequent questions

Order of Self-administered ?'s:

begin with interesting (but not threatening) questions

Order of Interview ?'s

gain rapport, introduce questionnaire, enumerate members of household, then move into area of attitudes, sensitive topics

Cognitive Assessment is used to

1. evaluate questions in survey interviews


2. Way to learn how participants answer questions


3. locate sources of error


Cognitive Process Models

1. Comprehension


-Understand terminology


2. Interpretation


-What info is being asked for, what the respondent is asked to do


3. Recall


-Retrieval of info from memory


4. Judgment


-About what information to provide


-Decision based on emotions, feelings, social norms


5. Response


-Given in form requested


-Translation of internal thoughts into words or predetermined response

Ways to reduce response distortion to sensitive questions

1. Assure confidentiality of response


2. Communicate importance of accurate response


3. Reduce role of interviewer in data collection process

Subjective States:

1. Attitudes


2. Decisions


3. Needs


4. Behaviors


5. Lifestyle Patterns


6. Affiliations

Steps to improve validity of subjective measures

1. Make questions as reliable as possible


2. When ordering, more categories than fewer usually better, but need to discriminate among categories


3. Ask multiple questions with different forms, and make a scale


Racial Assignments are devised for

Devised for social and political reasons not scientific

survey system

collection of validated data

In what ways is information collected directly or indirectly?

directly: asking people



indirectly: reviewing written, oral, visual records of people's thoughts and actions or by observing people in natural or experimental settings

Advantages of contracting data management

-specialized expertise


-potential ability to access national network of personnel


-reduction of load on study personnel


-third party (without financial or professional stake in results) increases legitimacy of the results

Disadvantages of contracting data management

-more expensive


-lose direct control over quality of data and study conduct


-may be more difficult to interpret data without having done the analysis

Data Management activities include

1. drafting analysis plan


2. creating a code book


3. establishing reliable coding


4. reviewing surveys for incomplete or missing data


5. entering data and validating accuracy of the entry


6. cleaning the data

Data analysis plan does what?

1. summarize methods


2. for each survey objective, identify and describe the relevant variables


3. identify the analysis methods


4. describe plan for handling (missing values, outliers, zeros if log transformed, data collapsing)


5. describe subgroup or by group analyses


6. set up dummy tables and graphs

Transcription errors

any time someone records an answer or number incorrectly

coding decision errors

misapplication of the rules for equating answers and code values

Ways to reduce error with the data code

-clear rules


-missing data codes for unanswered items


-consistency


-exhaustive and non-overlapping categories


-use of reliability measures, Kappa

3 ways to do data entry

1. Data entered from a coded survey into a database or spread sheet


2. data entered directly into statistical program


3. Respondent or interviewer enters response directly into the computer

Advant/disadvant of scan-tron sheets

A: speed over manual entry, greater accuracy


D: difficult to use code sheet, scanner has rigid tolerances

Procedures for validating and verifying data

1. run frequencies for categorical variables


2. run univariate statistics for continuous variables


3. examine key variables (those used in the evaluation of primary objectives)


4. look at variables by group


5. Missing values


6. calculate checks for error prone variables


7. derive any key variables that need to be calculated from other variables and verify them too


What is an IRB and what does it do?

-Institutional Review Board


-evaluates aspects of research involving human subjects



-other countries call it ethics committee or ethical review board

What is the primary responsibility of the IRB

Protect rights and welfare of research subjects

Federal agencies that require IRB approval

-Department of Health and Human Services (NIH, CDC)


-FDA

What is the National Research Act of 1974

-National Commission for Protection of Human Subjects of Biomedical and Behavioral Research was established


-Commission issued Belmont Report in 1978


-set out basic ethical principles for human research subjects

Practice

Interventions that are solely designed to enhance the well-being of an individual patient and that have a reasonable expectation of success

Research

Designates an activity designed to test a hypothesis, permit conclusions to be drawn and thereby to develop or contribute to generalizable knowledge

Research is a _____, not a _____

Privilege, right

Public trust is achieved through

accountability

Accountability is achieved through

Record keeping

The 3 basic ethical principles

-respect for persons


-beneficence


-justice

Respect for Persons

-self rule


-informed consent


-acknowledgment of the right of an individual to hold views, choose, and act based upon personal goals, values, and beliefs


-diminished autonomy

Beneficence

-above all, do no harm


-benefit must be provided either directly to the participant and/or indirectly to society in the form of generalizable knowledge


Risks of research are justified by the potential

benefits, ideally to the subject, but at least to society

Justice

Distribute the risks and potential benefits of research equally among those who may benefit from the research

According to the rule of justice, who should not be included in research?

vulnerable subjects

IRB shall determine that

1. risks to subjects are minimized


2. risks to subjects are reasonable in relation to the anticipated benefit


3. Equitable selection of subjects


4. Informed consent process is provided before the research begins and is properly documented


5. Subjects vulnerable to undue influence or coercion are adequately protected


6. Subject Privacy and confidentiality of data are maintained

Types of submissions to the IRB

1. new submissions


2. changes to approved research


3. continuing review (at least yearly)


4. serious adverse events


5. Protocol deviations

IRB Categories of research

1. exempt


2. expedited


3. Full-board


4. CIRB (National Cancer Institute Studies)


-Submission is reviewed for expedited or full-board status, pre-reviewed for completeness, and then routed to appropriate reviewers

IRB exempt research protocols

-educational settings or tests, surveys, interviews, observation in public places (without identification or potential for civil liability, or of public officials)


-collection of existing documents that are public


-projects under the direction of federal agency that examine public benefit programs


-consumer acceptance studies, studies of food taste and quality


-emergency use of test article (must be reported to IRB within 5 days)

IRB expedited review

-must be no more than minimal risk


-non-invasive collection of biological specimens


-drugs where IND not required


-retrospective or recorded data


-continuing review of studies that are closed to enrollment, in follow-up or study in data analysis only

IRB full-board review

-protocols are reviewed by a primary and secondary reviewer and presented to the convened IRB for deliberation and vote


-can give feedback to PI


-changes must be made for final approval

Survey research is often exempt from IRB oversight unless

identifying information is collected and the disclosure of such information may cause harm to the subjects

What are not considered human subjects

establishment surveys (groups/orgs)

Risk

Potential harm, discomfort or inconvenience associated with the research that a reasonable person would be likely to consider significant in deciding whether or not to participate in the research

Minimal risk

the probability and magnitude of harm or discomfort anticipated in the research are not greater in and of themselves than those ordinarily encountered in daily life or during the performance of routine physical or psychological examination or tests

Examples of interventions that are no greater than minimal risk

-blood sampling


-urine collection


-MRI


-Routine x-ray


-saliva swab


-standard psych testing


-sexual history survey

Examples of interventions that are greater than minimal risk

-indwelling catheter


-urinary catheter


-MRI with IV contrast


-CT scan


-skin biopsy


-extensive psych tests


-sexual abuse survey

Associated risks

-harm or loss


-inconvenience


-psychological


-social


-economic


-legal

Components of informed consent

-allowed to ask questions


-can voluntarily withdraw


-informed of research findings


-fully informed about purpose of study


-know benefits and risks

Surrogate surveys

elicit information from one person about another



two participants: respondent and the subject of the survey



protection and consent apply to both

IRB Mandate

Balance the need to study difficult/sensitive questions in vulnerable populations while protecting human subjects


-includes experienced survey researchers

Descriptive Statistics
to summarize the characteristics of a sample
inferential statistics
to test statistics (the information obtained from a sample is used to infer the characteristics of the population as a whole)
Parametric
under certain assumption about the distribution of the variables in the population from which the sample was drawn
non-parametric
under NO assumption of the distribution of the variables
Univariate
to describe the survey sample profiles or characteristics
Bivariate
to examine the relationship between two variables
Multivariate
to examine the relationship among an array of variables
existence vs. strength of an association
existence-chi sq, t-test, anova



strength: RR , correlation coefficient, etc.

independent sample:
in cross-sectional surveys there is no effort to re-interview the same people, so the groups being compared are independent of one another
related sample
in longitudinal design (studies), the same people are interviewed at different points in time, the groups being compared are not independent of one another
Excluding all incomplete information from analysis may
lead to biased estimates of parameters because it assumes that the complete observations are representative of all observations
imputation
converting incomplete data into a data set in which values for all people included in the study are present
imputation methods
-multiple copies of original data set made

-maximum likelihood and maximum pseudo-likelihood methods


-weighted estimating eqn methods

Multiple copies of the original data set made
-each with missing values randomly generated

-results from different data sets combined to take variability into account
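
A toy sketch of the "multiple copies" idea: make several completed copies of the data set, fill each missing value with a random draw from the observed values, estimate in each copy, and combine; the data and number of copies are illustrative:

```python
import random
import statistics

data = [4.1, 3.8, None, 5.0, None, 4.4, 3.9]  # None marks missing values
observed = [x for x in data if x is not None]

estimates = []
for _ in range(5):                                # five imputed copies of the data set
    completed = [x if x is not None else random.choice(observed) for x in data]
    estimates.append(statistics.mean(completed))  # estimate from this completed copy

combined = statistics.mean(estimates)             # combine results across copies
spread = statistics.stdev(estimates)              # between-copy variability
print(round(combined, 2), round(spread, 3))
```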

maximum likelihood and maximum pseudo-likelihood methods
take outcome and covariate distributions into account
weighted estimating eqn methods
weights for regression analysis provided by a model for missing data
We stratify to test for
interaction
Nominal
No numerical value; categorical scales, categorical data; categories are mutually exclusive with no inherent ordering between them
Ordinal scale
a scale with choices that have inherent order



often for ratings of quality or agreement

Numerical scale
a scale on which differences between numbers have meaning
independent variables
exposures, explanatory or predictor variables used to explain or predict
dependent variables
outcomes, responses, results
Correlation
the relationship between 2 variables
Coefficient of determination (R²)
proportion of variation in the dependent variable associated with variation (change) in the independent variable
spearmans rank correlation
-relationship between 2 ordinal characteristics or 1 ordinal and 1 numerical characteristic

-used with numerical data when observations are skewed, with outliers
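
A small sketch using SciPy (assuming it is available); the two ordinal ratings are invented:

```python
from scipy import stats

# Invented ordinal ratings (e.g., 1-5 quality scores) for the same eight subjects
rating_a = [1, 2, 2, 3, 4, 4, 5, 5]
rating_b = [2, 1, 3, 3, 3, 5, 4, 5]

rho, p_value = stats.spearmanr(rating_a, rating_b)  # rank-based correlation
print(round(rho, 2), round(p_value, 3))
```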

Regression
predicts the dependent variable by using a set of independent variables
What test do you use when you compare 1 nominal independent variable with respect to one numerical dependent variable
Two-sample t-test
Assumptions of statistical testing
-data normally distributed

-variances equal


-if sample sizes equal, unequal variances not major effect on significance level


-if not equal, adjust degrees of freedom, and use separate variance estimates


-f-test can compare variances



Nonparametric test
Wilcoxon rank-sum test
Use pooled variance estimate when
variances are equal, otherwise use separate estimates
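
A minimal sketch of the two-sample t-test mentioned above using SciPy (assuming it is available); equal_var switches between the pooled-variance test and separate-variance (Welch) estimates. The group scores are invented:

```python
from scipy import stats

# Invented outcome scores for two groups defined by a nominal independent variable
group_a = [12.1, 11.4, 13.0, 12.8, 11.9, 12.5]
group_b = [10.2, 11.1, 10.8, 10.5, 11.4, 10.9]

# equal_var=True uses the pooled variance; equal_var=False gives Welch's (separate-variance) test
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=True)
print(round(t_stat, 2), round(p_value, 4))
```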
ANOVA
compares means of three or more groups



tells you about overall status of differences among groups

chi-square
-Compare proportions

-allows us to compare the expected frequency in each cell with the frequency that actually occurs


-if relationship exists, two variables are said to be dependent
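
A small sketch of a chi-square test of independence on a 2×2 table using SciPy (assuming it is available); the observed counts are invented:

```python
from scipy.stats import chi2_contingency

# Invented observed counts: rows = group 1/2, columns = outcome yes/no
table = [[30, 70],
         [15, 85]]

chi2, p_value, dof, expected = chi2_contingency(table)  # compares observed with expected frequencies
print(round(chi2, 2), round(p_value, 4), dof)
```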

index
simple accumulation of scores assigned to specific responses to individual items in the index
scales
constructed through assignment of scores to response patterns among several items in the scale
factor analysis
method to assess whether different items belong together in a scale
multitrait scaling analysis
-advanced technique measures how well groups of items hold together as scales

-used when looking at convergent and discriminant validity simultaneously


-2+ traits measured by two or more methods at the same time

factor analysis
-multivariate procedure used to assess whether different items belong together in a scale

-a factor is a hypothetical trait thought to be measured by the items in a scale


-computer algorithm tests possible combinations of items and determines how they vary together

multivariable analysis

examination of the distribution of cases on one dependent and more than one independent variable.

What are the measures of central tendency

1. mean


2. mode


3. median



Measures of dispersion

1. range


2. standard deviation

univariate analysis

examination of the distribution of cases on only one variable at a time

Bivariate analysis

examination of the distribution of cases on one dependent and one independent variable

What graphs are better for visualizing trends?

Line Graph