337 Cards in this Set
- Front
- Back
Census VS Survey |
A census enumerates the whole population. A survey evaluates samples (subsets) of population instead of the whole population. |
|
What is a survey? |
A system for collecting information to describe, compare, or explain knowledge, attitudes, and behavior. |
|
How do surveys collect this information? |
By asking individuals questions to generate statistics on the group(s) that those individuals represent. |
|
What is the main purpose of a survey? |
To produce statistics. |
|
What is the main way of collecting survey information? |
By asking people questions. |
|
From where do surveys collect information? |
From a fraction of a population (a sample) instead of the whole population |
|
Areas that use surveys |
Political research, sociological research, epidemiologic research, psychological research, marketing research |
|
How many survey activities of a survey system are there? What are they? |
1. Setting objectives for information collection 2. Designing the study 3. Preparing a reliable and valid instrument 4. Administering the survey. 5. Managing survey data 6. Analyzing survey data 7. Reporting the results |
|
What is a research question? |
Conceived prior to the study. Tends to be based on more extensive background literature and leads to more refined efforts at data collection |
|
What is a hypothesis? |
A suggested explanation of a phenomenon or reasoned proposal suggesting a possible correlation between multiple phenomena. |
|
How do we form research questions? |
1. Always begin with a literature review 2. Formulate a hypothesis 3. Can you test the hypothesis with a survey? 4. Is the question relevant, important, timely? |
|
What's the most important characteristic of a research question? |
specific |
|
What to include to make a research question specific? |
Purpose, specific objective, target population, survey population? |
|
Three common ways of stating hypotheses |
1. Positive declaration 2. Negative declaration 3. Implicit question |
|
Positive Declaration |
Research hypothesis -The infant mortality rate is higher in one region than another
|
|
Negative Declaration |
Null Hypothesis -There is no difference between the infant mortality rates of two regions |
|
Implicit question |
To study the association between infant mortality and geographic region of residence
|
|
What are the methods of hypothesis formulation? |
1. Method of difference 2. Method of Agreement 3. Method of concomitant variation
|
|
Method of difference |
Recognizing the frequency of something is different in two sets of circumstances
|
|
Method of Agreement |
Single factor is common to number of circumstances in which disease occurs with high frequency |
|
Method of concomitant variation |
Frequency of factor varies in proportion to frequency of disease |
|
What terms must hypotheses be put into? |
Operational |
|
How are hypotheses tested? |
In the form of experiments and tests. |
|
Are hypotheses completely confirmed or rejected often? |
No |
|
Types of study designs |
Observational -Cross sectional -retrospective (non-concurrent) -prospective (concurrent) -Pre/post-test
Experimental/Quasi-experimental |
|
Summary of survey/research questions |
1. What are you interested in finding out 2. Whom do you want to study 3. Where are these people or organizations located 4. When do you want to do the survey 5. What do you expect to learn and why? |
|
What does the survey checklist consist of? |
1. Set objectives for information collection 2. Design research 3. Prepare a reliable and valid data collection instrument 4. Analyze data 5. Report the results |
|
National Health Interview Survey (NHIS) |
Multipurpose health survey conducted by the NCHS (CDC); the principal source of information on the health of the civilian, noninstitutionalized household population of the US
Started in 1957 |
|
DHS Surveys (demographic and health surveys) |
-large sample sizes, representative -provide data for a wide range of monitoring and impact evaluation indicators in the areas of population, health and nutrition. -every 5 years -standard core questionnaire and other questionnaire (women's questionnaire) |
|
Population council |
Leader in conducting studies on a range of reproductive health issues -horizons AIDSQuest |
|
NHANES (National health and nutrition examination survey) |
-conducted by the National Center for Health Statistics (NCHS)/CDC -collects info about the health and diet of people in the US -home interview and mobile exam center |
|
Why sample? |
Time, cost, quality/accuracy |
|
Three general objectives of surveys |
-Description (distribution of traits or attributes) -Comparison and Explanation (multivariate analysis) -Exploration (raise ideas, not answer research questions, not describe pop) |
|
Units of analysis |
1. Individual, often 2. Aggregate (families, cities, nation) sometimes |
|
Can a survey include more than one unit of analysis? |
Yes |
|
Ecologic Fallacy |
Bias that may occur because an association observed between variables on an aggregate or group level does not necessarily represent the association that exists at an individual level |
|
Element |
unit about which data is collected and which provides the basis of analysis |
|
Universe |
theoretical and hypothetical aggregation of all elements as defined for a specific survey |
|
Population |
theoretically specified aggregation of survey elements (working and operationalized definition, includes careful specification) |
|
survey population |
aggregation of elements from which the survey sample is actually selected
universe/population |
|
Sampling unit |
elements or set of elements considered for selection
aka: primary sampling unit (PSU) single stage sampling=elements multi-stage=different levels of sampling units |
|
sampling frame |
actual list of sampling units from which sample is selected |
|
observation unit |
element or aggregation of elements from which information is collected |
|
Often unit of analysis= |
unit of observation
but not always |
|
variables |
set of mutually exclusive characteristics |
|
parameter |
summary description of variable in population |
|
statistic |
summary description of variable in sample |
|
sampling error |
estimate of degree of random error to be expected from given sample design |
|
Can you always ensure that all those in a population have a chance to be selected? |
No, hardly ever
Limit population to help with this |
|
Probability sampling |
-representative -every member of the target population has a known non-zero probability of being included in the sample -implies the use of random selection -evaluate sample by the way it was selected |
|
Three key issues of probability sample designs |
-sample can only be representative of the population included in the sampling frame -each person must have a known chance of selection -no researcher discretion -size and specific procedures of sampling design will influence precision of sample characteristics and correspondence to population |
|
Quality of sampling directly associated with |
quality of sampling frame |
|
T/F Sampling frames cannot occur spontaneously. |
False, but they should be documentable |
|
The findings of the sample survey are only representative of |
aggregation of elements that comprise the sampling frame |
|
Sampling frames do not truly include all |
elements their names imply |
|
Elements should only appear |
once |
|
Types of samples |
-simple random sampling: simple random sample, systematic samples -stratified sampling: can be used with both SRS and MRS -Multistage random sample (MRS): two stage or more, area probability samples, multistage cluster |
|
Simple random sampling |
-basic sampling method assumed in stat analysis -assign each element a unique number, then use random number table or generator to select sample -each element is sampled from the sampling frame one at a time, independent of one another and without replacement |
|
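The simple random sampling procedure on the card above can be sketched in a few lines. This is only an illustration: the frame of 100 labeled elements, the function name `simple_random_sample`, and the seed are hypothetical, and Python's `random.sample` stands in for the random number table.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the sampling frame, without replacement."""
    rng = random.Random(seed)          # seeded only so the sketch is repeatable
    return rng.sample(list(frame), n)  # every element has the same chance of selection

# Hypothetical sampling frame: each element already carries a unique number.
frame = [f"person_{i}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```

Because selection is without replacement, no element can appear twice in the sample.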
Issues in simple random sampling |
1. usually not possible 2. laborious process 3. due to natural variability, may end up with non-representative sample 4. Less precision than desired may result |
|
Systematic Sampling |
-commonly used instead of simple random sample if sampling frame is available -every kth element selected for inclusion -random start to ensure no bias
|
|
sampling interval |
standard distance between elements |
|
sampling ratio |
proportion of elements in population selected |
|
How to get the sampling interval and ratio |
Want n = 1,000; have N = 10,000
Sampling ratio = n/N = 1,000/10,000 = 1/10; sampling interval k = N/n = 10 (round down if not an integer) |
|
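As a rough sketch of systematic sampling with the numbers from this card (n = 1,000 from N = 10,000), the code below computes the interval, picks a random start, and takes every kth element. The function name `systematic_sample` and the list-of-integers frame are assumptions made for the example.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th element after a random start, with k = N // n."""
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and the frame size")
    k = N // n                        # sampling interval, rounded down
    rng = random.Random(seed)
    start = rng.randrange(k)          # random start within the first interval
    return k, frame[start::k][:n]

frame = list(range(1, 10001))         # hypothetical frame, N = 10,000
k, sample = systematic_sample(frame, 1000, seed=42)
print("interval:", k)
print("ratio:", len(sample) / len(frame))
```

If the list has a periodic pattern that lines up with k, shuffle the frame first; the sketch does not do that.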
Issues in systematic sampling |
-Periodicity -if there is periodicity in list, need to randomly reorganize before starting -easier to do than simple random sample -known probability, no researcher discretion |
|
Stratified random sampling |
-Population is divided into subgroups (strata) and a sample is then randomly selected from each stratum -Appropriate numbers of elements are drawn from homogeneous subsets of the population (heterogeneity between subsets) |
|
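A minimal sketch of proportionate stratified random sampling follows. It assumes a hypothetical frame of 600 urban and 400 rural elements and a 10% sampling fraction; the function name and the `stratum_of` callback are illustrative, not part of the course material.

```python
import random
from collections import defaultdict

def proportionate_stratified_sample(frame, stratum_of, fraction, seed=None):
    """Group the frame into strata, then draw a simple random sample in each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in frame:
        strata[stratum_of(element)].append(element)
    sample = {}
    for name, members in strata.items():
        n_h = round(len(members) * fraction)   # same fraction in every stratum
        sample[name] = rng.sample(members, n_h)
    return sample

# Hypothetical frame: (stratum label, element id) pairs.
frame = [("urban", i) for i in range(600)] + [("rural", i) for i in range(400)]
sample = proportionate_stratified_sample(frame, lambda e: e[0], 0.10, seed=7)
print({name: len(members) for name, members in sample.items()})
```

Each stratum contributes in proportion to its size, so the stratification variable is represented exactly rather than left to chance.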
Stratified sampling does not occur instead of simple random sample, but |
in addition to |
|
Both SRS and SS ensure |
a degree of representativeness and permit estimate of error |
|
Stratified sampling permits a greater degree of |
representativeness, thus decreasing probable sample error |
|
Stratified increases |
representativeness, efficiency |
|
Does stratified sampling make a more heterogeneous or homogeneous sample? |
Homogeneous (within each stratum) |
|
Stratified sampling decreases |
error and variability because it ensures appropriate numbers of elements from homogenous groups |
|
Disproportionate sampling |
-oversample stratum with variability to increase precision of estimate -increases n for subpopulation without increasing total N -important to weight data accordingly in analysis (or oversampled group could be counted more) |
|
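To see why weighting matters after oversampling, here is a small sketch with made-up numbers. It assumes each stratum's weight is N_h / n_h (the inverse of its selection probability); the stratum sizes and means are hypothetical.

```python
def design_weights(population_sizes, sample_sizes):
    """Weight per sampled person in stratum h = N_h / n_h."""
    return {h: population_sizes[h] / sample_sizes[h] for h in population_sizes}

def weighted_mean(stratum_means, sample_sizes, weights):
    """Population estimate that undoes the oversampling."""
    total = sum(weights[h] * sample_sizes[h] * stratum_means[h] for h in stratum_means)
    weight_sum = sum(weights[h] * sample_sizes[h] for h in stratum_means)
    return total / weight_sum

population_sizes = {"A": 9000, "B": 1000}
sample_sizes = {"A": 500, "B": 500}          # stratum B is oversampled
stratum_means = {"A": 10.0, "B": 20.0}       # hypothetical sample means

weights = design_weights(population_sizes, sample_sizes)
unweighted = (sum(sample_sizes[h] * stratum_means[h] for h in stratum_means)
              / sum(sample_sizes.values()))
weighted = weighted_mean(stratum_means, sample_sizes, weights)
print("unweighted:", unweighted)
print("weighted:", weighted)
```

The unweighted mean counts the oversampled stratum too heavily; the weighted mean restores each stratum to its population share.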
NHANES oversamples |
African Americans, Mexican Americans, low-income white Americans, ages 12-19 years, ages 60+ years |
|
What are other times to weight data? |
-to correct for errors in selection, non-response, or proportionate sampling where the initial figures were incorrect |
|
NHANES sample weight |
-assigned to each sample person -is a measure of the number of people in the population represented by that sample person -reflects unequal probability, nonresponse adjustment, and adjustment to independent population controls |
|
When do you sometimes choose Multistage Sampling |
when no sampling frame of elements exists |
|
The two steps of multi-stage sampling |
-listing then sampling -with stratification as desired at either or both levels of sampling |
|
Issues with multistage sampling |
-improved efficiency, but decreased accuracy -subject to two(+) sampling errors -issue of representativeness on two work levels |
|
A cluster is |
a naturally occurring unit |
|
Sampling error is reduced by two factors |
-increased sample size -increase homogeneity of elements being sampled |
|
A sample of clusters will best represent all clusters if |
a large number are selected and if all clusters are similar |
|
A sample of elements will best represent all elements in a given cluster if |
large number of elements are selected and if all elements are similar |
|
Clusters are less representative than |
a simple random sample |
|
Elements within clusters are usually more homogeneous than |
all elements of the population |
|
If you increase the number of clusters you must |
decrease the number of elements |
|
Area probability sampling |
-type of multi-stage sampling strategy -divide land area into exhaustive mutually exclusive sub-areas -key: random selection |
|
Random Digit Dialing |
-version of multi-stage sampling -relatively easy -inexpensive -large number of unfruitful calls -excludes those without phones -may be difficult to ascertain geographic area -reliance on cell phones as main phone -may randomly select the four-digit suffixes |
|
For RDD you delineate...
Then you... |
geographic boundaries of sampling area
identify all exchanges used in geographic area and distribution of prefixes within area |
|
For RDD you may stratify based on |
the distribution of prefixes |
|
RDD provides a nonzero |
chance of reaching any household within a sampling area that has a telephone line regardless of whether the number is listed
|
|
Is the probability of reaching every household equal for RDD? |
No, households with more phones have greater probability -adjust for unequal probability by weighting |
|
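One simple way to adjust for unequal selection probability in RDD is to weight each household by the inverse of its number of phone lines. The sketch below assumes that approach; the line counts and yes/no answers are invented for illustration.

```python
def rdd_weights(phone_lines_per_household):
    """A household with more lines is more likely to be dialed, so weight by 1 / lines."""
    return [1 / lines for lines in phone_lines_per_household]

lines = [1, 2, 1, 4]        # hypothetical phone lines per responding household
answers = [1, 0, 1, 0]      # 1 = yes, 0 = no

weights = rdd_weights(lines)
unweighted_share = sum(answers) / len(answers)
weighted_share = sum(w * a for w, a in zip(weights, answers)) / sum(weights)
print(unweighted_share, weighted_share)
```

Here the multi-line households all answered "no", so down-weighting them raises the estimated share of "yes".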
Non-probability sampling |
sampling in which some members of the eligible target population have a chance of being chosen for participation in the survey and others do not |
|
Interviewer discretion/or respondent characteristics |
affect the likelihood of being included in a sample |
|
Accidental NPS |
whatever cases happen to be available |
|
Purposive NPS |
planned selection of specific types of cases |
|
When to do NPS |
1. When probability sampling is too expensive 2. When purposive or judgmental sampling is better, so can create own sample to meet needs of study 3. hard to reach populations 4. pilot studies 5. surveys of specific groups |
|
Common methods of non-probability sampling |
1. convenience 2. most similar/dissimilar cases 3. typical cases 4. critical cases 5. snowball 6. quota |
|
Convenience NPS |
select cases based on their availability for the study |
|
Most similar/dissimilar cases NPS |
select cases that are judged to represent similar conditions or, alternatively, very different conditions |
|
Typical Cases NPS |
select cases that are known beforehand to be useful and not to be extreme
|
|
Critical Cases NPS |
select cases that are key or essential for overall acceptance or assessment |
|
Snowball NPS |
group members identify additional members to be included in sample |
|
Quota NPS |
interviewers select sample that yields the same proportions as the population proportions on easily identified variables |
|
Quota sampling the population is divided into... and the numbers of |
subgroups
subgroup members in the sample are proportional to the numbers in the larger population |
|
What do you need to know for quota sampling? |
Population proportions |
|
Quota sampling is done in effort to |
reduce/minimize bias
|
|
Quota sampling ensures |
proportionate representation of various categories of respondents |
|
In quota sampling, sample elements can be weighted so... |
the data provides a reasonable representation of total population |
|
Problems with NPS designs |
1. not necessarily representative of population 2. cannot necessarily apply findings to population 3. need to test sample and population characteristics to see how similar or different they are 4. May be useful for what you are doing 5. know limitations |
|
Additional NP techniques |
-focus groups: group asked opinions about a product/service/concept -key informant interviews: qualitative in-depth interview with people who have knowledge or expertise in the field of interest, community, etc. |
|
Advantages of Stratified VS SRS |
1. Homogeneity within each stratum – smaller variation within strata produces stratified sampling estimators with smaller variance than SRS estimators – for the same sample size. 2. Separate estimators for population parameters can be obtained for each stratum 3. Cost can be less for STRS than SRS if strata represent locations. |
|
Advantages of Cluster Sampling: |
Feasibility: The only frames available are lists of clusters.
Economy: Listing costs and traveling costs tend to be lower for cluster sampling ( the field costs are lower). |
|
Single-Stage Cluster Sampling |
All listing units in the chosen clusters are selected |
|
Two-Stage Cluster Sampling |
First selecting a simple random sample of clusters and then selecting a simple random sample of listing units from each sampled cluster |
|
Disadvantages of Cluster Sampling |
Sample estimates based on cluster samples tend to have higher standard error than those based on other sampling plans, for the same sample size |
|
3 factors that influence sample representativeness |
sampling procedure sample size participation (response) |
|
sampling error |
variation around the true population value of samples by chance |
|
Bias |
systematic way the people responding to a survey are different from the target population as a whole |
|
SE of a mean equation |
SE = √(var/n) |
|
SE of proportions |
SE = √(p(1-p)/n) |
|
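The two standard-error cards can be checked numerically. This sketch reads them as SE of a mean = sqrt(variance / n) and SE of a proportion = sqrt(p(1 - p) / n); the data values and function names are made up for the example.

```python
import math
import statistics

def se_mean(values):
    """Standard error of a mean: sqrt(sample variance / n)."""
    return math.sqrt(statistics.variance(values) / len(values))

def se_proportion(p, n):
    """Standard error of a proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

scores = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical sample values
print(se_mean(scores))
print(se_proportion(0.5, 100))
print(se_proportion(0.5, 400))         # four times the sample, half the SE
```

Because n sits under the square root, quadrupling the sample size only halves the standard error.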
How is sampling error reduced |
homogenous populations, large sample |
|
Criteria for estimating survey sample size |
1. identify major study variables 2. determine the types of estimates of study variables such as means or proportions 3. select the population or subgroups of interest 4a. indicate what you expect the population value to be 4b. estimate the SD 5. Decide on level of confidence 6. decide on tolerable range of error in estimate 7. compute sample size based on study assumptions |
|
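Steps 4 through 7 can be illustrated for a proportion with the common textbook formula n = z² p(1 - p) / e². This formula is an assumption here, not necessarily the one used in the course; it also ignores design effects, finite population correction, and expected non-response.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(expected_p, margin_of_error, confidence=0.95):
    """n = z^2 * p * (1 - p) / e^2, rounded up (simple random sampling assumed)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    n = z ** 2 * expected_p * (1 - expected_p) / margin_of_error ** 2
    return math.ceil(n)

# Expected value 50%, tolerable error of 5 percentage points, 95% confidence.
n_needed = sample_size_for_proportion(0.5, 0.05, 0.95)
print(n_needed)
```

Using p = 0.5 is the conservative choice, since p(1 - p) is largest there.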
Formulas for precision |
|
|
When sample size increases, what happens to variability and power? |
Variability of the estimates (sampling error) decreases; power increases |
|
Does increasing the sample size reduce systematic error? |
No |
|
Non-sampling errors arise from: |
poor definitions of the target population and non-response |
|
Non-sampling errors can lead to: |
bias, inability to interpret results |
|
Non-sampling errors violate |
the point of probability sampling |
|
What is non-response? |
-The failure to collect data from subjects selected to be in a sample -leads to biased estimates (usually unsure of direction) -potentially one of the most important sources of systematic error (bias) in a study |
|
What are the 4 categories of nonresponders |
1. Those whom the data collection procedures do not reach. 2. Those who refuse to provide data 3. Those who are unable to provide data 4. Item non-response |
|
Response rate |
number of respondents/number sampled x100 |
|
Minimum response rate is |
75%
-adequate 50% -good 60% -very good 75% |
|
If non response does not depend on outcome then is the chance of bias greater or less |
less |
|
bias due to non-response is a function of: |
the response rate, the extent to which non-responders differ from the sample population |
|
As non response increases, the range of possible bias... |
increases |
|
Response rate in rural areas is ...than urban areas |
higher |
|
Response rate is _ when there is a designated respondent than if you take any responsible adult |
lower |
|
response rates are higher for topics that |
interest people |
|
Why are response rates lower in urban areas? |
-Higher proportion of single individuals -more high-rise dwellings (less accessible) -interviewers may be uncomfortable in urban areas at night |
|
Response rates frequently differ by |
data collection method |
|
Who is more likely to respond to mail surveys? |
those with an interest in the topic and better educated people |
|
Ways to maximize response rates in mail surveys |
-make it attractive and professional -professional endorsement -pre-notification -offer a monetary incentive -quick, easy to read and complete -include stamped, addressed return envelopes -reminders -follow up non-responders by telephone -ensure anonymity (where needed) |
|
Confidential |
Keep name/personal information with access limited to specific people (or classes of people) |
|
Anonymous |
No personal identification given – completely disconnected from person’s identity |
|
How to max response in phone and personal interviews |
1. vary call times, repeat unsuccessful calls 2. have flexible interview times 3. send informational letters ahead of time 4. communicate the survey purpose and importance to participant 5. use effective interviewers-training |
|
How to correct for non-response |
1. use proxy respondents 2. use statistical adjustments such as weighting subgroups to match their rate in the same population 3. survey non-responders |
|
Wave analysis |
1. compare those who responded in your first wave to those who responded only after follow-up – Assume that late responders are similar to nonresponders; test this 2 Calculate the pattern of response in nonresponders needed to reverse the study conclusions; analyze worst/best case scenarios |
|
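The response-rate formula and the worst/best case idea can be combined in a short sketch. It assumes a yes/no outcome and bounds the true proportion by supposing every non-responder would have answered "no" or "yes"; all counts are hypothetical.

```python
def response_rate(respondents, sampled):
    """Response rate as a percentage: respondents / sampled x 100."""
    return 100 * respondents / sampled

def extreme_case_bounds(yes_among_respondents, respondents, sampled):
    """Range of the full-sample proportion if all non-responders said no, or all said yes."""
    nonrespondents = sampled - respondents
    lowest = yes_among_respondents / sampled
    highest = (yes_among_respondents + nonrespondents) / sampled
    return lowest, highest

# Half of the respondents say "yes" in both scenarios; only the response rate differs.
rate_low = response_rate(600, 1000)
bounds_low = extreme_case_bounds(300, 600, 1000)
rate_high = response_rate(900, 1000)
bounds_high = extreme_case_bounds(450, 900, 1000)
print(rate_low, bounds_low)
print(rate_high, bounds_high)
```

The lower the response rate, the wider the range of values the true proportion could take, which is why the range of possible bias grows with non-response.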
When might you sample the entire population? |
When your population is very small When you have extensive resources When you don’t expect a very high response |
|
A reliable survey is |
consistent |
|
A valid survey is |
accurate |
|
Precision is not always |
necessary or desirable |
|
Accuracy is a better reflection of the |
real world |
|
Reliability is a statistical measure of how |
reproducible a survey instrument's data are |
|
Lack of reliability may arise from divergences between |
observers or instruments of measurement or instability of the attribute being measured |
|
Measurement error |
How well or poorly a particular instrument performs in a given population |
|
How to maximize reliability |
Ask people only questions they’re likely to know answers to Ask about things relevant to them Be clear in what you’re asking |
|
Reliability commonly assessed in three forms |
– Test-retest reliability – Alternate-form reliability – Internal consistency reliability |
|
Test-retest reliability |
• Most common form in surveys • Measured by having the same respondents complete a survey at two different points in time to see how stable the responses are • Usually quantified with a correlation coefficient (r value) • In general, r values are considered good if r ≥ 0.70 -- Varies by field |
|
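Test-retest reliability is usually summarized with a correlation coefficient. The sketch below assumes Pearson's r and uses invented scores for six respondents at two time points.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

time1 = [3, 4, 2, 5, 4, 1]     # hypothetical answers at the first administration
time2 = [3, 5, 2, 4, 4, 2]     # same respondents, second administration
r = pearson_r(time1, time2)
print(round(r, 2), "good" if r >= 0.70 else "below the usual 0.70 guideline")
```

The respondents must be listed in the same order in both lists, since the coefficient pairs each person's two answers.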
Major source of poor reliability |
poor questions |
|
To test-retest observer |
have the same observer make two separate measurements; the comparison between the two measurements is intraobserver reliability |
|
For test-retest make sure it's over_ |
a short period of time with items or scales that measure variables that are likely to change quickly |
|
Potential problem with test-retest is the
and what does it do |
practice effect – Individuals become familiar with the items and simply answer based on their memory of the last answer (practice effect)
can inflate reliability estimates |
|
Alternate-form reliability |
Questions or responses are reworded or their order is changed to produce two items that are similar but not identical
It is common to simply change the order of the response alternatives
use same level of difficulty
|
|
split-halves method |
You can measure alternate-form reliability at the same timepoint or separate timepoints
If you have a large enough sample, you can split it in half and administer one item to each half and then compare the two halves |
|
Internal consistency reliability |
Applied not to one item, but to groups of items that are thought to measure different aspects of the same concept
• If internal consistency is low you can add more items or re-examine existing items for clarity |
|
Cronbach’s coefficient alpha |
– Measures internal consistency reliability among a group of items combined to form a single scale – It is a reflection of how well the different items complement each other in their measurement of different aspects of the same variable or quality – Interpret like a correlation coefficient (0.70 is good) |
|
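Cronbach's alpha can be computed by hand with the usual formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). The sketch assumes that formula; the three items and five respondents are made up.

```python
import statistics

def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order in each list."""
    k = len(items)
    item_variance_sum = sum(statistics.variance(scores) for scores in items)
    totals = [sum(answers) for answers in zip(*items)]   # each respondent's scale score
    return k / (k - 1) * (1 - item_variance_sum / statistics.variance(totals))

# Hypothetical 3-item scale answered by 5 respondents.
items = [
    [2, 3, 4, 4, 5],
    [1, 3, 3, 4, 5],
    [2, 2, 4, 5, 5],
]
alpha = cronbach_alpha(items)
print(round(alpha, 2))
```

Items that rise and fall together make the total-score variance large relative to the item variances, which pushes alpha toward 1.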
Kappa |
-evaluates how much better than random chance is the agreement between two tests/observers -Incorporates random error – chance -Proportion of observed agreement not due to chance in relation to the maximum non-chance agreement -Ranges from –1 to 1 |
|
Guidelines for Kappa |
> 0.75 excellent agreement 0.40 – 0.75 intermediate to good agreement < 0.40 poor agreement |
|
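As a sketch of how kappa is obtained, the code below assumes Cohen's kappa for two observers, (observed agreement - chance agreement) / (1 - chance agreement), with invented yes/no ratings of ten cases.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Agreement beyond chance between two raters of the same cases."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions per category.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    return (observed - expected) / (1 - expected)

rater_a = ["yes"] * 5 + ["no"] * 5
rater_b = ["yes"] * 4 + ["no", "yes"] + ["no"] * 4   # disagrees on two cases
kappa = cohen_kappa(rater_a, rater_b)
print(round(kappa, 2))
```

The raters agree on 8 of 10 cases, but half of that agreement is expected by chance alone, so kappa lands in the intermediate-to-good band rather than the excellent one. The function is undefined when chance agreement equals 1 (both raters use a single category).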
4 forms of validity |
– Face validity – Content validity – Criterion validity – Construct validity |
|
Face validity |
Cursory review of survey items by untrained judges |
|
Content validity |
Subjective measure of how appropriate the items seem to a set of reviewers who have some knowledge of the subject matter – Usually consists of an organized review of the survey’s contents to ensure that it contains everything it should and doesn’t include anything that it shouldn’t |
|
Criterion validity |
Measure of how well one instrument measures up against another instrument or predictor – Assess with correlation coefficient or measure of agreement – Sensitivity/specificity; discriminant analysis |
|
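Sensitivity and specificity against a reference measure can be sketched as follows. The screening results and "gold standard" values are hypothetical, and the function name is illustrative.

```python
def sensitivity_specificity(instrument_positive, gold_positive):
    """Compare an instrument's yes/no results with a reference standard."""
    pairs = list(zip(instrument_positive, gold_positive))
    true_pos = sum(1 for i, g in pairs if i and g)
    false_neg = sum(1 for i, g in pairs if not i and g)
    true_neg = sum(1 for i, g in pairs if not i and not g)
    false_pos = sum(1 for i, g in pairs if i and not g)
    sensitivity = true_pos / (true_pos + false_neg)    # share of true cases detected
    specificity = true_neg / (true_neg + false_pos)    # share of non-cases ruled out
    return sensitivity, specificity

gold = [True] * 4 + [False] * 6
instrument = [True, True, True, False, False, False, False, False, True, False]
sens, spec = sensitivity_specificity(instrument, gold)
print(sens, round(spec, 2))
```

Both values depend on having at least one true case and one true non-case in the comparison data.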
Concurrent criterion validity |
assess your instrument against a “gold standard” |
|
Predictive criterion validity |
assess the ability of your instrument to forecast future events, behavior, attitudes, or outcomes |
|
Construct validity |
Most valuable and most difficult measure of validity • It is a measure of how meaningful the scale or instrument is when it is in practical use • Based on way measure relates to other variables in a system of relationships |
|
Convergent: |
Implies that several different methods for obtaining the same information about a given trait or concept produce similar results – similar to alternate-form reliability – But it is more theoretical and requires a great deal of work, usually by multiple investigators with different approaches |
|
Divergent: |
The ability of a measure to estimate the underlying truth in a given area; must be shown not to correlate too closely with similar but distinct concepts or traits |
|
What are methods of data collection? |
Telephone Personal Interview Self-administered Computer Group administration Internet |
|
Way questionnaire is presented and mode affect many aspects |
Cost Time Response rates Response styles Scale responses: -more use of extreme categories in self-administered |
|
Issues with methods of data collection |
-sampling -type of population -question form -question content -response rates -cost -time |
|
How will designating a respondent influence mode of data collection? |
1. If sampling frame is list of individuals – any procedure is feasible 2. If specific respondent is necessary, Q mailed to household may not be best. Interviewer is needed |
|
Ease of contact influences |
interviewer admin or self-admin |
|
Question format depends on |
closed/open, self admin/interviewer |
|
For telephone surveys you should |
Limit response scales - 4 maximum (recommended) Long list must be read slowly and read again Order affects answers Break down several-category question into two phases |
|
What cannot be adapted to the telephone? |
Complex descriptions of situations/events Questions requiring pictures or visual cues |
|
Sensitive Topics: Self-administered procedures |
No admitting socially undesirable behaviors directly to interviewer |
|
Sensitive Topics: telephone procedures |
Impersonality helps people report negative events or behaviors |
|
Sensitive Topics: Personal Interviewers |
Build trust and rapport Social desirability bias |
|
Telephone Response |
With random digit dialing, response lower than with personal interview Combined with rate of people without phones – nonresponse can be substantial |
|
CASI, CATI, CAPI |
-Computer Assisted Self-Administered Interviews – Computer Assisted Telephone Interviewing (e.g., CDC/La-PRAMS) - Computer Assisted Personal Interviewing |
|
Advantages to Computer Assisted Formats |
Machine readable form Ease of complex skip patterns Questions adaptable based on previous answers Inconsistencies easily identified Reduction in missing data Instructions given as needed Speed of data entry |
|
Disadvantages to Computer Assisted Formats |
No control over data entry Requires time to make and test Making corrections difficult Computer failures Need for fixed response questions |
|
CASI reduces |
social desirability bias |
|
Personal Interview Advantages |
Face-to-Face Effective way to enlist cooperation Ability to answer respondent’s questions Probe Complex instructions/sequences For long questionnaires, best method |
|
Personal Interview Disadvantages |
Costly Well-trained staff near sample is key Longer period for data collection Accessibility of sample |
|
Telephone Surveys Advantages |
Lower cost Better access to some populations Shorter time frame Smaller staff possible, location not important Response rate better than mail |
|
Telephone Surveys Disadvantages |
Sampling limitations with some populations (no phones) Non-response with RDD higher than personal interview Questionnaire constraints Sensitive questions?? |
|
Self-Administered Survey Advantages |
Question style - Long questions possible - Several similar questions possible - Visual aids possible Sensitive questions |
|
Self-Administered Survey Disadvantages |
Careful questionnaire design needed Question style -Open ended not useful Literacy required No quality control -Questions -Respondent: who answers |
|
Internet Surveys Advantages |
Low cost High speed of returns Advantages of self-administered Advantages of computer-assisted formats Respondent can take time to answer |
|
Internet Surveys Disadvantages |
Sampling limitations Challenge in getting cooperation Limited open-ended question usefulness |
|
What does the interviewer do? |
• Locates and enlists the cooperation of respondents— recruitment • Assesses eligibility • Motivates respondents • Asks questions, records answers, and probes incomplete answers • QA/QC/TQM |
|
Interviewers can be a source of error when: |
When they do not read questions as worded When they probe directively When they bias answers by the way they relate to respondents When they record answers inaccurately |
|
Conservative estimates based on several observational studies suggest that interviewers change question wording at a rate of |
20- 40% |
|
Probing |
-Required when the initial reading of the question does not provide a satisfactory answer Usually begin by repeating the question Follow with non-directive probes: -How do you mean that? -Tell me more about that -Anything else? -Just choose the best answer |
|
Probing: If a definition is provided with the question, interviewer should |
reread the definition |
|
Probing: If no definition is provided |
the respondent should answer the question with the interpretation that seems best to them |
|
Probes should be included in the |
training and SOP Manual |
|
Recording answers allows |
No interviewer judgment No interviewer summaries No interviewer effects Open-ended answers should be recorded verbatim |
|
How to detect interviewer-related error |
Direct observation of interviewers Validating survey answers Associating interviewers with the answers they obtain |
|
There will always be an effect of the interviewer in a |
face to face interview |
|
Protocol/plan: |
A guide for a study |
|
Research proposal |
a plan written to seek approval for research from a supervisor or organization |
|
Protocols Contain |
Description of study: abstract, questions or hypotheses, goals and objectives, methods (survey design, sampling, sample size, instruments, reliability and validity, data collection methodology, nonresponse, etc.): What you plan to do Who will do it To whom it will be done When it will be done Where it will be done What you hope to learn Procedures, guidelines, responsibilities Serves as training guide, instructions, reference |
|
Survey Instrument for interviews |
A script for interviewers, including introductions, instructions, and questions |
|
Survey instrument for self-admin surveys |
questionnaire (Includes all questions for subjects, instructions, and response sets) |
|
Essential steps in developing a survey instrument |
-Statement of purposes What do you want to accomplish with the survey? Research questions, objectives -List of the variables to be measured Group into logical categories -Draft analysis plan What are your dependent, independent variables? What are potential confounders? |
|
Preliminary Question Design Steps |
1. Interdisciplinary research group a. What are the research questions/ objectives/hypotheses? 2. Focus groups 3. Draft questions 4. Cognitive laboratory interviews 5. Formal pre-testing 6. Pilot |
|
Focus group objectives |
To compare the reality about which respondents will be answering questions with the abstract concepts embedded in the study objectives |
|
Cognitive Laboratory Interviews what is it and what is the goal |
Respondents are brought into a laboratory setting -May be videotaped -Interviews conducted by cognitive psychologist or experienced investigator
Goal: To get information about how the respondent understood the questions and about the way they answered them |
|
What kind of people do we choose for cognitive lab interviews? |
Choose people who represent range of people to be interviewed in full-scale survey |
|
Pre-Testing Goal
Field pre-testing |
Find out how well the data collection protocols and survey instruments work under realistic conditions |
|
Quantitative methods for pre-testing |
1. Ask interviewers to fill out a rating for each question 2. Taping and behavior coding—systematic, reproducible |
|
Pre-testing Self-administered format |
Have respondents (who are similar to your survey population) fill out the survey and then discuss |
|
Pilot test |
Walk-through of entire study design Should differ from final survey only in scale Use representative sample of target population |
|
Interview Format |
Should have everything scripted, including introductions, instructions, transitions, definitions, and explanations |
|
Self-Administered Format |
Questionnaire should be self-explanatory (minimal instructions needed) Limit to closed questions Use short questions with consistent formats Minimize skips; make them very clear |
|
Open-ended questions, types and issues |
Short, specific -What is your current age? Long, narrative -Why did you choose to come to this clinic? Problems: -Illegible handwriting -Inappropriate detail -Usually avoid in quantitative surveys |
|
Open-ended Questions |
Advantages -Permit unanticipated answers -May better describe view of respondent -Respondents can use their own words |
|
Closed-ended Questions advantages |
Generally preferable -Respondent can perform more reliably when response alternatives are given -Researcher can interpret more reliably -Increases likelihood of finding something meaningful |
|
Close-ended question types |
Yes/No Questions Checklist Questions -From a list of alternatives, check those that apply -Problematic because you can’t distinguish a “No” from a skip -Yes/No may be better because it forces thought |
|
Multiple-choice questions |
Response alternatives should be mutually exclusive and exhaustive |
|
Semantic differential questions |
Two opposite adjectives at the ends |
|
Ranking questions |
Present alternatives and ask respondents to rank them |
|
Rating Scales |
Present a respondent with a question or statement and a range of responses -Distance between alternatives should be equal -Usually 3 to 7 are recommended |
|
Unipolar response alternatives |
Range from “nothing” to “a great deal” |
|
Bipolar alternatives |
Range from “large negative” through “zero” to “large positive” |
|
Balanced scales |
Should have equal numbers on either side of neutral Unbalanced scales will lead to bias |
|
For very complex, emotional issues, you may want to have 2 |
middle points on rating scale |
|
Behaviorally anchored scales |
Objective, quantitative |
|
Subjective Scales |
often, sometimes, never etc |
|
Behaviors to identify problems with instrument |
Question not read as worded Whether respondent asks for clarification Whether respondent gives response requiring probing |
|
Presence of 1 Q can affect |
answers to subsequent questions |
|
Order of Self-administered ?'s: |
begin with interesting (but not threatening) questions |
|
Order of Interview ?'s |
gain rapport, introduce questionnaire, enumerate members of household, then move into area of attitudes, sensitive topics |
|
Cognitive Assessment is used to |
1. evaluate questions in survey interviews 2. Way to learn how participants answer questions 3. locate sources of error
|
|
Cognitive Process Models |
1. Comprehension -Understand terminology 2. Interpretation -What info is being asked, what asked to do 3. Recall -Retrieval of info from memory 4. Judgment -About what information to provide -Decision based on emotions, feelings, social norms 5. Response -Given in form requested -Translation of internal thoughts into words or predetermined response |
|
Ways to reduce response distortion to sensitive questions |
1. Assure confidentiality of response 2. Communicate importance of accurate response 3. Reduce role of interviewer in data collection process |
|
Subjective States: |
1. Attitudes 2. Decisions 3. Needs 4. Behaviors 5. Lifestyle Patterns 6. Affiliations |
|
Steps to improve validity of subjective measures |
1. Make questions as reliable as possible 2. When ordering, more categories are usually better than fewer, but respondents need to be able to discriminate among categories 3. Ask multiple questions, with different forms, and combine them into a scale
|
|
Racial Assignments are devised for |
Social and political reasons, not scientific ones |
|
survey system |
collection of validated data |
|
In what ways is information collected directly or indirectly? |
directly: asking people
indirectly: reviewing written, oral, visual records of people's thoughts and actions or by observing people in natural or experimental settings |
|
Advantages of contracting data management |
-specialized expertise -potential ability to access national network of personnel -reduction of load on study personnel -third party (without financial or professional stake in results) increases legitimacy of the results |
|
Disadvantages of contracting data management |
-more expensive -lose direct control over quality of data and study conduct -may be more difficult to interpret data without having done the analysis |
|
Data Management activities include |
1. drafting analysis plan 2. creating a code book 3. establishing reliable coding 4. reviewing surveys for incomplete or missing data 5. entering data and validating accuracy of the entry 6. cleaning the data |
|
Data analysis plan does what? |
1. summarize methods 2. for each survey objective, identify and describe the relevant variables 3. identify the analysis methods 4. describe plan for handling missing values, outliers, zeros (if log transformed), and data collapsing 5. describe subgroup or by-group analyses 6. set up dummy tables and graphs |
|
Transcription errors |
any time someone records an answer or number incorrectly |
|
coding decision errors |
misapplication of the rules for equating answers and code values |
|
Ways to reduce error with the data code |
-clear rules -missing data codes for unanswered -consistency -exhaustive and non-overlapping categories -use of reliability measures, Kappa |
|
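The Kappa reliability measure mentioned above can be illustrated with a small sketch. It computes Cohen's kappa for two coders who coded the same answers; the function name `cohens_kappa` and the sample codes are hypothetical, chosen only for illustration.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who coded the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement, based on each coder's marginal code frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned to ten open-ended answers by two coders.
coder_a = [1, 1, 2, 2, 3, 3, 1, 2, 3, 1]
coder_b = [1, 1, 2, 3, 3, 3, 1, 2, 2, 1]
kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 3))
```

Kappa corrects raw agreement (here 8 of 10 items) for the agreement expected by chance, so it is lower than the raw agreement rate.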
3 ways to do data entry |
1. Data entered from a coded survey into a database or spread sheet 2. data entered directly into statistical program 3. Respondent or interviewer enters response directly into the computer |
|
Advant/disadvant of scan-tron sheets |
A: speed over manual entry, greater accuracy D: difficult to use code sheet, scanner has rigid tolerances |
|
Procedures for validating and verifying data |
1. run frequencies for categorical variables 2. run univariate statistics for continuous variables 3. examine key variables (those used in the evaluation of primary objectives) 4. look at variables by group 5. check missing values 6. calculate checks for error-prone variables 7. derive any key variables that need to be calculated from other variables and verify them too
|
|
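A few of the validation checks listed above might look like this in practice. This is only a sketch: the records, the valid codes, and the plausible age range are all assumptions made up for the example, not part of any real code book.

```python
from collections import Counter
import statistics

# Hypothetical entered records; None marks a missing value.
records = [
    {"sex": "F", "age": 34},
    {"sex": "M", "age": 29},
    {"sex": "F", "age": None},
    {"sex": "X", "age": 41},   # "X" is not a valid code in this sketch
    {"sex": "M", "age": 290},  # likely a keying error
]

VALID_SEX = {"F", "M"}   # assumed code book entries
AGE_RANGE = (18, 110)    # assumed plausible range

# Frequencies for a categorical variable expose invalid codes.
sex_freq = Counter(r["sex"] for r in records)
invalid_sex = [r for r in records if r["sex"] not in VALID_SEX]

# Univariate statistics for a continuous variable expose outliers.
ages = [r["age"] for r in records if r["age"] is not None]
age_summary = {
    "n": len(ages),
    "min": min(ages),
    "max": max(ages),
    "median": statistics.median(ages),
}

# Count missing values per variable.
missing = {k: sum(r[k] is None for r in records) for k in ("sex", "age")}

# Range check for an error-prone variable.
out_of_range = [
    r for r in records
    if r["age"] is not None and not AGE_RANGE[0] <= r["age"] <= AGE_RANGE[1]
]

print(sex_freq, age_summary, missing, out_of_range)
```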
What is an IRB and what does it do? |
-Institutional Review Board -evaluates research involving human subjects
-other countries call it ethics committee or ethical review board |
|
What is the primary responsibility of the IRB |
Protect rights and welfare of research subjects |
|
Federal agencies that require IRB approval |
-Department of Health and Human Services (NIH, CDC) -FDA |
|
What is the National Research Act of 1974 |
-Established the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research -Commission issued the Belmont Report in 1978 -Report set out basic ethical principles for research with human subjects |
|
Practice |
Interventions that are solely designed to enhance the well-being of an individual patient that have a reasonable expectation of success |
|
Research |
Designates an activity designed to test a hypothesis, permit conclusions to be drawn and thereby to develop or contribute to generalizable knowledge |
|
Research is a... not a... |
Privilege, right |
|
Public trust is achieved through |
accountability |
|
Accountability is achieved through |
Record keeping |
|
The 3 basic ethical principles |
-respect for persons -beneficence -justice |
|
Respect for Person |
-self rule -informed consent -acknowledgment of the right of an individual to hold views, choose, and act based upon personal goals, values, and beliefs -diminished autonomy |
|
Beneficence |
-above all, do no harm -benefit must be provided either directly to the participant and/or indirectly to society in the form of generalizable knowledge
|
|
Risks of research are justified by the potential |
benefits, ideally to the subject, but at least to society |
|
Justice |
Distribute the risks and potential benefits of research equally among those who may benefit from the research |
|
According to the rule of justice, who should not be included in research? |
vulnerable subjects |
|
IRB shall determine that |
1. risks to subjects are minimized 2. risks to subjects are reasonable in relation to the anticipated benefit 3. Equitable selection of subjects 4. Informed consent process is provided before the research begins and is properly documented 5. Subjects vulnerable to undue influence or coercion are adequately protected 6. Subject Privacy and confidentiality of data are maintained |
|
Types of submissions to the IRB |
1. new submissions 2. changes to approved research 3. continuing review (at least yearly) 4. serious adverse events 5. Protocol deviations |
|
IRB Categories of research |
1. exempt 2. expedited 3. Full-board 4. CIRB (National Cancer Institute Studies) -Submission is reviewed for expedited or full board status, pre-reviewed for completeness and then routed to appropriate reviewers |
|
IRB exempt research protocols |
-educational settings or tests, surveys, interviews, observation in public places (without identification or potential for civil liability, or of public officials) -collection of existing documents that are public -projects under the direction of a federal agency that examine public benefit programs -consumer acceptance studies, studies of food taste and quality -emergency use of test article (must be reported to IRB within 5 days) |
|
IRB expedited review |
-must be no more than minimal risk -non invasive collection of biological specimens -drugs where IND not required -retrospective or recorded data -continuing review of studies that are closed to enrollment, in follow-up or study in data analysis only |
|
IRB full-board review |
-protocols are reviewed by a primary and secondary reviewer and presented to the convened IRB for deliberation and vote -can give feedback to PI -changes must be made for final approval |
|
Survey research is often exempt from IRB oversight unless |
identifying information is collected and the disclosure of such information may cause harm to the subjects |
|
What are not considered human subjects |
establishment surveys (groups/orgs) |
|
Risk |
Potential harm, discomfort or inconvenience associated with the research that a reasonable person would be likely to consider significant in deciding whether or not to participate in the research |
|
Minimal risk |
the probability and magnitude of harm or discomfort anticipated in the research are not greater in and of themselves than those ordinarily encountered in daily life or during the performance of routine physical or psychological examination or tests |
|
Examples of interventions that are no greater than minimal risk |
-blood sampling -urine collection -MRI -Routine x-ray -saliva swab -standard psych testing -sexual history survey |
|
Examples of interventions that are greater than minimal risk |
-indwelling catheter -urinary catheter -MRI with IV contrast -CT scan -skin biopsy -extensive psych tests -sexual abuse survey |
|
Associated risks |
-harm or loss -inconvenience -psychological -social -economic -legal |
|
Components of informed consent |
-allowed to ask questions -can voluntarily withdraw -informed of research findings -fully informed about purpose of study -know benefits and risks |
|
Surrogate surveys |
elicit information from one person about another
two participants: respondent and the subject of the survey
protection and consent apply to both |
|
IRB Mandate |
Balance the need to study difficult/sensitive questions in vulnerable populations while protecting human subjects -includes experienced survey researchers |
|
Descriptive Statistics
|
to summarize the characteristics of a sample
|
|
inferential statistics
|
to test hypotheses: the information obtained from a sample is used to infer the characteristics of the population as a whole
|
|
Parametric
|
under certain assumptions about the distribution of the variables in the population from which the sample was drawn
|
|
non-parametric
|
under NO assumption of the distribution of the variables
|
|
Univariate
|
to describe the survey sample profiles or characteristics
|
|
Bivariate
|
to examine the relationship between two variables
|
|
Multivariate
|
to examine the relationship among an array of variables
|
|
existence vs. strength of an association
|
existence: chi-square, t-test, ANOVA
strength: RR , correlation coefficient, etc. |
|
independent sample:
|
in cross-sectional surveys there is no effort to re-interview the sampled people, so the groups being compared are independent of one another
|
|
related sample
|
in longitudinal design (studies), the same people are interviewed at different points in time, the groups being compared are not independent of one another
|
|
Excluding all incomplete information from analysis may
|
lead to biased estimates of parameters because it assumes that the complete observations are representative of all observations
|
|
imputation
|
converting incomplete data into a data set in which values for all people included in the study are present
|
|
imputation methods
|
-multiple copies of original data set made
-maximum likelihood and maximum pseudo-likelihood methods -weighted estimating eqn methods |
|
Multiple copies of the original data set made
|
-each with missing values randomly generated
-results from different data sets combined to take variability into account |
|
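The "multiple copies" approach can be sketched as follows. This is a deliberately simplified illustration: missing values are filled by random draws from the observed values (an assumption made here for brevity; real multiple imputation usually draws from a predictive model), and the function name `multiply_impute_mean` and the data are hypothetical. The combining step uses Rubin's rules, which add the between-copy variability to the average within-copy variance.

```python
import random
import statistics

def multiply_impute_mean(values, m=5, seed=0):
    """Estimate a mean from data with missing values (None) using m completed copies."""
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    if len(observed) < 2:
        raise ValueError("need at least two observed values")
    estimates, variances = [], []
    for _ in range(m):
        # Simplified imputation: fill each gap with a random observed value.
        completed = [v if v is not None else rng.choice(observed) for v in values]
        estimates.append(statistics.mean(completed))
        # Sampling variance of the mean within this completed copy.
        variances.append(statistics.variance(completed) / len(completed))
    # Rubin's rules: pool the estimates and combine both sources of variability.
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)
    between = statistics.variance(estimates) if m > 1 else 0.0
    total_variance = within + (1 + 1 / m) * between
    return pooled, total_variance

# Hypothetical survey item with two missing responses.
values = [4, None, 6, 5, None, 7]
pooled_mean, total_variance = multiply_impute_mean(values)
print(pooled_mean, total_variance)
```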
maximum likelihood and maximum pseudo-likelihood methods
|
take outcome and covariate distributions into account
|
|
weighted estimating eqn methods
|
weights for regression analysis provided by a model for missing data
|
|
We stratify to test for
|
interaction
|
|
Nominal
|
No numerical value, categorical scales, categorical data, categories mutually exclusive with no relationship between
|
|
Ordinal scale
|
a scale with choices that have inherent order
often for ratings of quality or agreement |
|
Numerical scale
|
a scale on which differences between numbers have meaning
|
|
independent variables
|
exposures, explanatory or predictor variables used to explain or predict
|
|
dependent variables
|
outcomes, responses, results
|
|
Correlation
|
the relationship between 2 variables
|
|
Coefficient of determination (R^2)
|
proportion of variation in the dependent variable associated with variation in the independent variable
|
|
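As a small worked example of correlation and R^2, the sketch below computes the Pearson correlation coefficient for two variables and squares it, which gives R^2 for a regression with a single independent variable. The helper name `pearson_r` and the data are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: x is the independent variable, y the dependent variable.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r ** 2  # share of the variation in y associated with variation in x
print(round(r, 4), round(r_squared, 4))
```

Here R^2 is 0.6, meaning 60% of the variation in y is associated with variation in x.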
Spearman's rank correlation
|
-relationship between 2 ordinal characteristics or 1 ordinal and 1 numerical characteristic
-used with numerical data when observations are skewed, with outliers |
|
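Spearman's rank correlation can be sketched as the Pearson correlation computed on ranks, with tied values sharing their average rank. The function names and data below are hypothetical; the example includes one extreme outlier to show why ranks help with skewed data.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y rises with x but has one extreme outlier.
x = [1, 2, 3, 4, 5]
y = [10, 20, 30, 40, 1000]
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering matters, the outlier does not pull the coefficient down: the relationship is perfectly monotonic, so rho is 1.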
Regression
|
predicts the dependent variable by using a set of independent variables
|
|
What test do you use when you compare 1 nominal independent variable with respect to one numerical dependent variable
|
2-sample t-test
|
|
Assumptions of statistical testing
|
-data normally distributed
-variances equal -if sample sizes are equal, unequal variances have no major effect on the significance level -if sample sizes are not equal, adjust degrees of freedom and use separate variance estimates -F-test can compare variances |
|
Nonparametric test
|
Wilcoxon rank-sum test
|
|
Use pooled variance estimate when
|
variances are equal, otherwise use separate estimates
|
|
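The difference between the pooled and separate variance estimates can be sketched as below. Both functions return only the t statistic and its degrees of freedom (no p-value); the function names and the two samples are hypothetical. The separate-variance version uses the Welch-Satterthwaite adjustment to the degrees of freedom.

```python
import math
import statistics

def pooled_t(a, b):
    """Two-sample t statistic using a pooled variance (assumes equal variances)."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled_var = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2

def separate_variance_t(a, b):
    """Two-sample t statistic using separate variances and adjusted degrees of freedom."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean.
    se2_a = statistics.variance(a) / na
    se2_b = statistics.variance(b) / nb
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df

# Hypothetical scores for two independent groups.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]

t_pooled, df_pooled = pooled_t(group_a, group_b)
t_sep, df_sep = separate_variance_t(group_a, group_b)
print(t_pooled, df_pooled, t_sep, df_sep)
```

With equal sample sizes and equal variances, as in this example, the two approaches give the same statistic and degrees of freedom; they diverge when the group variances or sizes differ.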
ANOVA
|
compares means of three or more groups
tells you about overall status of differences among groups |
|
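A one-way ANOVA F statistic can be computed by hand as the ratio of between-group to within-group variability. The sketch below uses a hypothetical function name and made-up data, and it returns only the statistic and degrees of freedom, not a p-value.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variability of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores for three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f_stat, df_between, df_within)
```

A large F says only that the group means differ overall; it does not say which groups differ from which.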
chi-square
|
-Compare proportions
-allows us to compare the expected frequency in each cell with the frequency that actually occurs -if a relationship exists, the two variables are said to be dependent |
|
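The comparison of expected and observed cell frequencies can be sketched for a contingency table as follows. The function name `chi_square` and the 2x2 table are hypothetical; expected counts are those implied by independence of the row and column variables.

```python
def chi_square(table):
    """Return (statistic, df, expected) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count in each cell if rows and columns were independent.
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected

# Hypothetical 2x2 table: rows are groups, columns are yes/no answers.
observed = [[30, 20],
            [20, 30]]
statistic, df, expected = chi_square(observed)
print(statistic, df, expected)
```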
index
|
simple accumulation of scores assigned to specific responses to individual items in the index
|
|
scales
|
constructed through assignment of scores to response patterns among several items in the scale
|
|
factor analysis
|
method to assess whether different items belong together in a scale
|
|
multitrait scaling analysis
|
-advanced technique measures how well groups of items hold together as scales
-used when looking at convergent and discriminant validity simultaneously -2+ traits measured by two or more methods at the same time |
|
factor analysis
|
-multivariate procedure used to assess whether different items belong together in a scale
-a factor is a hypothetical trait thought to be measured by the items in a scale -computer algorithm tests possible combinations of items and determines how they vary together |
|
multivariable analysis |
examination of the distribution of cases on one dependent and more than one independent variable. |
|
What are the measures of central tendency |
1. mean 2. mode 3. median |
|
Measures of dispersion |
1. range 2. standard deviation |
|
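The measures of central tendency and dispersion listed above can be computed with Python's standard `statistics` module. The data values are made up for illustration.

```python
import statistics

# Hypothetical responses to a numerical survey item.
data = [2, 3, 3, 5, 7]

# Measures of central tendency.
mean = statistics.mean(data)      # arithmetic average
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequent value

# Measures of dispersion.
value_range = max(data) - min(data)
std_dev = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)

print(mean, median, mode, value_range, std_dev)
```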
univariate analysis |
examination of the distribution of cases on only one variable at a time |
|
Bivariate analysis |
examination of the distribution of cases on one dependent and one independent variable |
|
What graphs are better for visualizing trends? |
Line Graph |