125 Cards in this Set

  • Front
  • Back
Cross-sectional studies
and diseases with long duration
Will identify high proportion of prevalent cases with
long duration

People with short disease duration may not be
identified as diseased

If disease duration is associated with exposure,
then results can be biased
*Example: Those with severe emphysema (outcome) are
more likely to smoke (exposure) and have higher case
fatality rate (shorter disease duration) than those with less
severe emphysema.
Internal Validity
Was the study well done?
Are the findings valid?

Need to consider…
- If there are major methodological problems
- If findings could be due to bias,
confounding, random error

**Important – Need to establish sound internal
validity before you consider generalizing the
results beyond the study population
External Validity
Aka “generalizability” to target population

To what extent are the participants you have
studied representative of all people with the
outcome of interest?

Need to examine…
- Who did not participate in the study
- Characteristics of study participants that might
preclude you from generalizing the study results
to others who were not in the study
Clinical Trial
Controlled study that prospectively evaluates
the effect of an allocated exposure (i.e.,
intervention) on the outcome of interest

Effects in which we’re interested: Safety,
efficacy, effectiveness

Considered “gold standard” of epi studies
Examples of Exposures and Outcomes in
Epidemiologic Research
Exposures:
Medications
• Surgical Procedures
• Behavior Modification
• Screening Programs
• Traits and Behaviors
• Genetic Variants
• Infectious Agents
• Environmental Toxins

Outcomes
• Death
• Disease
• Subclinical Indicators
of Disease
• Health-Related Traits
• Quality of Life
• Physical Function
• Costs
Key Parameters of Clinical Trials
Individual is unit of observation
Experimental design

Follow participants over time
-Collect data from at least two time
points (e.g., before exposure, after
exposure)
When Can a Clinical Trial be Conducted?
Clinical trials are justified when uncertainty
exists regarding the effectiveness of a
treatment (aka, EQUIPOISE)

EQUIPOISE: Legitimate uncertainty or
indecision as to choice or course of action…
because of an unknown balance of benefits
and risks

The researcher must believe that…
(1) what a study proposes to accomplish has
an excellent chance of being helpful (i.e.,
will contribute to generalized knowledge)
and
(2) he/she must have justified doubt about
the relative benefits of the comparison
treatment (which may be the “standard of
care” treatment)
When Clinical Trials Are Impossible
(or Nearly Impossible)
Adverse Exposures (e.g., cigarettes, other
toxins)

Rare Outcomes (e.g., Reye’s Syndrome)

Intervention Already in Wide Use (e.g.,
intensive care unit (ICU) medical care)
Basic Protocol in a Clinical Trial
1. Obtain approval of Institutional Review
Board (IRB)
2. Enroll participants
3. Gather “baseline” data from participants
4. Allocate exposure to participants
5. Follow-up participants to collect data on
outcome
6. Conduct data analyses
7. Report findings
Enrollment of Study Population
Enroll a study population with specific
characteristics designed to ensure success of the
trial and safety of participants

Careful application of inclusion/exclusion criteria

Such a study sample will yield the “cleanest”
results (high internal validity), but may
compromise generalizability (low external validity)
General Inclusion Criteria
• Able to provide informed consent
• At high risk for main outcome
• At low risk for adverse side effects
• No confounding medical conditions
• Likely to adhere to treatment and data
collection/study procedures
Non-Randomized Allocation of Exposure
Clinical Judgment
e.g., Good surgical candidate vs. Not

Participant Preference
e.g., Wants surgery vs. Doesn’t

Alternating
First Participant --> Treatment; Next --> Control

Day of Week
MWF --> treatment; TuThS --> Control

ID Numbers
Last Digit of SSN Odd --> Treatment; Even --> Control
Why Randomize Exposure Allocation?
Ensure that exposure assignment is unbiased

Produce similar groups at baseline by known
and unknown factors
Goal: any difference between the groups at
the end of the study will be the result of the
exposure / treatment / intervention

Minimizes the threat of selection bias

Avoids confounding by indication
Randomized Assignment
Unstratified by any variables
– Assignment is completely random
– Balanced in the long run, but may be
unbalanced in the short run

Stratified by key variables
–Ensures balance within subgroups defined by
key variables before randomization
–Stratification variable should be strongly
related to outcome (e.g., gender, risk level)
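As a minimal sketch of the two approaches above, the following Python code assigns participants to arms. The function names, the shuffle-then-alternate scheme inside each stratum, and the example data are illustrative assumptions, not something stated on the card.

```python
import random

def simple_randomize(ids, seed=0):
    """Unstratified: every participant gets an independent coin flip,
    so arm sizes can be unbalanced in a small sample."""
    rng = random.Random(seed)
    return {pid: rng.choice(["treatment", "control"]) for pid in ids}

def stratified_randomize(participants, seed=0):
    """Stratified: participants is a list of (id, stratum) pairs.
    Shuffle within each stratum, then alternate arms, so arm sizes
    within a stratum differ by at most one."""
    rng = random.Random(seed)
    strata = {}
    for pid, stratum in participants:
        strata.setdefault(stratum, []).append(pid)
    assignment = {}
    for members in strata.values():
        rng.shuffle(members)
        for i, pid in enumerate(members):
            assignment[pid] = "treatment" if i % 2 == 0 else "control"
    return assignment

# Hypothetical cohort: 10 high-risk and 10 low-risk participants.
people = [(i, "high" if i < 10 else "low") for i in range(20)]
arms = stratified_randomize(people)
```

With the stratified version, each risk level ends up with five participants per arm regardless of the seed; the unstratified version only balances on average.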
Confounding by Indication in Observational Studies
A bias when patients with the worst prognosis
are allocated preferentially to a particular
treatment.

High risk hypertensive patients are more likely to have
adverse outcomes.

High risk hypertensive patients are more likely to be
prescribed calcium channel blockers (than other
antihypertensive drugs).

Observational studies show that calcium channel
blockers are associated with more adverse outcomes
Factorial Design
Potentially economical way to test two
treatments simultaneously, if their modes of
action are independent

OR

Method to test for treatment synergy
- Is the effect of the combined treatment different
than expected based on the effects of the treatments
alone?
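One way to make the synergy question concrete is to compare the observed risk under both treatments with the risk expected if the two single-treatment effects simply added. The sketch below assumes an additive (risk difference) scale, which the card does not specify; the function name and inputs are hypothetical.

```python
def interaction_contrast(risk_neither, risk_a_only, risk_b_only, risk_both):
    """Observed combined-arm risk minus the risk expected if the
    effects of A and B were additive. A value near 0 suggests no
    synergy on the additive scale (an assumed scale)."""
    effect_a = risk_a_only - risk_neither
    effect_b = risk_b_only - risk_neither
    expected_both = risk_neither + effect_a + effect_b
    return risk_both - expected_both
```

A negative contrast for a harmful outcome would mean the combination lowers risk by more than the two separate effects predict.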
“Cross Over"
Crossing from one treatment group to the other

Unplanned crossover:
treatment non-adherence
procedures/protocol should be designed to minimize

Planned crossover design:
administration of treatments one after the other in random (or specified) order
treatment may be followed by a “washout” period
Planned “Cross Over”
Each participant serves as his/her own control
- creates comparability between treatment groups

Feasible only if…
- Outcomes are recurrent, and
- No “carryover” treatment effect after “washout” period

Randomize order of treatments
What is a placebo?
A placebo is an inactive or inert intervention or
agent that is given as a substitute for the
treatment and where the participant is not
informed he/she is receiving the active or
inactive intervention
Why Use a Placebo (aka Mask Participant)?
Masking subjects to exposure assignment
minimizes information bias

Equalizes psychological effects of an
“intervention” (aka placebo effect)
The Placebo Effect
The effect that is produced by a placebo.

The placebo effect is often measured by
comparison of the effect observed in patients
receiving the placebo with the effect observed
in patients receiving the active treatment.
Possible Differential Placebo Effects
Physical qualities of a pill, e.g. red vs. blue, big
vs. small, branded vs. not

Device/high tech intervention vs. not

Injection vs. oral placebo

Headache relief was 6% higher in groups
that received subcutaneous (i.e., an
injection) placebo versus oral placebo (de
Craen et al, J Neurol 2000)
Who to Mask/Blind in the Study and Why
Participants: Quantify placebo effects
Physicians: Uniform care apart from study
Data Collectors: Uniform outcome ascertainment
Data Analysts: Reduce threat of analytic bias
Partial Masking
In some circumstances masking of participants
and/or physicians may be impossible or unethical
(Surgery, behavior modification)

In this setting, others can generally still be
masked:
Data collectors
Adjudicators
Laboratory measurements
Data analysts
Ascertainment of Outcomes
Devise clear, a priori outcome definition(s)

Prefer “harder” to “softer” outcomes

Don’t forget about adverse effects/events

Mask to the fullest extent possible

Standardize methods and equipment

Train and certify data collectors

Conduct on-going quality assurance
Non-compliance
Study Non-compliance:
o Persons who stop participating in study
o Do not adhere to protocol

Treatment Non-compliance:
o Persons who do not take all of assigned
treatment (i.e., poor adherence)
Approaches to Non-compliance
Run-in period / pilot study – randomize subjects
after a trial period assessing compliance

Monitor noncompliance:
- Interview patients, count pills
- Medication bottle devices
- Blood or urine tests
- Directly observed treatment

In the setting of non-compliance, the observed
effect will likely be smaller than the true effect
CT Data Analysis Approach
1. Intention to Treat (ITT)

2. Treatment received
o Observational
o No longer have benefits of randomization

3. Subgroup analyses
o Small numbers
o Hard to determine that treatment effect differs
by sub-groups
Intention to Treat (ITT) Approach
Analysis by assigned treatment regardless of
the observed course of treatment

Maintains initial balance from randomization

Highlights problems from adverse effects

Conservative approach

Strongly recommended as primary approach
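The difference between analyzing by assigned treatment and by treatment received can be sketched with a few made-up records. The field names ("assigned", "received", "event") and the data are hypothetical, chosen only to show how the grouping key changes the estimate.

```python
def risk(records, key, arm):
    """Proportion with the outcome among records whose `key` equals `arm`."""
    group = [r for r in records if r[key] == arm]
    return sum(r["event"] for r in group) / len(group)

def risk_difference(records, key):
    """Risk in the control arm minus risk in the treatment arm."""
    return risk(records, key, "control") - risk(records, key, "treatment")

# Hypothetical trial: two participants assigned to treatment never took it.
records = [
    {"assigned": "treatment", "received": "treatment", "event": 0},
    {"assigned": "treatment", "received": "treatment", "event": 0},
    {"assigned": "treatment", "received": "control", "event": 1},
    {"assigned": "treatment", "received": "control", "event": 0},
    {"assigned": "control", "received": "control", "event": 1},
    {"assigned": "control", "received": "control", "event": 1},
    {"assigned": "control", "received": "control", "event": 0},
    {"assigned": "control", "received": "control", "event": 0},
]

itt_estimate = risk_difference(records, "assigned")         # keeps randomized groups
as_treated_estimate = risk_difference(records, "received")  # observational comparison
```

In this toy data the ITT estimate is the smaller of the two, which is the sense in which ITT is described as conservative.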
Number needed to treat (NNT)
Number of patients who would need to be treated
to prevent one outcome

NNT = 1 / (outcome frequency in untreated group
– outcome frequency in treated group)

Small NNT is good

Estimates often presented with 95% confidence
intervals
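The NNT formula above translates directly into code. This is a minimal sketch; the function name and the choice to raise an error when there is no risk reduction are assumptions.

```python
def number_needed_to_treat(risk_untreated, risk_treated):
    """NNT = 1 / (outcome frequency in untreated - outcome frequency in treated)."""
    absolute_risk_reduction = risk_untreated - risk_treated
    if absolute_risk_reduction <= 0:
        raise ValueError("treatment does not reduce risk; NNT is undefined")
    return 1 / absolute_risk_reduction
```

For example, if half of untreated patients and a quarter of treated patients have the outcome, the function returns 4: four patients treated for each outcome prevented.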
Number needed to harm (NNH)
Number of patients who would need to be treated
to cause one patient to be harmed (by treatment-related adverse events or side effects)

NNH = 1 / (adverse event frequency in treated
group – adverse event frequency in untreated
group)

Large NNH is good

Estimates often presented with 95% confidence
intervals
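The NNH formula has the same shape as NNT but subtracts in the opposite order, because the adverse event is expected to be more frequent in the treated group. A minimal sketch with a hypothetical function name:

```python
def number_needed_to_harm(adverse_risk_treated, adverse_risk_untreated):
    """NNH = 1 / (adverse event frequency in treated - frequency in untreated)."""
    absolute_risk_increase = adverse_risk_treated - adverse_risk_untreated
    if absolute_risk_increase <= 0:
        raise ValueError("treatment does not increase adverse events; NNH is undefined")
    return 1 / absolute_risk_increase
```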
Safety and Stopping
“Stopping rule”

A rule set before the start of the trial that
specifies a limit for the observed treatment
difference for the primary outcome which, if
exceeded, automatically leads to the termination
of the treatment or control arm (depending on
direction of the difference)
When to stop a clinical trial before
its scheduled end?
1. Clear evidence of benefit
2. Clear evidence of harm

--> Importance of plans to monitor the
progress of a trial
Monitoring Progress (Benefits/Harms)
in a Clinical Trial
Data Safety and Monitoring Board (DSMB):

Independent committee with responsibilities
to periodically review accumulated data for
evidence of benefit or harm (e.g., adverse
events) from the treatment

Responsible for making recommendations
for modifying the trial, including stopping
early, if appropriate
Why CTs Can Be Difficult
Hard to find and recruit the right people

Great responsibility on the investigator(s), need
for tremendous documentation, cost

May take years for outcomes to develop

People are free to do as they please:
- Some assigned to treatment don’t adhere
- Some assigned to control seek treatment
- Some drop out of the trial completely (loss-to-follow-up)
Advantages of CTs
"Gold standard” (Randomization) of epi studies

Designed to minimize bias

“Highest quality evidence available”

Results may be combined into systematic
reviews
NNT and herd immunity
Number needed to vaccinate (NNV)

If vaccine coverage rates are high, vaccination
will produce positive herd-immunity effects.
Hence, if high coverage rates are attained, the
population-based NNV will likely be lower.

On the other hand, if coverage rates are low,
accounting for herd-immunity effects will have
little or no effect on estimates of NNV
Limitations of CTs
Cost

Limited external validity
- Country, patient characteristics, study
procedures, outcome measures

Time to conduct and to publish findings

Difficult to study rare events

Difficult to study distant events

Narrowing of the studied question
Phases in Clinical Trials
I: Evaluate safety, dosage --> 10-20 healthy volunteers --> Unexpected side effects may occur

II: Evaluate efficacy --> About 200 patients --> Most drugs fail in Phase II due to being less efficacious than anticipated

III: Evaluate effectiveness --> More than 1,000 patients --> Likelihood to detect rare side effects increases with number of patients

IV: Evaluate long-term safety and effectiveness --> 1,000s of patients, “real life” evaluation outside of research environment --> Previously untested groups may show adverse reactions; postmarketing surveillance
Definition of Cohort Study
Observational epidemiologic study that follows
groups with common characteristics over time

Terms associated with cohort studies: follow-up, incidence, longitudinal study

Participants defined by exposure status, then
followed for outcomes of interest
Key Parameters of Cohort Studies
Individual is unit of observation

Observational design

Follow participants over time
-Collect data from at least two time points

Participants selected based on exposure
status, and all are “at risk” for the main
outcome at baseline
When is a Cohort Study Warranted?
Good evidence of an association of the disease
with a certain exposure

Exposure is rare, but incidence of disease
among exposed is high

Time between exposure and disease is short

Attrition of study population can be minimized
Types of Populations in Cohort Studies
Open or dynamic:
Changeable characteristic (e.g., smoking)
Members come and go; losses may occur
Incidence rate

Fixed:
Irrevocable event (e.g., birth of child)
Does not gain members; losses may occur
Incidence Rate

Closed:
Irrevocable event (e.g., natural disaster)
Does not gain members; no losses occur (observation period is short)
Cumulative incidence
Timing of Cohort Studies
Prospective – Looking forward in time
Participants grouped based on past or current
exposures and followed forward for outcome

Retrospective – Looking back in time
Both exposures and outcomes have already
occurred when study begins, data collection is
based on existing records (historical)

Ambidirectional – Looking both forward and
back in time
Basic Protocol in a Prospective Cohort Study
1. Obtain approval of Institutional Review
Board (IRB)
2. Enroll participants grouped by exposure
status
3. Gather “baseline” data from participants
4. Follow participants to collect data on
outcomes
5. Conduct data analyses
6. Report findings
Selection of Exposed Group (Cohort)
Depends on hypothesis, exposure frequency,
feasibility considerations

Special cohorts for rare exposures
-Uncommon workplace exposures, unusual diets,
uncommon lifestyles
-Occupational groups, religious groups

General cohorts for common exposures
-Professional groups or well defined geographic areas
Facilitate follow-up and accurate ascertainment of
outcomes
Selection of Unexposed Group (Cohort)
Internal comparison group
-Unexposed members of same cohort, ideal
option

General population
-Second best option, based on preexisting
population data on disease morbidity and
mortality

Comparison cohort
-Members of another cohort, least desirable
option, results can be difficult to interpret
Characterizing the Exposure (Cohort)
Exposed or index group vs. Unexposed or
referent or comparison group

Must specify minimum amount of exposure to
qualify as “exposed”

May divide exposed group into levels
-Example: High, medium, low
-Detection of dose response relationship
Things Happen Over Time
Affecting Exposures and Outcomes (Cohort)
Aging

Environmental exposures
-Lifestyle
-External environments

Disease identification
Cohort Sources of Information
Interviews

Medical and employment records

Direct physical exams

Lab tests and biological specimens

Environmental monitoring

And remember…
Each source has advantages and disadvantages
Need comparable procedures for data collection in
exposed and unexposed groups, including standard
outcome definitions and masking
Follow-up and Outcome Assessment (Cohort)
Exposed and unexposed groups monitored
for outcome(s) over time during follow-up

More than one outcome is usually studied

Follow-up can range from a few hours to
several decades

Difficult to maintain contact with participants

Aim to have a high “follow-up rate” by
minimizing losses to follow-up
Maximizing Participant Retention (Cohort)
Losses to follow-up (LTF) decrease sample size
LTF may be more likely to develop outcome!

Collection of data at baseline on participant,
friends, relatives, physicians

Regular contact via mail, phone, home visits

If possible LTF – Then, “Address Correction
Requested,” contacts provided at baseline,
directories, national registries, commercial
companies
Cohort Study Data Analysis Approach
Primary objective – Compare disease
occurrence in exposed and unexposed groups
-->Incidence rates, cumulative incidence

Person-time

Induction period – Interval between action of a
cause (e.g., exposure) and disease onset

Latent period – Interval between disease onset
and clinical diagnosis
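The two occurrence measures named above differ only in their denominators: people at risk at baseline versus accumulated person-time. A minimal sketch, with hypothetical function names and units (years):

```python
def person_time(follow_up_years):
    """Total person-years contributed by all participants."""
    return sum(follow_up_years)

def incidence_rate(new_cases, total_person_time):
    """New cases per unit of person-time at risk."""
    return new_cases / total_person_time

def cumulative_incidence(new_cases, n_at_risk_at_baseline):
    """Proportion of the baseline at-risk group that develops the outcome."""
    return new_cases / n_at_risk_at_baseline

def rate_ratio(cases_exp, pt_exp, cases_unexp, pt_unexp):
    """Incidence rate in the exposed divided by the rate in the unexposed."""
    return incidence_rate(cases_exp, pt_exp) / incidence_rate(cases_unexp, pt_unexp)
```

Person-time lets participants who are lost or who enter late still contribute the time they were actually observed, which is why the open and fixed populations on the earlier card use incidence rates.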
Disadvantages of Cohort Studies
Inefficient for rare outcomes

Poor info on exposures and other key variables
(retrospective)

Expensive and time consuming (particularly
prospective)

Inefficient for diseases with long induction and
latent periods (prospective)

More vulnerable to bias (retrospective)
Advantages of Cohort Studies
Efficient for rare exposures

Good information on exposures (prospective)

Can evaluate multiple effects of an exposure

Efficient for diseases with long induction and
latent periods (retrospective)

Less vulnerable to bias (prospective)

Can directly measure disease incidence or risk

Clear temporal relationship between exposure
and outcome (prospective)
Information Bias in Cohort Studies
Information bias is a flaw in collecting or
measuring exposure or outcome data that
results in different quality/accuracy of
information between comparison groups

In retrospective cohort studies, you rely on
past records

What if past records differed in quality and
extent of info between exposed and
unexposed persons?
Definition of a Case-Control Study
Observational epidemiologic study of persons
with the outcome of interest (“cases”) and
without (“controls”) that examines the presence
of particular attributes (“exposures”) in the two
groups

Participants defined by outcome status, then
exposures of interest are assessed

Highly efficient study design
Key Parameters of Case-Control Studies
Individual is unit of observation

Observational design

No follow-up of participants over time
(i.e., the investigator does not directly
collect data from the participant over time)
- Collect data at one time point

Participants selected based on outcome
status, then exposures are assessed
Types of Case-Control Studies
Population-based:
-Participants identified from within a source population
-No pre-existing study infrastructure
-Example: Inpatients at Johns Hopkins Hospital today

Nested:
-Source population is ongoing cohort study
-Benefits of cohort and case-control study designs
-Example: Participants in ALIVE cohort study
Basic Protocol in a
Case-Control Study
1. Obtain approval of Institutional Review
Board (IRB)
2. Identify and enroll cases and controls
3. Gather data from participants, records, etc
4. Conduct data analyses
5. Report findings
Selection of Cases (case control) : Case definition
Goal – accurate classification
Signs and symptoms
Physical and pathological exams
Results of diagnostic testing
Criteria used have implications for accuracy
of case definition
Best to use all available evidence
Disagreement about how to define disease,
disease definitions change over time
Selection of Cases: Sources of Case Identification(case control)
Hospital/clinic patient rosters
Death certificates
Special surveys (e.g., NHANES)
Special reporting systems (e.g., birth and/or death
registries)
Each source has advantages/disadvantages
Goal is to identify as many true cases as
possible (and as cheaply and quickly as
possible)
Selection of Cases
Incident versus prevalent cases(case control)
Incidence, if studying causes of disease
Prevalence, if studying duration of disease
Might not have a choice, so prevalent cases are used
Selection of Cases: Complete versus partial case ascertainment(case control)
As long as source population can be defined, then…
- Do not have to include all cases in a population
-A subset of cases is appropriate
Selection of Controls
Individuals without the disease(case control)
A sample of the population that produced the
cases
AKA, “referent group” because it “refers to”
exposure distribution in the source population
Cases and controls represent the same base
population…
So, a member of the control group who gets the
disease would end up being a case in the study
Selected independent of exposure status!
Assumption (case control)
Cases and Controls Originate
From Same Hypothetical Source Population
Selection of Controls(case control)
Sources of Control Identification
Population (preferred, if available)
Hospital/clinic
Deceased individuals
Case’s friends, spouse, and relatives
Each source has advantages/disadvantages
Remember - If a member of the control group
actually had the disease, would he/she end up
as a case in the study?
Ratio of Controls to Cases
Can increase the statistical power of the study
to detect an association by increasing the size
of the control group

Up to a ratio of 4 controls : 1 case will increase
power

Beyond 4:1, not considered worthwhile due to
costs
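The diminishing return beyond 4:1 can be illustrated with a commonly cited approximation: with k controls per case, the relative efficiency compared with an unlimited number of controls is about k / (k + 1). This formula is not on the card; it is brought in here as an assumption to show the shape of the curve.

```python
def relative_efficiency(controls_per_case):
    """Approximate efficiency of k controls per case relative to an
    unlimited number of controls: k / (k + 1). Assumed approximation."""
    if controls_per_case <= 0:
        raise ValueError("controls_per_case must be positive")
    return controls_per_case / (controls_per_case + 1)

# Efficiency for ratios 1:1 through 6:1.
curve = {k: relative_efficiency(k) for k in range(1, 7)}
```

Under this approximation, going from 1 to 4 controls per case raises efficiency from 0.5 to 0.8, while going from 4 to 6 adds less than 0.06.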
Methods of Sampling Controls
in a Nested Case-Control Study
Each time a case occurs, a control is selected
from the “risk-set” of individuals remaining in the
cohort without the condition

Cumulative incidence sampling
-Risk-set is defined at the end of follow-up
among those without the outcome

Incidence density sampling
-Risk-set is defined at the time of the case
-Controls can be future cases!
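The two sampling schemes differ in when the risk-set is defined. The sketch below assumes a hypothetical cohort stored as a dict mapping participant id to (time of event or end of follow-up, whether the participant became a case), and picks one control per case.

```python
import random

def incidence_density_sample(cohort, seed=0):
    """For each case, draw one control from those still under follow-up
    and disease-free at the case's event time. Future cases are eligible."""
    rng = random.Random(seed)
    cases = sorted((t, pid) for pid, (t, is_case) in cohort.items() if is_case)
    pairs = []
    for t_case, case_id in cases:
        risk_set = [pid for pid, (t, _) in cohort.items()
                    if pid != case_id and t > t_case]
        if risk_set:
            pairs.append((case_id, rng.choice(risk_set)))
    return pairs

def cumulative_incidence_controls(cohort):
    """Eligible controls are those without the outcome at the end of follow-up."""
    return [pid for pid, (_, is_case) in cohort.items() if not is_case]

# Hypothetical cohort: "a" and "b" become cases at times 2 and 5.
cohort = {"a": (2, True), "b": (5, True), "c": (10, False)}
```

Here participant "b" can be drawn as a control for case "a" and later appear as a case, while cumulative incidence sampling would only ever offer "c".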
Challenge in Case-Control Studies
Cases and controls may differ in characteristics or
exposures other than the one targeted for study
-Is study finding due to exposure, or due to differences
between cases and controls?

Solution via study design: Match cases and controls
for factors about which you’re concerned

Matching: Process of selecting the controls so they
are similar to the cases in certain characteristics
(e.g., age, race, sex, etc)
Matching (Case Control)
Group matching (frequency matching)
Proportion of controls with a certain characteristic
is identical to the proportion of cases with the
same characteristic
# of controls may be less than # of cases

Individual matching (matched pairs)
For each case, at least one control is selected
who is similar to the case for the characteristic of
interest
Problems with Matching (Case Control)
Practical –A lot of matching may make it
impossible to find a suitable control

Conceptual – Once you match controls to cases
by a certain characteristic, then you cannot
study that characteristic in your analysis

So, only match on factors you are convinced are
risk factors for the disease (and you therefore
don’t need to investigate)
Sources of Exposure Information(Case Control)
Questionnaires: Face-to-face, telephone, self-administered / Can obtain info on many exposures; must be carefully designed and administered to elicit accurate info; expensive

Preexisting records: Administrative, medical, regulatory / May be only available source of exposure info; avoids bias; may be incomplete; may lack uniformity and details; inexpensive

Biomarkers: Levels in blood, urine, bone, toenails / Estimate of internal dose; infrequently used because of difficulty identifying valid and reliable markers of exposure to noninfectious agents; expensive
Limitations in Recall(Case Control)
Virtually all people are limited to varying
degrees in their ability to recall information

If this limitation affects all subjects in a study to
the same extent, nondifferential misclassification may result

Generally leads to underestimation of the
association between exposure and outcome
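The pull toward underestimation can be seen numerically. The sketch below applies the same recall sensitivity and specificity to cases and controls and compares the odds ratio before and after; the counts, the 0.8 and 0.9 values, and the function names are all hypothetical.

```python
def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    """Exposure odds in cases divided by exposure odds in controls."""
    return (exp_cases * unexp_controls) / (unexp_cases * exp_controls)

def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected counts after imperfect recall of exposure.
    sensitivity: probability a truly exposed person reports exposure.
    specificity: probability a truly unexposed person reports none."""
    observed_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    observed_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return observed_exposed, observed_unexposed

# True counts: 60/40 exposed/unexposed cases, 30/70 controls.
true_or = odds_ratio(60, 40, 30, 70)

# Same recall error applied to both groups.
case_e, case_u = misclassify(60, 40, sensitivity=0.8, specificity=0.9)
ctrl_e, ctrl_u = misclassify(30, 70, sensitivity=0.8, specificity=0.9)
observed_or = odds_ratio(case_e, case_u, ctrl_e, ctrl_u)
```

With these inputs the observed odds ratio falls between 1 and the true value, that is, closer to no association.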
(Case Control)Recall Bias
Differential recall of past exposures/events
between cases and controls
-Different by amount of recall
-Different by accuracy of recall

Can affect study findings on exposure and
outcome association
Case-Control Study
Data Analysis Approach
Challenge: Often, investigators do not know
the size of the total population that produced
the cases

Assumption: Cases and Controls Originate
From Same Hypothetical Source Population

So, we don’t know how many people were “at
risk” for becoming a case (i.e., we don’t know
the denominator), so we can’t calculate
incidence, prevalence, associated measures

But, we can calculate the odds!

Odds of event = probability (p) the event will
occur divided by the probability the event will
not occur = p / (1-p)
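The odds definition above as a one-line function; the name and the input check are illustrative choices.

```python
def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)
```

A probability of 0.5 gives odds of 1 (one to one), and a probability of 0.2 gives odds of 0.25 (one to four).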
Disadvantages of Case-Control Studies
Inefficient for rare exposures

May have poor info on exposures because of
retrospective design

Vulnerable to bias because of retrospective design

Cannot establish temporal relationship between
exposure and disease
Advantages of Case-Control Studies
Efficiency
-Less time, less money than cohort studies,
experimental studies

Efficient for rare diseases

Efficient for disease with long induction and
latent periods

Can evaluate multiple exposures in relation to
outcome (so, good for diseases about which
little is known)
Measuring the Risk of Disease
Absolute risk = Incidence of disease
- No explicit comparison by exposure status

Measures of excess risk – Involves comparison
by exposure status
– Ratio measures (relative risk, prevalence
ratio, odds ratio)
– Difference measures (incidence difference,
prevalence difference)
What is a measure of association?
A quantity that expresses the strength or degree
of association (i.e. relationship) between
variables.

Association: Statistical dependence between
two or more events, characteristics, or other
variables.

In epi studies, the association in which we are
interested is between EXPOSURE and
OUTCOME

IMPORTANT:
An association may be fortuitous, spurious, or may
be produced by various other circumstances
The presence of an association does not
necessarily imply a causal relationship
When interpreting a measure of association… We need to evaluate:
1. Reference group - The choice of reference group
(‘referent’)
2. Direction - The direction of the estimated measure of
association provides information on the nature of the
influence of the exposure on the outcome
3. Magnitude - The magnitude of the estimated measure
of association provides information about the strength
of the relationship between the exposure and outcome
Approaches to Measuring Excess Risk
(i.e., Excess Incidence) in Measures of Assoc
1. Ratio of Risks:
Risk in Exposed/ Risk in Non-Exposed

2. Differences in Risk:
(Risk in Exposed) - (Risk in Non-Exposed)

Sound familiar from outbreak
investigation calculations?
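Both approaches take the same two inputs and differ only in the operation. A minimal sketch with hypothetical function names:

```python
def risk_ratio(risk_exposed, risk_unexposed):
    """Ratio of risks: how many times higher the risk is in the exposed."""
    return risk_exposed / risk_unexposed

def risk_difference(risk_exposed, risk_unexposed):
    """Difference in risks: excess risk in the exposed, in absolute terms."""
    return risk_exposed - risk_unexposed
```

For risks of 0.5 in the exposed and 0.25 in the unexposed, the ratio is 2 and the difference is 0.25; a ratio of 1 or a difference of 0 means no excess risk.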
Choice of the Reference Group
When comparing measures of disease frequency,
which group is the referent?

Depends on the research question!
-The choice may be arbitrary
-Often: Choose group with largest sample size
-Simple binary exposure: Reference = ‘Unexposed’

When reporting ratio measures of association in
tabular format, you typically see the reference group
noted by ‘1.0’ or ‘Reference’ or ‘REF’ in a table
Continuous Exposure: What is Reference?
Continuous exposure: Measured on interval scale
e.g., Diastolic Blood Pressure, CD4+ cell
counts, Age, Smoking Pack-years

What is the reference group?
-None!
-Measures of association reflect increased
disease frequency per unit increase in
exposure
-Need to be clear about units
Case-Control Study
Data Analysis Approach
Since we don’t know how many people were
“at risk” for becoming a case (i.e., we don’t
know the denominator), we can’t calculate
incidence, prevalence, associated measures,
like relative risk and prevalence ratio

But, we can calculate the odds!

Odds of event = probability (p) the event will
occur divided by the probability the event will
not occur = p / (1-p)
When is the OR a good estimate of the RR?
When the cases are representative of all people
with the disease in the population from which the
cases were drawn, with regard to history of the
exposure.

When the controls are representative of all
people without the disease in the population from
which the cases were drawn, with regard to
history of exposure.

When the disease is not frequent (i.e., rare).
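The rare-disease condition can be checked with two hypothetical 2x2 tables that share the same relative risk. The cell layout (a = exposed with disease, b = exposed without, c = unexposed with disease, d = unexposed without) and the counts are assumptions for illustration.

```python
def relative_risk(a, b, c, d):
    """Risk in exposed, a / (a + b), divided by risk in unexposed, c / (c + d)."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Cross-product ratio (a * d) / (b * c)."""
    return (a * d) / (b * c)

# Rare disease: 0.2% vs 0.1% risk.
rare = (20, 9980, 10, 9990)
# Common disease: 60% vs 30% risk.
common = (600, 400, 300, 700)

rr_rare, or_rare = relative_risk(*rare), odds_ratio(*rare)
rr_common, or_common = relative_risk(*common), odds_ratio(*common)
```

Both tables have a relative risk of 2. The odds ratio is about 2.002 in the rare-disease table but 3.5 in the common-disease table, so it only approximates the relative risk in the first.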
Attributable Risk For the Total Population
If the incidence in the total population is
unknown, it can be calculated if we know:
1. The incidence among exposed.
2. The incidence among unexposed.
3. The proportion of the total population that is
exposed.
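The three inputs listed above combine as a weighted average. The sketch below also computes the attributable risk for the total population as total incidence minus incidence among the unexposed, which is the usual definition but is not spelled out on the card; the function names are hypothetical.

```python
def total_population_incidence(inc_exposed, inc_unexposed, prop_exposed):
    """Weighted average of the two incidences by exposure prevalence."""
    return inc_exposed * prop_exposed + inc_unexposed * (1 - prop_exposed)

def population_attributable_risk(inc_exposed, inc_unexposed, prop_exposed):
    """Incidence in the total population minus incidence in the unexposed."""
    total = total_population_incidence(inc_exposed, inc_unexposed, prop_exposed)
    return total - inc_unexposed
```

With incidence 0.5 in the exposed, 0.25 in the unexposed, and half the population exposed, total incidence is 0.375 and the population attributable risk is 0.125.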
Reviewing Public Health Surveillance
Passive – Routine reporting of disease cases
seen in health care facilities

Active - Special search to find disease cases

Sentinel – Disease-specific reporting systems
in defined catchment areas

Syndromic – Uses already existing health-related data that precede diagnosis; supplements existing surveillance methods
Basic Concepts of Disease Causation
Disease causation is multi-factorial

Pathogenesis is generally a multi-step
sequence that may or may not result in
clinical disease

Agent-host-environment model
Epidemiologic Triangle
Agents:
Biologic (e.g., bacteria, virus)
Chemical (e.g., poison, smoke)
Physical (e.g., injuries)
Nutritional (lack and/or excess)

Host (factors):
Age
Sex
Genetics
Personality (e.g., risk taking)
Health status (i.e., previous disease)
Immune status
Nutritional status
Education

Environment:
(physical)
Temperature
Crowding
Housing
Neighborhood
Water
Sewage
Food
Radiation
Air pollution
(Social)
Social support
Behavioral modeling (e.g., family,
peers, culture, media)
Politics
Economics
“Risk Factor”
An attribute associated with the occurrence of
a health-related condition

- Aspect of personal behavior or lifestyle
- Environmental exposure
- Inborn or inherited characteristics
- Not necessarily causal

Several nuanced synonyms: predisposing
factor, risk marker, precursor, determinant
Three Essential Characteristics of a Cause
Association: a cause must occur together
with its effect; a statistical association must
exist between a cause and its putative effect.

Time order: a cause must precede its effect.

Direction: there is an asymmetrical
relationship between the cause and effect;
a change in the cause produces a change
in the effect; changing the effect does not
alter the cause.
Necessary Versus Sufficient Causes
Necessary cause = a cause that must
always precede an effect

Sufficient cause = a cause that always
produces an effect

Note: The effect need not be the sole result
of one cause.
Necessary Causes
If a disease is defined by the presence of an
agent, that agent is necessary by definition.

Example: Tuberculosis can only be caused by
the tubercle bacillus.

Contrast: Hepatitis can be caused by many
viruses, but Hepatitis C is caused only by the
Hepatitis C virus.
Any Given Cause May Be Necessary,
Sufficient, Both, or Neither
Necessary and sufficient = cause is always
present with disease; nothing but cause is
needed to result in disease
–Example: measles virus and measles

Necessary and not sufficient = cause is
always present with disease, but disease is
not always present with cause
–Example: HPV and cervical cancer

Not necessary and sufficient = cause may or
may not be present with disease, nothing but
cause is needed to result in disease
–Example: High-dose exposure to pesticides
or ionizing radiation and sterility in men

Not necessary and not sufficient = cause may or
may not be present with disease; if cause is
present with disease, then some additional
factor must also be present
–Example: sedentary lifestyle and coronary
heart disease
Necessary Conditions
“X is a necessary condition for Y” =
If we don't have X, then we won't have Y
OR
Without X, you won't have Y
To say that X is a necessary condition for Y
does not mean that X guarantees Y.
Sufficient Conditions
“X is a sufficient condition for Y” =
if we have X, we know that Y must follow
OR
X guarantees Y
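The two definitions can be checked against a set of observations. The sketch below is illustrative only: the function names and the (cause present, effect present) pairs are hypothetical, not taken from the course material.

```python
def is_necessary(observations):
    """X is necessary for Y if Y never occurs without X."""
    return all(x for x, y in observations if y)


def is_sufficient(observations):
    """X is sufficient for Y if Y always follows when X is present."""
    return all(y for x, y in observations if x)


# Hypothetical (cause present, effect present) pairs, for illustration only.
observations = [(True, True), (True, False), (False, False)]

necessary = is_necessary(observations)    # effect never appears without the cause
sufficient = is_sufficient(observations)  # cause appears once without the effect
```

In this made-up data the cause is necessary but not sufficient, the same pattern as the HPV and cervical cancer example.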
Epidemiologic Guidelines for Establishing
a Cause-Effect Relationship
Temporal sequence
Strength of the association
Dose response relationship / biologic gradient
Consistency of the association / replication
Coherence (biologic plausibility)
Specificity of the association
Experiment (cessation of exposure)
Analogy
Consideration of alternate explanations
Temporal Sequence
Study designs that can establish that the potential
“cause” (risk factor or treatment) precedes the
disease include:
-Clinical trial
-Cohort study

Study designs that cannot establish that the
potential “cause” preceded the disease include:
-Cross-sectional study
-Population-based case-control study
-Ecologic study
Strength of the Association
Measures of association include:
-Ratio measures
Examples: relative risk, odds ratio

-Difference measures
Examples: risk difference, rate
difference, attributable risk

The stronger the association between the
potential risk factor and the disease, the less
likely it is that the association is due to confounding.

But, it is possible on occasion for strong
associations to be produced by confounding,
biases, or chance variability
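A minimal sketch of how the ratio and difference measures are computed from a 2x2 table. The cell labels (a, b, c, d) follow the usual textbook layout, and the counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def measures_of_association(a, b, c, d):
    """Compute common measures from a 2x2 table.

    a = exposed with disease,    b = exposed without disease
    c = unexposed with disease,  d = unexposed without disease
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "relative_risk": risk_exposed / risk_unexposed,      # ratio measure
        "odds_ratio": (a * d) / (b * c),                     # ratio measure
        "risk_difference": risk_exposed - risk_unexposed,    # difference measure
    }


# Hypothetical counts: 30/100 exposed and 10/100 unexposed develop disease.
result = measures_of_association(a=30, b=70, c=10, d=90)
```

With these counts the risk is 0.30 in the exposed and 0.10 in the unexposed, so the ratio measures sit well above the null value of 1 and the difference measure sits above the null value of 0.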
Probabilistic Causality
The strength of a causal relationship is
assessed by the magnitude of its measures
of association.

The greater the RR or OR, the closer the
cause is to being necessary and/or
sufficient.
Strength of the Association
Dose-Response Relationship
Does risk for disease increase
with the degree of exposure?
Exception: “threshold effect”
Consistency of the Association
“Has it been repeatedly observed by different
persons, in different places, circumstances,
and times?”

Consistency is persuasive only if the studies
use different architectures, methodologies, and
subject groups and still come up with the same
results.

Example: Heart disease and high blood
pressure
Coherence of the Association
(Biologic Plausibility)
“The whole thing should make biologic and
epidemiologic sense.”

Caution: This criterion is used to test
whether evidence in favor of a hypothesis is
plausible based on currently existing theory
and knowledge
Specificity of the Association
“The precision with which the occurrence of
one variable, to the exclusion of others, will
predict the occurrence of another, again to the
exclusion of others.”

But, the closer one gets to specificity, the
easier it is to detect the association of the
cause and the disease.

The extreme case would be a necessary
and sufficient cause for a disease.
Specificity of the Association:
NOT a valid criterion!
1. “causes” have multiple effects
2. Diseases have multiple causes
When assessing evidence, ask if the
observed association could be…
1. Due to chance?
2. Due to bias?
3. Due to confounding?
Confidence Interval (CI)
A computed interval that, upon repeated
sampling, has a given probability (e.g., 95%)
of containing the true value of a statistical
parameter (e.g., ratio, proportion, rate).

In other words…

For a 95% confidence interval, if a single
population is repeatedly sampled, then 95%
of the intervals computed from those samples
would capture the true value of the
population parameter.

Expresses the precision of the point estimate
- Narrower interval = more precision
- Wider interval = less precision
Calculated with predetermined significance
level, α (alpha), which is often set at 0.05
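As an illustration of precision, the sketch below computes a 95% CI for a relative risk using the common log-transformation (Wald-type) approximation. This method is an assumption chosen for the example; the cards do not say which method is used. The counts are hypothetical.

```python
import math


def relative_risk_ci(a, b, c, d, z=1.96):
    """Relative risk with an approximate CI (z=1.96 corresponds to alpha = 0.05).

    a, b = exposed with / without disease
    c, d = unexposed with / without disease
    """
    rr = (a / (a + b)) / (c / (c + d))
    # Standard error of ln(RR) under the large-sample approximation.
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Same risks (0.30 vs 0.10) in a small and a tenfold larger hypothetical study.
small_study = relative_risk_ci(30, 70, 10, 90)
large_study = relative_risk_ci(300, 700, 100, 900)
```

Both studies give the same point estimate, but the larger study yields the narrower interval, i.e., the more precise estimate.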
Two Principal Types of Bias
1. Selection Bias
2. Information Bias
Selection Bias
Error due to systematic differences in
characteristics between those who take part
in a study and those who do not

The problem is that the association between
exposure and outcome may differ between
those who participate in the study and those
who do not.

The measure of association is distorted by
the procedures used to select subjects and
by factors that influence study
participation

Usually inferred, rather than observed
Selection Bias Types
• Self-selection bias
• Selection of controls
– Healthy worker effect
• Post-entry exclusion bias
Information Bias
A flaw in collecting or measuring exposure
or outcome data that results in different
quality/accuracy of information between
comparison groups

Can result in distortion of the measure of
association
Information Bias Types
• Misclassification
– Differential and non-differential with
respect to exposure and outcome
status
• Recall bias
• Reporting bias
• Interviewer bias
• Surveillance bias / biased follow-up
Bias can distort the measure of association
Away from the null (e.g., away from 1.0):
i.e., observe stronger than true association between
exposure and outcome
True RR = 1.5
Observed RR = 3.2

Toward the null (e.g., toward 1.0):
i.e., observe weaker than true association between
exposure and outcome
True OR = 2.6
Observed OR = 1.2
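The direction of bias for a ratio measure can be sketched as a comparison of distances from the null on the log scale. The function name and the "crosses the null" category are assumptions added for illustration; the two calls reuse the numbers from the card.

```python
import math


def bias_direction(true_ratio, observed_ratio):
    """Classify bias in a ratio measure (RR or OR) relative to the null value 1.0."""
    true_log = math.log(true_ratio)          # log scale puts the null at 0
    observed_log = math.log(observed_ratio)
    if true_log * observed_log < 0:
        return "crosses the null"            # observed effect is on the opposite side
    if abs(observed_log) > abs(true_log):
        return "away from the null"          # association looks stronger than it is
    if abs(observed_log) < abs(true_log):
        return "toward the null"             # association looks weaker than it is
    return "no bias"


first = bias_direction(true_ratio=1.5, observed_ratio=3.2)
second = bias_direction(true_ratio=2.6, observed_ratio=1.2)
```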
Distinguishing between random error (i.e.,
chance) and systematic error (i.e., bias)
Imagine that a given study could be
increased in size until it was infinitely large

Some errors would be reduced to zero; these
are the random errors

Other errors would not be affected by increasing
the size of the study; these are systematic
errors or bias
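A rough numeric sketch of the thought experiment, assuming a study that estimates a proportion. The true proportion (0.20) and the fixed measurement bias (0.05) are hypothetical values chosen for illustration.

```python
import math


def expected_errors(n, true_p=0.20, systematic_bias=0.05):
    """Return (random error, systematic error) for a study of size n.

    Random error is summarized by the standard error of a proportion,
    which depends on n. The systematic error is a fixed offset that
    does not depend on n.
    """
    random_error = math.sqrt(true_p * (1 - true_p) / n)
    return random_error, systematic_bias


small = expected_errors(100)
huge = expected_errors(1_000_000)
```

The random component shrinks as n grows, while the systematic component is identical for both study sizes.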
Confounding
A situation in which the measure of association
is distorted because of the relationship
between the exposure and a third factor that
also influences the outcome.

It is a true phenomenon, and not an error in
the study.

Distortion in a measure of association due to a
third variable that:
1. Is associated with the exposure
2. Influences the outcome
3. Is not in the causal pathway (i.e., not an
intermediate step between exposure and
outcome)
Consequences of Failing to
Account for Confounding
In this example, ignoring the confounder (i.e.,
smoking) would have resulted in a “false
positive” finding of the association between
heavy alcohol consumption and MI.

The opposite can also be true: ignoring a
confounder can obscure a true association
between an exposure and an outcome,
leading us to conclude that there is no
association when an association truly exists.
Confounding can be controlled for…
1. In the study design
• Matching in a case-control study
• Randomization in a clinical trial
2. In the data analysis
• Stratification
• “Adjustment” (e.g., age adjustment)
• Multivariate regression models
BUT, we must have collected the data!
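One way to "adjust" in the analysis is to stratify and then pool the stratum-specific estimates. The sketch below uses the Mantel-Haenszel pooled relative risk, which is an assumption chosen for the example (the cards do not name a specific method). All counts are hypothetical.

```python
def relative_risk(a, b, c, d):
    """a, b = exposed with/without disease; c, d = unexposed with/without disease."""
    return (a / (a + b)) / (c / (c + d))


def mantel_haenszel_rr(strata):
    """Pooled relative risk across strata of the confounder."""
    numerator = sum(a * (c + d) / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(c * (a + b) / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts (a, b, c, d), one tuple per level of the confounder.
strata = [
    (60, 140, 15, 35),   # confounder present: risk 0.30 in both exposure groups
    (5, 45, 20, 180),    # confounder absent:  risk 0.10 in both exposure groups
]

# Crude table: add the strata together cell by cell.
crude_table = [sum(cells) for cells in zip(*strata)]
crude_rr = relative_risk(*crude_table)
adjusted_rr = mantel_haenszel_rr(strata)
```

Here the crude estimate suggests an association, but the adjusted estimate equals the null value because the exposure has no effect within either stratum. The adjustment is only possible because the confounder was recorded for every subject.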
Interaction (i.e., Effect Modification)
If the size of the association between an
exposure and an outcome is changed or
modified by the level of a third variable,
interaction is said to be present
Interaction is also called “effect measure
modification”
Classic examples: age, immunization
Assessing Effect Modification
The presence or absence of an interaction
depends upon the measure of effect that is
being used, e.g., risk difference, relative risk

Note: If you detect effect modification, then
report it via stratum-specific estimates. Do
not “adjust” for it in your data analyses.
Model for Additive Effect
Combined Total Risk of A and B =
Baseline Risk
+ Attributable Risk (A)
+ Attributable Risk (B)

Combined Effect of A and B =
Attributable Risk (A)
+ Attributable Risk (B)
Model for Multiplicative Effect
Combined Total Risk of A and B =
Baseline Risk
x Relative Risk (A)
x Relative Risk (B)

Combined Effect of A and B =
Relative Risk (A) x Relative Risk (B)
Possible Types of Effect Modification
Antagonism: Combined effect less than
predicted by the model (negative interaction)

Synergism: Combined effect greater than
predicted by the model (positive interaction)
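The additive and multiplicative models can give different expected joint risks, so the same observed risk may count as synergism under one model and antagonism under the other. The sketch below uses hypothetical risks per 1,000 people; the function names are illustrative.

```python
def expected_joint_risk(baseline, risk_a_only, risk_b_only):
    """Expected risk with both A and B under each model."""
    additive = baseline + (risk_a_only - baseline) + (risk_b_only - baseline)
    multiplicative = baseline * (risk_a_only / baseline) * (risk_b_only / baseline)
    return additive, multiplicative


def interaction_type(observed, expected):
    """Compare the observed joint risk with the model's prediction."""
    if observed > expected:
        return "synergism"
    if observed < expected:
        return "antagonism"
    return "no interaction"


# Hypothetical risks per 1,000: baseline 10, A only 30, B only 50, both 100.
additive, multiplicative = expected_joint_risk(10, 30, 50)
on_additive_scale = interaction_type(100, additive)
on_multiplicative_scale = interaction_type(100, multiplicative)
```

With these numbers the additive model predicts 70 per 1,000 and the multiplicative model predicts 150 per 1,000, so an observed joint risk of 100 exceeds one prediction and falls short of the other.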
Comparison of
Confounding and Effect Modification
Confounding – Association between exposure and
outcome is distorted by a third variable related to
the exposure and outcome

Effect modification – The association between
exposure and outcome is modified by levels of a
third variable
Distinguishing between
confounding and effect modification
1. Make list of potential confounders and effect
modifiers (literature review, data collected)
2. Calculate “crude” measure of association for
exposure and outcome of interest
3. Stratify association by levels of potential
confounder or potential effect modifier
4. Compare crude vs. stratum-specific
associations…

If stratified associations are relatively similar
across strata AND different from crude, then you
have confounding

If stratified associations differ across strata AND
crude association seems to be a weighted average of
stratum-specific associations (i.e., crude measure
is between stratum-specific measures), then you
have effect modification
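Steps 2 through 4 can be sketched as follows. The 10% relative tolerance used to decide whether two estimates are "similar" is an arbitrary assumption for illustration, not a rule from the cards, and all counts are hypothetical.

```python
def relative_risk(a, b, c, d):
    """a, b = exposed with/without disease; c, d = unexposed with/without disease."""
    return (a / (a + b)) / (c / (c + d))


def assess_third_variable(strata, rel_tol=0.10):
    """Compare the crude RR with stratum-specific RRs and suggest an interpretation."""
    stratum_rrs = [relative_risk(*cells) for cells in strata]
    crude_rr = relative_risk(*[sum(cells) for cells in zip(*strata)])
    lowest, highest = min(stratum_rrs), max(stratum_rrs)
    if (highest - lowest) / lowest > rel_tol:
        verdict = "effect modification"   # estimates differ across strata
    elif abs(crude_rr - lowest) / lowest > rel_tol:
        verdict = "confounding"           # strata agree with each other, not with crude
    else:
        verdict = "neither"
    return crude_rr, stratum_rrs, verdict


# Hypothetical strata (a, b, c, d): similar stratum RRs, different crude RR.
confounded = assess_third_variable([(60, 140, 15, 35), (5, 45, 20, 180)])

# Hypothetical strata: stratum RRs of about 4 and 1, crude RR in between.
modified = assess_third_variable([(40, 60, 10, 90), (10, 90, 10, 90)])
```

In practice the comparison is a judgment call informed by confidence intervals and subject-matter knowledge, so a fixed cutoff like the one above is only a teaching device.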