50 Cards in this Set

  • Front
  • Back
Elements of Statistical Analysis (crude data analysis)
1. Parameter estimation
a. Point estimate
b. Interval estimate

2. Hypothesis testing
What does a point estimate give?
an idea of the strength of the association
what should be considered in estimating a confidence interval?
how precision and variability affect our measure of association/impact
what should we look at in hypothesis testing?
the role that random error (chance) may have played in our measure of association/impact using p-values.
What is a point estimate?
A single value computed from study data to represent the extent of the association or magnitude of effect

Determined by many factors, such as bias and random error

Unlikely to equal the “true” population estimate
What is an interval estimate?
Gives the variability of the point estimate, an idea of its precision

Provides a range of possible values around the point estimate having the designated confidence (say 90%) that it includes the true population parameter

Provides an estimate of the statistical variation, or random error, that underlies the point estimate.
When is the point estimate precise?
When the confidence interval is narrow.
When is the association "significant"?
When the CI does not include the null value
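The point estimate, interval estimate, and null-value check can be sketched together. This is a minimal illustration, assuming a cohort-style 2x2 table and the common log-scale (Wald) interval for a risk ratio; the function name `risk_ratio_ci` and the counts are hypothetical and do not come from the cards.

```python
import math
from statistics import NormalDist

def risk_ratio_ci(a, b, c, d, level=0.95):
    """Risk ratio and log-scale Wald CI for a 2x2 table.

    a = exposed with disease,   b = exposed without disease
    c = unexposed with disease, d = unexposed without disease
    """
    rr = (a / (a + b)) / (c / (c + d))             # point estimate
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    z = NormalDist().inv_cdf(0.5 + level / 2)      # about 1.96 for 95%
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Hypothetical counts, for illustration only.
rr, lower, upper = risk_ratio_ci(a=30, b=70, c=15, d=85)
excludes_null = not (lower <= 1.0 <= upper)        # the null value for RR is 1
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, excludes null: {excludes_null}")
```

A narrower interval (for example, from larger counts with the same proportions) indicates a more precise point estimate.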
What are the assumptions in interpreting the C.I.?
-The only thing that would differ in hypothetical replications of the study would be the statistical, or chance, element in the data.
-The variability in the data can be described adequately by statistical methods and biases are nonexistent.
What is hypothesis testing about?
Quantitative assessment of a possible association between 2 variables using statistical procedures that ignore other variables

Is concerned with measuring the likelihood (probability) that a given set of results has been produced by chance

Focuses on a test hypothesis (generally a null hypothesis of no association)
What is the general purpose of calculating a CI and of hypothesis testing?
to estimate the statistical uncertainty around the point estimate in relation to the true population parameter (which is unknown)
Components of Hypothesis Testing


Type I error
α: rejecting H0 when H0 is true, i.e. claiming an association when there is none.
α is usually set at 0.05 for power and sample size calculations.
Can think of α as the false positive rate.
Type II error
β: accepting H0 when it is not true, i.e. failing to claim an association when it exists. Power = 1 - β.
β is inversely related to the magnitude of the difference; α; and sample size.
Can think of β as the false negative rate.
What is a p-value?
The statistic used for hypothesis testing.

The probability, assuming the null hypothesis is true, that study data will show an association at least as strong as the one observed by “chance” alone.

Computed from the data.

Judged by an arbitrary value, called the alpha (α) level - often set at 0.05.

As the p-value decreases, the likelihood that the null hypothesis will be rejected increases.

Is best thought of as a statistic that is a measure of the compatibility between the data and the null hypothesis – rather than a strict probability
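As a rough illustration of a p-value being computed from the data and judged against alpha, the sketch below runs a Pearson chi-square test on a 2x2 table. It assumes no continuity correction and uses the identity that, for 1 degree of freedom, the upper-tail probability equals `erfc(sqrt(chi2 / 2))`. The function name and the counts are hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) and p-value, 1 df."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X >= chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts, for illustration only.
chi2, p = chi_square_2x2(30, 70, 15, 85)
alpha = 0.05
print(f"chi-square = {chi2:.2f}, p = {p:.4f}, reject H0 at alpha = {alpha}: {p < alpha}")
```

Reporting the exact p-value alongside the confidence interval says more than the reject/do-not-reject decision alone.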
What is power?
Probability of rejecting the null hypothesis when it is false; finding that an association is significant when in truth an association exists.
Related to type II error: Power = 1 - β
Example: if type II error = 20%, then the power of the test is 80%
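The relationship Power = 1 - β can be explored numerically. The sketch below assumes a two-sided z-test comparing two group means with a known common standard deviation, which is a simplification chosen for illustration; `power_two_sample_z` and its inputs are hypothetical.

```python
from statistics import NormalDist

def power_two_sample_z(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two means.

    delta = true difference in means, sigma = common standard deviation.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    se = sigma * (2 / n_per_group) ** 0.5
    shift = abs(delta) / se
    # Probability that the test statistic lands in either rejection region.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)

# Hypothetical inputs: power rises (beta falls) as the sample size grows.
for n in (16, 32, 64, 128):
    power = power_two_sample_z(delta=5, sigma=10, n_per_group=n)
    print(f"n per group = {n:3d}  power = {power:.3f}  beta = {1 - power:.3f}")
```

Under these assumptions, a larger difference, a larger alpha, or a larger sample size each lowers β, and with no true difference the rejection probability equals alpha.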
what is statistical power for?
to identify systematic effects in the presence of random variation.

the probability of identifying a systematic effect when it truly exists.
Hypothesis Testing – The Steps
1. State a test hypothesis
2. State an alternate hypothesis
3. Attempt to falsify the test hypothesis
Frequentist vs. Bayesian
Frequentist approach:
Point estimation
Hypothesis testing
Confidence interval

Bayesian approach:
Posterior distribution and predictive distribution estimations
Bayesian hypothesis testing (rarely used)
Bayesian credible interval
Summary: data analysis approach
Don’t use p < .05 mechanically
Increase sample size for greater precision
Report exact p-values
Report confidence intervals
Always consider the clinical relevance and biologic plausibility of the association under study.
What must you know about your data?
Identify outliers and information that doesn’t make sense in relation to your study question
What are baseline characteristics for?
Compare cases and controls, or exposed and non-exposed
Compare data on relevant covariates
what does Preliminary Analysis do?
-Crude analysis:
Chi-square tests
T-tests, Wilcoxon-Mann-Whitney, etc
ANOVA, etc
-Results of the first steps of your analysis are used to plan and to interpret more complicated analysis
-It is essential to look at the raw data before moving on to multivariate analysis
-Multivariate analysis:
Conditional logistic regression, linear regression, etc
What do we do to plan a research project?
Start with a general idea of a research question
Conduct a literature review to determine data gaps
Formulate a research question – stated as a set of specific aims
Choose a study design
Calculate sample size
Design a statistical plan
Always consider potential limitations of your study and choose the specifics of your design and your analysis with those in mind.
what does Validity represent?
How well the survey, biological test, or study approximates what it purports to measure.

Absence of systematic error (or bias)
Not affected by sample size
what is Internal Validity?
The extent to which the investigator’s conclusions correctly describe what happens in the study sample
what is External Validity?
The extent to which the investigator’s conclusions are appropriate when applied to the universe outside the study
What does Reliability represent?
Strongly influenced by variability
Called random error
P-value p=0.01:
We have observed an association that differs significantly from the null value (RR=1): if the null hypothesis were true, the probability of observing an association at least this strong by chance alone would be 1 in 100.
Confidence Interval 95%CI:
If we did this study 100 times (took 100 different samples from the target population) approximately 95% of the time the interval would cover the true population measure.
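The repeated-sampling reading of a 95% CI can be checked with a small simulation. This sketch assumes normally distributed data and a z-based interval for a mean; the true mean, standard deviation, sample size, and seed are all hypothetical values picked for illustration.

```python
import random
from statistics import NormalDist, mean, stdev

def coverage(true_mean=10.0, sd=2.0, n=50, replications=2000, level=0.95, seed=1):
    """Fraction of simulated studies whose CI covers the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    hits = 0
    for _ in range(replications):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        m = mean(sample)
        half_width = z * stdev(sample) / n ** 0.5
        if m - half_width <= true_mean <= m + half_width:
            hits += 1
    return hits / replications

print(f"Share of intervals covering the true mean: {coverage():.3f}")
```

The share should land close to 0.95. Any single interval either covers the true value or it does not; the 95% describes the procedure over many repetitions.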
state the sources of error (random or systematic)
Error can be introduced by the…
Study observer/investigator
Study participant
Study instrument

During the process of…
Selection of study subjects
Measurement of disease and/or exposure
Analysis or interpretation of findings
what will get affected by Random error?
what will get affected by Systematic error?
How do we prevent threats to validity (systematic error) in our research?
1) Study design: Minimize Bias
(more on this in upcoming lectures)

2) Study implementation:
Quality Assurance & Quality Control

3) Use “validated tools” (best if validated in your population)
Quality Assurance Activities (before data collection starts):
Development or identification of validated data collection instruments
Development of Manual of Procedures (outlines standardized data collection procedures)
Staff Training
Quality Control Activities (after data collection starts):
Field Observation
Validity Studies *

During the study, validity studies can be used to assess the performance of an already validated tool. Remember, if the tool has not already been validated, this should be done in a pilot study before your main study.
How is the validity of measurement tools for categorical variables assessed?
Sensitivity: The ability of a test to identify correctly those who have the disease (or characteristic) of interest (a/(a+c))

Specificity: The ability of a test to identify correctly those who do not have the disease (or characteristic) of interest (d/(b+d))
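The two formulas can be written out directly. This sketch assumes the usual 2x2 layout implied by a/(a+c) and d/(b+d): a = true positives, b = false positives, c = false negatives, d = true negatives. The counts are hypothetical.

```python
def sensitivity_specificity(a, b, c, d):
    """Sensitivity and specificity from a 2x2 validation table.

    a = test positive, disease present (true positive)
    b = test positive, disease absent  (false positive)
    c = test negative, disease present (false negative)
    d = test negative, disease absent  (true negative)
    """
    sensitivity = a / (a + c)
    specificity = d / (b + d)
    return sensitivity, specificity

# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(a=80, b=30, c=20, d=170)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```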
How to increase reliability (the precision and reproducibility of data collected)?
1. Reduce intra-subject variability
-Repeated Measurements
-Standardized data collection times

2. Reduce inter-observer variability
-Standardized diagnostic criteria, tests, and instruments

3. Increase sample size
how to assess Reliability?
- Inter-rater
% agreement, kappa statistic

- Internal consistency
Kuder-Richardson 20 (KR-20), Cronbach’s coefficient alpha

- Test-retest
Quantified by correlation coefficient

*See book for more examples*
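As one example of an internal-consistency measure, Cronbach's coefficient alpha can be computed from item-level scores with the standard formula k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below is illustrative only; the function name and the scores are hypothetical.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha; scores[r][i] = score of respondent r on item i."""
    k = len(scores[0])
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical scores: 5 respondents, 3 questionnaire items.
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")
```

Values near 1 indicate that the items vary together across respondents.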
how to assess agreement between observers, instruments?
1.Percent (observed) agreement
2.Kappa measure
give an example for each one
what do values of kappa mean (range from –1 to 1)?
If kappa = 0, observed agreement same as chance alone
If kappa < 0, observed agreement worse than by chance alone
If kappa = 1, observed agreement = 100% (perfect!)
What do values of kappa mean in medical research?
κ > 0.75 excellent
0.40 < κ < 0.75 good
0 < κ < 0.40 marginal/poor
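Percent agreement and kappa can both be computed from a square table that cross-classifies two observers' ratings. This is a minimal sketch of Cohen's kappa, (observed - expected) / (1 - expected), with expected agreement taken from the marginal totals; the function name and the counts are hypothetical.

```python
def percent_agreement_and_kappa(table):
    """table[i][j] = subjects rated category i by observer A and category j by observer B."""
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance alone, from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical ratings of 100 subjects (positive/negative) by two observers.
observed, kappa = percent_agreement_and_kappa([[40, 10], [5, 45]])
print(f"observed agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

Here the observers agree on 85% of subjects, but because half of that agreement is expected by chance, kappa is noticeably lower than the raw percent agreement.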
what does precision affect?
what does Confidence Interval affect?
what does Systematic Error affect?
what does Reproducibility affect?
what does Bias affect?
what does Random Error affect?
what does Sample size affect?
What does Confounding affect?