50 Cards in this Set
 Front
 Back
Elements of Statistical Analysis (crude data analysis)

1. Parameter estimation
   a. Point estimate
   b. Interval estimate
2. Hypothesis testing

What does a point estimate give?

an idea of the strength of the association


what should be considered in estimating a confidence interval?

how precision and variability affect our measure of association/impact


what should we look at in hypothesis testing?

the role that random error (chance) may have played in our measure of association/impact, using p-values.


What is a point estimate?

A single value computed from study data to represent the extent of the association or magnitude of effect
Determined by many factors, such as bias and random error
Unlikely to equal the “true” population parameter

What is an interval estimate?

Gives the variability of the point estimate, an idea of its precision
Provides a range of possible values around the point estimate having the designated confidence (say 90%) that it includes the true population parameter
Provides an estimate of the statistical variation, or random error, that underlies the point estimate

When is the point estimate precise?

When the confidence interval is narrow.


When is the association "significant"?

When the CI does not include the null value (for example, RR = 1)
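
The point-estimate and interval-estimate cards above can be illustrated with a small worked example. This is a minimal Python sketch, not part of the original card set: the function name `risk_ratio_ci`, the cohort counts, and the choice of the log-based (Katz) standard error for the risk ratio are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def risk_ratio_ci(exposed_cases, exposed_total,
                  unexposed_cases, unexposed_total, confidence=0.95):
    """Return (point estimate, lower limit, upper limit) for a risk ratio.

    Assumption: the interval is built on the log scale with the usual
    large-sample standard error of log(RR).
    """
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed  # point estimate

    se_log_rr = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical cohort: 30/100 exposed and 15/100 unexposed develop disease
rr, lower, upper = risk_ratio_ci(30, 100, 15, 100)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
print("CI excludes the null value (RR = 1):", not (lower <= 1 <= upper))
```

A narrower interval around the same RR would indicate a more precise estimate; an interval that contains 1 would not be called "significant" at the chosen level.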


What are the assumptions in interpreting the C.I.?

The only thing that would differ in hypothetical replications of the study would be the statistical, or chance, element in the data.
The variability in the data can be described adequately by statistical methods, and biases are nonexistent.

What is hypothesis testing about?

Quantitative assessment of a possible association between 2 variables using statistical procedures that ignore other variables
Is concerned with measuring the likelihood (probability) that a given set of results has been produced by chance
Focuses on a test hypothesis (generally a null hypothesis of no association)

What is the general purpose of calculating a CI and of hypothesis testing?

to estimate the statistical uncertainty around the point estimate in relation to the true population parameter (which is unknown)


Components of Hypothesis Testing

Error
P-value
Power

Type I error

α: rejecting H0 when H0 is true, i.e. claiming an association when there is none.
α is usually set at 0.05 for power and sample size calculations.
Can be thought of as the false positive rate.

Type II error

β: accepting H0 when it is not true, i.e. failing to claim an association when it exists.
Power = 1 − β.
β is inversely related to the magnitude of the difference, α, and sample size.
Can be thought of as the false negative rate.

The p-value is:

The statistic used for hypothesis testing.
The probability, assuming the null hypothesis is true, that study data will show an association at least as strong as the one observed by “chance” alone.
Computed from the data.
Judged against an arbitrary value, called the alpha (α) level, often set at 0.05.
As the p-value decreases, the likelihood that the null hypothesis will be rejected increases.
Is best thought of as a measure of the compatibility between the data and the null hypothesis, rather than as a strict probability.
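
To make "computed from the data" concrete, here is a minimal Python sketch that derives a p-value from a 2x2 table with an uncorrected Pearson chi-square test. The function name `chi_square_2x2`, the cell layout, and the counts are hypothetical; the card set does not prescribe a particular test.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 degree of freedom, no continuity correction).

    Assumed cell layout:
                  diseased   not diseased
      exposed        a            b
      unexposed      c            d
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper-tail probability of the chi-square
    # distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts
chi2, p_value = chi_square_2x2(30, 70, 15, 85)
alpha = 0.05
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
print("Reject H0 at alpha = 0.05:", p_value < alpha)
```

Reporting the exact p-value alongside a confidence interval, rather than only "p < .05", conveys more than the reject/do-not-reject decision.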

What is power?

Probability of rejecting the null hypothesis when it is false; finding that an association is significant when in truth an association exists.
Related to type II error: Power = 1 − β
Example: if type II error = 20%, then the power of the test is 80%
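
The relation Power = 1 − β, and the way power depends on the size of the difference, α, and sample size, can be sketched numerically. The example below assumes a two-sided z-test comparing two group means with a known common standard deviation; that test, the function name `power_two_sample_z`, and the input values are illustrative assumptions, not something the cards specify.

```python
import math
from statistics import NormalDist


def power_two_sample_z(delta, sigma, n_per_group, alpha=0.05):
    """Power of a two-sided z-test for a difference in means.

    delta: true difference between group means (assumed)
    sigma: common standard deviation, treated as known (assumed)
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    se = sigma * math.sqrt(2 / n_per_group)
    shift = abs(delta) / se
    # Probability that the test statistic falls in either rejection region
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


power = power_two_sample_z(delta=0.5, sigma=1.0, n_per_group=64)
beta = 1 - power
print(f"power = {power:.3f}, type II error (beta) = {beta:.3f}")

# Larger samples, larger differences, or a larger alpha all raise power
for n in (20, 64, 200):
    print(n, round(power_two_sample_z(0.5, 1.0, n), 3))
```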

what is statistical power for?

to identify systematic effects in the presence of random variation.
the probability of identifying a systematic effect when it truly exists. 

Hypothesis Testing – The Steps

1. State a test hypothesis
2. State an alternate hypothesis
3. Attempt to falsify the test hypothesis

Frequentist vs. Bayesian

Frequentist approach:
  Point estimation
  Hypothesis testing
  Confidence interval
Bayesian approach:
  Posterior distribution and predictive distribution estimation
  Bayesian hypothesis testing (rarely used)
  Bayesian credible interval

Summary: data analysis approach

Don’t use p < .05 mechanically
Increase sample size for greater precision
Report exact p-values
Report confidence intervals
Remember to always consider the clinical relevance and biologic plausibility of the association under study.

What must you know about your data?

Identify outliers, information that doesn’t make sense in relation to your study question


What are baseline characteristics for?

Compare cases and controls, or exposed and nonexposed
Compare data on relevant covariates

What does preliminary analysis do?

Crude analysis:
  Chi-square tests
  t-tests, Wilcoxon-Mann-Whitney, etc.
  ANOVA, etc.
Results of the first steps of your analysis are used to plan and to interpret more complicated analyses
It is essential to look at the raw data before moving on to multivariate analysis
Multivariate analysis:
  Conditional logistic regression, linear regression, etc.

What do we do to plan a research project?

Start with a general idea of a research question
Conduct a literature review to determine data gaps
Formulate a research question, stated as a set of specific aims
Choose a study design
Calculate sample size
Design a statistical plan
Always consider potential limitations of your study and choose the specifics of your design and your analysis with those in mind.

what does Validity represent?

How well the survey, biological test, or study approximates what it purports to measure.
Absence of systematic error (or bias)
Not affected by sample size

what is Internal Validity?

The extent to which the investigator’s conclusions correctly describe what happens in the study sample


what is External Validity?

The extent to which the investigator’s conclusions are appropriate when applied to the universe outside the study


What does Reliability represent?

Strongly influenced by variability
Called random error 

P-value p=0.01:

We have observed an association that differs significantly from the null value (RR=1): if the null hypothesis were true, the probability of observing an association at least this strong by chance alone would be 1 in 100.


Confidence Interval 95%CI:

If we did this study 100 times (took 100 different samples from the target population) approximately 95% of the time the interval would cover the true population measure.
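
The "repeat the study many times" reading of a 95% CI can be checked by simulation. The sketch below is illustrative only: it assumes a normally distributed measurement with a known standard deviation, and the function name `ci_coverage`, the parameter values, and the seed are hypothetical.

```python
import random
from statistics import NormalDist, mean


def ci_coverage(true_mean=10.0, sigma=2.0, n=50,
                replications=2000, confidence=0.95, seed=1):
    """Fraction of simulated studies whose CI covers the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / n ** 0.5

    covered = 0
    for _ in range(replications):
        # One hypothetical replication of the study: draw a new sample
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        estimate = mean(sample)
        if estimate - half_width <= true_mean <= estimate + half_width:
            covered += 1
    return covered / replications


print(f"Share of intervals covering the true mean: {ci_coverage():.3f}")
```

The printed share should land close to 0.95. Note that the statement is about the procedure across replications, not about any single computed interval, and that it holds only when bias is absent.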


state the sources of error (random or systematic)

Error can be introduced by the…
  Study observer/investigator
  Study participant
  Study instrument
During the process of…
  Selection of study subjects
  Measurement of disease and/or exposure
  Analysis or interpretation of findings

what will get affected by Random error?

Precision (RELIABILITY)


what will get affected by Systematic error?

VALIDITY


How do we prevent threats to validity (systematic error) in our research?

1) Study design: Minimize bias (more on this in upcoming lectures)
2) Study implementation: Quality Assurance & Quality Control
3) Use “validated tools” (best if validated in your population)

Quality Assurance Activities (before data collection starts):

Development or identification of validated data collection instruments
Development of Manual of Procedures (outlines standardized data collection procedures)
Staff training

Quality Control Activities (after data collection starts):

Field observation
Validity studies*
*Note: During the study, validity studies can be used to assess the performance of an already validated tool. Remember, if the tool has not already been validated, this should be done in a pilot study before your main study.

How do we assess the validity of measurement tools for categorical variables?

Sensitivity: The ability of a test to identify correctly those who have the disease (or characteristic) of interest (a/(a+c))
Specificity: The ability of a test to identify correctly those who do not have the disease (or characteristic) of interest (d/(b+d))
(a = true positives, b = false positives, c = false negatives, d = true negatives)
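
The two formulas on this card can be applied directly to a 2x2 validation table. A minimal Python sketch with hypothetical counts follows; the cell letters follow the card's formulas, and the function name `sensitivity_specificity` is an assumption.

```python
def sensitivity_specificity(a, b, c, d):
    """Validity measures for a test compared against a gold standard.

    Cell layout implied by the formulas a/(a+c) and d/(b+d):
                       disease present   disease absent
      test positive          a                 b
      test negative          c                 d
    """
    sensitivity = a / (a + c)  # true positives / all with the disease
    specificity = d / (b + d)  # true negatives / all without the disease
    return sensitivity, specificity


# Hypothetical validation study: 100 diseased, 200 non-diseased
sens, spec = sensitivity_specificity(a=80, b=30, c=20, d=170)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```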

How do we increase reliability (the precision and reproducibility of data collected)?

1. Reduce intra-subject variability
   Repeated measurements
   Standardized data collection times
2. Reduce inter-observer variability
   Standardized diagnostic criteria, tests, and instruments
3. Increase sample size

How do we assess reliability?

Inter-rater: % agreement, kappa statistic
Internal consistency: Kuder-Richardson 20, Cronbach’s coefficient alpha
Test-retest: quantified by correlation coefficient
*See book for more examples*
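
As one illustration of an internal-consistency measure named above, Cronbach's coefficient alpha can be computed from item-level scores. This is a sketch under assumptions: the function name `cronbach_alpha` and the questionnaire data are hypothetical, and sample variances are used throughout.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's coefficient alpha.

    scores: one row per respondent, one column per questionnaire item.
    """
    items = list(zip(*scores))  # regroup the scores item by item
    k = len(items)
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical 3-item scale answered by 5 respondents
scores = [
    [3, 4, 3],
    [5, 5, 4],
    [2, 2, 3],
    [4, 4, 5],
    [1, 2, 1],
]
print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")
```

Values closer to 1 indicate that the items vary together, which is what internal consistency is meant to capture.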

How do we assess agreement between observers or instruments?

1. Percent (observed) agreement
2. Kappa measure
(Give an example of each.)

what do values of kappa mean (range from –1 to 1)?

If kappa = 0, observed agreement is the same as by chance alone
If kappa < 0, observed agreement is worse than by chance alone
If kappa = 1, observed agreement = 100% (perfect!)

what do values of kappa mean in medical research?

kappa > 0.75: excellent
0.40 < kappa < 0.75: good
0 < kappa < 0.40: marginal/poor
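
Percent agreement and kappa from the agreement cards above can both be computed from a 2x2 table of two raters' classifications. The Python sketch below uses hypothetical counts and a hypothetical function name, `agreement_and_kappa`; it implements Cohen's kappa for two raters and two categories.

```python
def agreement_and_kappa(both_yes, a_yes_b_no, a_no_b_yes, both_no):
    """Observed agreement and Cohen's kappa for two raters (A and B)."""
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no
    observed = (both_yes + both_no) / n

    # Agreement expected by chance, from each rater's marginal totals
    a_yes = both_yes + a_yes_b_no
    b_yes = both_yes + a_no_b_yes
    expected = (a_yes * b_yes + (n - a_yes) * (n - b_yes)) / n ** 2

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical ratings of 100 subjects
observed, kappa = agreement_and_kappa(40, 10, 5, 45)
print(f"percent agreement = {observed:.0%}, kappa = {kappa:.2f}")
```

With these counts the raters agree on 85% of subjects, but half of that agreement is expected by chance alone, so kappa is noticeably lower than the raw percent agreement.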

what does precision affect?

Reliability


what does Confidence Interval affect?

Reliability


what does Systematic Error affect?

Validity


what does Reproducibility affect?

Reliability


what does Bias affect?

Validity


what does Random Error affect?

Reliability


what does Sample size affect?

Reliability


What does Confounding affect?

Validity
