50 Cards in this Set
- Front
- Back
Elements of Statistical Analysis (crude data analysis)
|
1. Parameter estimation
   a. Point estimate
   b. Interval estimate
2. Hypothesis testing |
|
what does point estimate give?
|
an idea of the strength of the association
|
|
what should be considered in estimating a confidence interval?
|
how precision and variability affect our measure of association/impact
|
|
what should we look at in hypothesis testing?
|
the role that random error (chance) may have played in our measure of association/impact using p-values.
|
|
what is point estimate?
|
A single value computed from study data to represent the extent of the association or magnitude of effect
Determined by many factors, such as bias and random error
Unlikely to equal the “true” population parameter |
|
what is Interval Estimate?
|
Gives the variability of the point estimate, an idea of its precision
Provides a range of possible values around the point estimate having the designated confidence (say 90%) that it includes the true population parameter
Provides an estimate of the statistical variation, or random error, that underlies the point estimate. |
|
when is the point estimate precise?
|
the interval is not wide.
|
|
when is the association "significant"?
|
the CI does not include the null value
|
|
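The point estimate, interval estimate, and null-value check can be put together for a risk ratio. The sketch below assumes cohort-style counts and a Wald-type interval on the log scale; the function name, argument names, and counts are illustrative assumptions rather than anything specified in the cards.

```python
import math
from statistics import NormalDist


def risk_ratio_ci(exposed_cases, exposed_total, unexposed_cases, unexposed_total,
                  confidence=0.95):
    """Risk ratio (point estimate) with a Wald confidence interval on the log scale."""
    rr = (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)
    se_log_rr = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical cohort: 30 of 100 exposed and 15 of 100 unexposed developed disease
rr, lower, upper = risk_ratio_ci(30, 100, 15, 100)
excludes_null = not (lower <= 1.0 <= upper)  # null value for a ratio measure is 1
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, excludes null: {excludes_null}")
```

A narrower interval indicates a more precise point estimate, and an interval that excludes 1 corresponds to a "significant" association at the matching alpha level.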
what are the assumptions for interpreting the C.I.?
|
-The only thing that would differ in hypothetical replications of the study would be the statistical, or chance, element in the data.
-The variability in the data can be described adequately by statistical methods and biases are nonexistent. |
|
what is the Hypothesis Testing about?
|
Quantitative assessment of a possible association between 2 variables using statistical procedures that ignore other variables
Is concerned with measuring the likelihood (probability) that a given set of results has been produced by chance
Focuses on a test hypothesis (generally a null hypothesis of no association) |
|
what is the General purpose for calculating CI and hypothesis testing?
|
to estimate the statistical uncertainty around the point estimate in relation to the true population parameter (which is unknown)
|
|
Components of Hypothesis Testing
|
Error
P-value
Power |
|
Type I error
|
α: rejecting H0 when H0 is true, i.e. claiming an association when there is none.
α is usually set at 0.05 for power and sample size calculations.
Can think of α as the false positive rate. |
|
Type II error
|
β: failing to reject H0 when it is not true, i.e. failing to claim an association when it exists.
Power = 1 - β.
β is inversely related to the magnitude of the difference, α, and sample size.
Can think of β as the false negative rate. |
|
What is a p-value?
|
The statistic used for hypothesis testing.
The probability, assuming the null hypothesis is true, that study data would show an association at least as strong as the one observed by “chance” alone.
Computed from the data.
Judged against an arbitrary cutoff, called the alpha (α) level, often set at 0.05.
As the p-value decreases, the likelihood that the null hypothesis will be rejected increases.
Best thought of as a measure of the compatibility between the data and the null hypothesis, rather than a strict probability. |
|
what is Power?
|
Probability of rejecting the null hypothesis when it is false; finding that an association is significant when in truth an association exists.
Related to type II error: Power = 1 - β
Example: if type II error = 20%, then the power of the test is 80% |
|
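The relationship Power = 1 - β can be made concrete with a rough calculation. The sketch below assumes a two-sided z-test comparing the means of two equal-sized groups with a known common standard deviation; the function name `approx_power` and the example numbers are illustrative assumptions, not part of the card set.

```python
from statistics import NormalDist


def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in means.

    Assumes two groups of equal size and a known common standard deviation.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)    # critical value for the chosen alpha
    se = sd * (2 / n_per_group) ** 0.5            # standard error of the difference
    shift = abs(effect) / se                      # true difference in standard-error units
    # Probability that the test statistic lands in either rejection region
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


# Hypothetical example: true difference of 5 units, SD of 10, 64 subjects per group
power = approx_power(effect=5, sd=10, n_per_group=64)
beta = 1 - power  # type II error rate
print(f"power = {power:.2f}, beta = {beta:.2f}")
```

Raising the sample size, the true difference, or alpha each raises the power returned by this function, which is the same as lowering β.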
what is statistical power for?
|
to identify systematic effects in the presence of random variation.
the probability of identifying a systematic effect when it truly exists. |
|
Hypothesis Testing – The Steps
|
1. State a test hypothesis
2. State an alternate hypothesis
3. Attempt to falsify the test hypothesis |
|
Frequentist vs. Bayesian
|
Frequentist approach:
- Point estimation
- Hypothesis testing
- Confidence interval
Bayesian approach:
- Posterior distribution and predictive distribution estimations
- Bayesian hypothesis testing (rarely used)
- Bayesian credible interval |
|
Summary: data analysis approach
|
Don’t use p < .05 mechanically
Increase sample size for greater precision
Report exact p-values
Report confidence intervals
Remember to always consider the clinical relevance and biologic plausibility of the association under study. |
|
what must you know about your data?
|
Identify outliers, information that doesn’t make sense in relation to your study question
|
|
what are Baseline characteristics for?
|
Compare cases and controls, or exposed and non-exposed
Compare data on relevant covariates |
|
what does Preliminary Analysis do?
|
-Crude analysis: chi-square tests, t-tests, Wilcoxon-Mann-Whitney, ANOVA, etc.
-Results of the first steps of your analysis are used to plan and to interpret more complicated analysis
-It is essential to look at the raw data before moving on to multivariate analysis
-Multivariate analysis: conditional logistic regression, linear regression, etc. |
|
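As an illustration of a crude analysis, the sketch below runs a Pearson chi-square test on a 2x2 exposure-by-disease table using only the standard library. The function name, the cell layout, and the counts are assumptions made for the example; no continuity correction is applied.

```python
import math


def chi_square_2x2(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Pearson chi-square test (1 degree of freedom, no continuity correction)."""
    a, b, c, d = exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(X > chi2) equals P(|Z| > sqrt(chi2)) for a standard normal Z
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 30 of 100 exposed and 15 of 100 unexposed developed disease
chi2, p_value = chi_square_2x2(30, 70, 15, 85)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

A result like this would normally be inspected alongside the raw table before moving on to a multivariate model.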
what do we do to plan a research project?
|
Start with a general idea of a research question
Conduct a literature review to determine data gaps
Formulate a research question, stated as a set of specific aims
Choose a study design
Calculate sample size
Design a statistical plan
Always consider potential limitations of your study and choose the specifics of your design and your analysis with those in mind. |
|
what does Validity represent?
|
How well the survey, biological test, or study approximates what it purports to measure.
Absence of systematic error (or bias)
Not affected by sample size |
|
what is Internal Validity?
|
The extent to which the investigator’s conclusions correctly describe what happens in the study sample
|
|
what is External Validity?
|
The extent to which the investigator’s conclusions are appropriate when applied to the universe outside the study
|
|
What does Reliability represent?
|
Strongly influenced by variability
This variability is called random error |
|
P-value p=0.01:
|
We have observed an association that is significantly different from the null value (RR=1): if the null hypothesis were true, the probability of observing an association at least this strong by chance alone would be 1 in 100.
|
|
Confidence Interval 95%CI:
|
If we did this study 100 times (took 100 different samples from the target population), approximately 95 of the 100 resulting intervals would cover the true population measure.
|
|
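The repeated-sampling reading of a 95% CI can be checked with a small simulation. This is only a sketch under stated assumptions: the data are drawn from a normal population with a known true mean, the interval uses the normal approximation for a sample mean, and the function name and default values are invented for illustration.

```python
import random
from statistics import NormalDist, mean, stdev


def ci_coverage(true_mean=10.0, true_sd=2.0, n=50, replications=1000,
                confidence=0.95, seed=1):
    """Fraction of simulated studies whose CI for the mean covers the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    covered = 0
    for _ in range(replications):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        point_estimate = mean(sample)
        half_width = z * stdev(sample) / n ** 0.5
        if point_estimate - half_width <= true_mean <= point_estimate + half_width:
            covered += 1
    return covered / replications


coverage = ci_coverage()
print(f"Share of intervals covering the true mean: {coverage:.3f}")
```

The share should land close to 0.95. Any single interval either covers the true value or does not; the 95% describes the procedure over many replications.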
state the sources of error (random or systematic)
|
Error can be introduced by the…
- Study observer/investigator
- Study participant
- Study instrument
During the process of…
- Selection of study subjects
- Measurement of disease and/or exposure
- Analysis or interpretation of findings |
|
what will get affected by Random error?
|
Precision (RELIABILITY)
|
|
what will get affected by Systematic error?
|
VALIDITY
|
|
How do we prevent threats to validity (systematic error) in our research?
|
1) Study design: Minimize Bias (more on this in upcoming lectures)
2) Study implementation: Quality Assurance & Quality Control
3) Use “validated tools” (best if validated in your population) |
|
Quality Assurance Activities (before data collection starts):
|
Development or identification of validated data collection instruments
Development of Manual of Procedures (outlines standardized data collection procedures)
Staff Training |
|
Quality Control Activities (after data collection starts):
|
Field Observation
Validity Studies*
*Note: During the study, validity studies can be used to assess performance of an already validated tool. Remember, if the tool has not already been validated, this should be done in a pilot study before your main study. |
|
How do we assess the validity of measurement tools for categorical variables?
|
Sensitivity: The ability of a test to identify correctly those who have the disease (or characteristic) of interest (a/(a+c))
Specificity: The ability of a test to identify correctly those who do not have the disease (or characteristic) of interest (d/(b+d)) |
|
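The two formulas above can be written out directly. The sketch assumes the usual 2x2 layout behind them, with a = true positives, b = false positives, c = false negatives, and d = true negatives; the function name and the counts are hypothetical.

```python
def sensitivity_specificity(a, b, c, d):
    """Return (sensitivity, specificity) from a 2x2 test-versus-truth table.

    a = test positive, disease present   (true positives)
    b = test positive, disease absent    (false positives)
    c = test negative, disease present   (false negatives)
    d = test negative, disease absent    (true negatives)
    """
    sensitivity = a / (a + c)  # share of diseased people the test flags
    specificity = d / (b + d)  # share of non-diseased people the test clears
    return sensitivity, specificity


# Hypothetical validation study: 100 people with the disease, 200 without
sensitivity, specificity = sensitivity_specificity(a=90, b=20, c=10, d=180)
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```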
How do we increase Reliability (the precision and reproducibility of data collected)?
|
1. Reduce intra-subject variability
-Repeated measurements
-Standardized data collection times
2. Reduce inter-observer variability
-Standardized diagnostic criteria, tests, and instruments
3. Increase sample size |
|
how to assess Reliability?
|
- Inter-rater: % agreement, kappa statistic
- Internal consistency: Kuder-Richardson 20, Cronbach’s coefficient alpha
- Test-retest: quantified by correlation coefficient
*See book for more examples* |
|
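For internal consistency, Cronbach's coefficient alpha can be computed from item-level scores. The sketch below uses the standard formula alpha = k/(k-1) × (1 - sum of item variances / variance of total scores); the function name and the five-respondent, three-item data set are made up for illustration.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])                                   # number of items
    items = list(zip(*scores))                           # regroup scores by item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical questionnaire: 5 respondents answering 3 items
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
alpha_coefficient = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha_coefficient:.2f}")
```

Note that this alpha is a reliability coefficient and is unrelated to the α level used in hypothesis testing.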
how to assess agreement between observers, instruments?
|
1. Percent (observed) agreement
2. Kappa measure
Give an example for each one |
|
what do values of kappa mean (range from –1 to 1)?
|
If kappa = 0, observed agreement same as chance alone
If kappa < 0, observed agreement worse than by chance alone
If kappa = 1, observed agreement = 100% (perfect!) |
|
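Percent agreement and kappa can both be computed from a square table of two observers' ratings. The sketch below implements Cohen's kappa as (observed - expected) / (1 - expected), where expected agreement comes from the row and column totals; the function name and the counts are hypothetical.

```python
def agreement_and_kappa(table):
    """Return (observed agreement, Cohen's kappa) for a square agreement table.

    table[i][j] = number of subjects rated category i by observer 1
    and category j by observer 2.
    """
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Agreement expected by chance alone, from the marginal totals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical ratings of 100 subjects by two observers (positive / negative)
ratings = [
    [40, 10],
    [5, 45],
]
observed, kappa = agreement_and_kappa(ratings)
print(f"observed agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

Here the observers agree on 85% of subjects, but half of that agreement is expected by chance, so kappa comes out near 0.70.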
what do values of kappa mean in medical research?
|
kappa > 0.75 excellent
0.40 < kappa < 0.75 good
0 < kappa < 0.40 marginal/poor |
|
what does precision affect?
|
Reliability
|
|
what does Confidence Interval affect?
|
Reliability
|
|
what does Systematic Error affect?
|
Validity
|
|
what does Reproducibility affect?
|
Reliability
|
|
what does Bias affect?
|
Validity
|
|
what does Random Error affect?
|
Reliability
|
|
what does Sample size affect?
|
Reliability
|
|
What does Confounding affect?
|
Validity
|