84 Cards in this Set
- Front
- Back
Randomized controlled trial
|
Excellent for determining treatment efficacy
But treatment effectiveness may be lower outside the context of the trial |
|
probabilistic
|
based on probabilities, risk, odds, tendencies, or what is true on average
|
|
Odds ratio
|
used in cross-sectional and case-control studies
|
|
Relative risk
|
used in cohort studies and randomized controlled trials
|
|
Hazard ratio
|
used in time to event studies such as survival analysis
|
|
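A minimal Python sketch of how these ratios are computed from a 2x2 table (the counts below are hypothetical):

    # Hypothetical 2x2 table: exposure (rows) by disease (columns)
    a, b = 30, 70     # exposed: 30 with disease, 70 without
    c, d = 10, 90     # unexposed: 10 with disease, 90 without

    risk_exposed = a / (a + b)                      # 0.30
    risk_unexposed = c / (c + d)                    # 0.10
    relative_risk = risk_exposed / risk_unexposed   # 3.0 (cohort studies, RCTs)

    odds_exposed = a / b
    odds_unexposed = c / d
    odds_ratio = odds_exposed / odds_unexposed      # about 3.86 (case-control, cross-sectional)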
Random assignment
|
in parallel group trials to control patient-related variables
|
|
Random sequencing
|
in crossover trials to control carryover effects
|
|
Intention-to-treat analysis
|
in randomized controlled trials on treatment efficacy
Patients are analyzed in the groups to which they were randomized, regardless of adherence |
|
Blinding (or masking)
|
in drug trials to control placebo effects
|
|
Stratification and matching
|
in case-control studies to control patient-related factors
|
|
Regression
|
to control patient-related factors in an observational study
|
|
Longitudinal research
|
incidence
|
|
cross-sectional
|
prevalence
|
|
Statistical analysis has at least 5 functions
|
To summarize data (descriptive statistics)
To control potential confounding variables (e.g., by using regression)
To quantify relationships (e.g., calculating partial eta squared)
To estimate an outcome (e.g., by making predictions from a regression equation)
To make inferences about population parameters from sample statistics (inferential statistics)
This involves the use of a test statistic such as chi-square, Z, t, or the F-ratio |
|
nominal variable
|
identify or classify but do not reflect quantity
|
|
ordinal variable
|
reflect a rank ordering of quantity
|
|
Lack of normality is indicated by
|
Positive or negative skewness
Positive or negative kurtosis |
|
Histograms and stem-and-leaf plots
|
good for displaying the shapes of distributions
|
|
Box plots
|
good for spotting outliers
|
|
standard error (SE)
|
measures the stability of a sample statistic
|
|
sample statistic
|
used to estimate the value of the corresponding population parameter
For example, a sample mean is used to estimate the value of the population mean
An accurate estimator of a population parameter has a low SE |
|
SE of the mean will be low if
|
The size of the sample is large or
The standard deviation of the sample data is small |
|
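A quick Python sketch of why this is so (the numbers are hypothetical):

    import math

    s = 12.0                       # sample standard deviation (hypothetical)
    se_n25 = s / math.sqrt(25)     # n = 25  -> SE = 2.4
    se_n400 = s / math.sqrt(400)   # n = 400 -> SE = 0.6; a larger n or a smaller s lowers the SE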
If a large number of samples of equal size are drawn from the population, and
If the population data are normally distributed or the sample size is large, then
|
About 95% of sample means will be within 2 SEs of the population mean
|
|
Log transformations are common
|
Used when the original variable is positively skewed or group variances are unequal
Usually the natural logarithm or the log to the base 10 is used
Taking the antilog or exponent of the log returns the data to its original units
The antilog or exponent of the mean of the log-transformed data is called the geometric mean |
|
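A minimal Python sketch of the log transformation and geometric mean (the data are hypothetical):

    import numpy as np

    x = np.array([1.2, 3.5, 8.0, 22.0, 60.0])   # hypothetical positively skewed values
    log_x = np.log(x)                            # natural log transformation
    geometric_mean = np.exp(log_x.mean())        # antilog of the mean of the logs
    # the geometric mean is smaller than x.mean() for positively skewed data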
test statistic
|
gives the probability of obtaining a sample statistic at least as extreme if the null is true
This probability is the p-value |
|
null
|
rejected in favor of alternative if p is small enough
Usually the cutoff is .05, but sometimes .10 or .01, depending on the context |
|
One sample t-test
|
Used to test hypotheses about the value of a population mean
Its p-value depends on the value of t and the number of degrees of freedom (df) |
|
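A minimal sketch using scipy (the data and the null value are hypothetical):

    import numpy as np
    from scipy import stats

    x = np.array([118, 125, 131, 122, 140, 128, 135, 119])   # hypothetical measurements
    t, p = stats.ttest_1samp(x, popmean=120)   # H0: the population mean is 120
    # p is two-tailed and depends on t and df = len(x) - 1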
If normality is not present
|
a robust test statistic or a non-parametric test statistic is used
But non-parametric tests have less statistical power than parametric tests |
|
sample proportion
|
used to infer the prevalence or incidence of disease in a population
A sample proportion is used to infer the value of a population proportion by
Constructing a confidence interval around the sample proportion
Testing null against alternative hypotheses about the population proportion |
|
Z
|
the test statistic if the underlying assumptions involving the sample size are met
|
|
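A minimal Python sketch of inference about a proportion, assuming the sample-size conditions for Z are met (the numbers are hypothetical):

    import math

    n, x = 200, 46                                   # hypothetical sample size and cases
    p_hat = x / n                                    # sample proportion = 0.23
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    ci_95 = (p_hat - 1.96 * se, p_hat + 1.96 * se)   # approximate 95% confidence interval

    p0 = 0.30                                        # hypothesized population proportion
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)  # Z statistic for H0: proportion = 0.30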
odds of disease
|
equal to the ratio of the probability of disease to 1 – the probability
An odds can be no lower than 0 but it has no upper limit |
|
Chi-square
|
The p-value depends on the value of χ² and the number of degrees of freedom (df)
As with any test statistic, a Type I error can occur |
|
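A minimal sketch using scipy (the table is hypothetical):

    import numpy as np
    from scipy import stats

    table = np.array([[30, 70],      # hypothetical 2x2 contingency table of counts
                      [10, 90]])
    chi2, p, df, expected = stats.chi2_contingency(table)
    # p depends on chi2 and df = (rows - 1) * (columns - 1)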
Cramér’s V
|
(if at least one variable is nominal)
varies from 0 to 1 |
|
Gamma
|
(if both variables are ordinal)
varies from -1 to +1 |
|
column variable
|
results of the criterion test
|
|
row variable
|
results of the screening or diagnostic test
|
|
Accuracy equals
|
the proportion of all classifications that are true positives and true negatives
Accuracy does not reveal if the test makes too many false positive or false negative errors |
|
Sensitivity equals
|
true positive rate
Low sensitivity equals a high false negative rate
False negative rate = 1 - sensitivity |
|
specificity
|
true negative rate
Low specificity equals a high false positive rate
False positive rate = 1 - specificity |
|
likelihood ratio
|
the ratio of the true positive rate to the false positive rate
|
|
posterior odds
|
equals its prior odds times the likelihood ratio of the test
Fagan’s nomogram is a visual shortcut for estimating posterior probabilities |
|
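A minimal Python sketch tying sensitivity, specificity, the likelihood ratio, and posterior odds together (the counts and the prior are hypothetical):

    tp, fn = 90, 10        # diseased patients: true positives, false negatives
    fp, tn = 30, 170       # non-diseased patients: false positives, true negatives

    sensitivity = tp / (tp + fn)                         # true positive rate = 0.90
    specificity = tn / (tn + fp)                         # true negative rate = 0.85
    likelihood_ratio = sensitivity / (1 - specificity)   # 6.0

    prior_prob = 0.20                                    # assumed pre-test probability
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    posterior_prob = posterior_odds / (1 + posterior_odds)   # about 0.60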
ROC curves
|
used to choose cutoff values for tests generating quantitative results
A ROC curve is a plot of sensitivity against 1 - specificity
The curve helps identify a cutoff that yields optimal levels of sensitivity and specificity
The area under the curve indicates the test’s overall diagnostic value
The null hypothesis is that the population area is .50 |
|
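A minimal sketch using scikit-learn, assuming quantitative test results are available (the data are hypothetical):

    import numpy as np
    from sklearn.metrics import roc_curve, roc_auc_score

    y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])                     # hypothetical disease status
    y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.5, 0.7])   # hypothetical test results

    fpr, tpr, thresholds = roc_curve(y_true, y_score)   # tpr = sensitivity, fpr = 1 - specificity
    auc = roc_auc_score(y_true, y_score)                 # .50 would indicate no diagnostic value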
scatter plot
|
displays the relationship and the best fitting straight line
The y-axis is the response or dependent variable (DV)
The x-axis is the explanatory or independent variable (IV) |
|
R Squared
|
the proportion of variability in the DV accounted for by the IV
|
|
Pearson correlation coefficient (r)
|
indicates the strength and direction of the relationship
It varies from -1 to +1 |
|
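A minimal sketch using scipy (the data are hypothetical):

    import numpy as np
    from scipy import stats

    x = np.array([1, 2, 3, 4, 5, 6], dtype=float)   # hypothetical IV
    y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.8])    # hypothetical DV

    r, p = stats.pearsonr(x, y)          # strength and direction, between -1 and +1
    r_squared = r ** 2                   # proportion of variability in the DV accounted for by the IV
    rho, p_rho = stats.spearmanr(x, y)   # rank-based alternative (Spearman's rho)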
Spearman’s rho (ρ)
|
used as an alternative to the Pearson correlation if
The relationship is non-linear but monotonic, or
One or both variables are not normally distributed
Spearman’s rho is interpreted in the same way as the Pearson correlation |
|
independent-samples t-test
|
yields a test statistic for comparing two independent group means
The null hypothesis is that the two population means are equal
The alternative hypothesis can be one- or two-tailed
The p-value depends on the value of t and the number of degrees of freedom (df) |
|
Levene’s Test
|
checks the assumption that the population variances of the two groups are equal
The null hypothesis is that the population variances are equal
The alternative hypothesis is that the population variances are not equal
For a t-test, if the null is rejected, the values of t and df are modified
For an analysis of variance, if the null is rejected, a robust test statistic is used in place of the F-ratio |
|
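A minimal sketch using scipy (the data and the .05 cutoff are hypothetical choices):

    import numpy as np
    from scipy import stats

    group1 = np.array([5.1, 6.0, 5.8, 6.4, 5.5, 6.2])   # hypothetical data
    group2 = np.array([4.2, 4.9, 5.0, 4.4, 4.8, 5.2])

    w, p_levene = stats.levene(group1, group2)    # H0: equal population variances
    equal_var = p_levene >= 0.05                  # if rejected, t and df are modified (Welch's test)
    t, p = stats.ttest_ind(group1, group2, equal_var=equal_var)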
one-way analysis of variance
|
used for comparing three or more means
The null hypothesis is that all of the population means are equal
The alternative hypothesis is two-tailed
The p-value depends on the value of the F-ratio and its numerator and denominator df |
|
partial eta squared
|
used to determine the strength of the relationship (effect size)
An effect can be small and yet statistically significantly different from zero |
|
post hoc
|
compares the means of two groups at a time
|
|
contrast
|
compares the means of two subsets of groups
|
|
Paired comparisons analysis
|
compares quantitative measurements of the same group taken twice
The null hypothesis is that the two population means are equal
The alternative is one- or two-tailed
The paired-samples t-test is used to generate a test statistic (t)
The p-value depends on the value of t and the number of degrees of freedom (df) |
|
Repeated measures analysis
|
compares quantitative measurements taken more than twice
The null hypothesis is that all of the population means are equal
The alternative hypothesis is two-tailed
The repeated measures analysis of variance is used to generate a test statistic (F-ratio)
The p-value depends on the value of the F-ratio and its numerator and denominator df |
|
means plot
|
often used to display the mean of each measurement
|
|
Mauchly’s Test
|
With 3 or more means, the assumption that sphericity is present is tested with Mauchly’s Test
The null hypothesis is that sphericity is present
The alternative hypothesis is that sphericity is not present
If the null is rejected, the numerator and denominator df are modified accordingly |
|
main effect
|
The individual effect of each IV
A main effect is indicated by a significant difference among the marginal means of the IV |
|
interaction effect
|
This happens when the combined effect of the 2 IVs differs from the sum of their main effects
It is indicated by significant changes in the effect of one IV across the levels of the second IV
If an interaction effect is present, the interaction means plot will display non-parallel lines |
|
“Best fitting”
|
determined by the least squares principle
That is, the sum of the squared residuals should be as small as possible |
|
intercept is the value of the DV when
|
the IV is zero
|
|
slope coefficient
|
shows the direction of the relationship between the DV and IV
The direction can be positive or negative, so the slope will be positive or negative
The slope coefficient also shows the degree of relationship between the DV and IV |
|
Unstandardized
|
Shows how much the DV changes on average for 1 unit change in the IV
|
|
Standardized
|
Shows how many SDs the DV changes on average for 1 SD change in the IV
|
|
sum of squares (SS)
|
The total variability in the DV is measured in terms of the sum of squares (SS)
Total SS (TSS) is equal to the regression SS plus the residual SS
That is, TSS equals SS accounted for by the regression line plus SS the line can’t account for |
|
correlation, R,
|
The correlation, R, between the DV and IV, can vary from -1 to +1
The goodness of fit of the regression line is assessed by the coefficient of determination (R²)
The coefficient is equal to the ratio of regression SS to total SS |
|
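A minimal sketch of a least-squares simple regression using scipy (the data are hypothetical):

    import numpy as np
    from scipy import stats

    x = np.array([1, 2, 3, 4, 5, 6], dtype=float)   # IV (hypothetical)
    y = np.array([2.0, 2.8, 4.1, 4.9, 6.2, 6.9])    # DV (hypothetical)

    result = stats.linregress(x, y)   # best fitting (least squares) line
    intercept = result.intercept      # value of the DV when the IV is zero
    slope = result.slope              # unstandardized: DV change per 1-unit change in the IV
    r_squared = result.rvalue ** 2    # coefficient of determination (regression SS / total SS)
    standardized_slope = result.slope * x.std(ddof=1) / y.std(ddof=1)   # equals Pearson r here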
Confidence intervals for mean predictions
|
narrower than confidence intervals for individual predictions
|
|
Confidence intervals for either individuals or means
|
wider for extreme values of the IV
|
|
prediction equation
|
consists of an intercept and unstandardized slope coefficients
|
|
Standardized slope coefficients
|
reflect the relative impact of the IVs on the DV
|
|
R
|
multiple correlation coefficient
This is the correlation between predicted and actual values of the dependent variable |
|
Dummy Variable
|
Categorical IVs can be used but they must each have only 2 values
For a categorical IV with more than 2 values, dummy variables are created
The number of dummy variables equals the number of categories minus 1
The group that is scored zero on all dummy variables is called the reference group
Slope coefficients on dummy variables are interpreted in terms of the reference group |
|
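A minimal sketch of dummy coding with pandas (the variable and its values are hypothetical):

    import pandas as pd

    df = pd.DataFrame({"smoking": ["never", "former", "current", "never", "current"]})

    # drop_first=True keeps k - 1 dummy variables; the dropped category is the reference group
    dummies = pd.get_dummies(df["smoking"], prefix="smoking", drop_first=True)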
logit
|
log of the odds (e.g., log of the odds that a patient has a given disease)
The intercept and slope coefficients are also logarithms |
|
exponent of the logit
|
the odds that a given patient has the disease
|
|
exponent of the intercept
|
the baseline odds
This is the odds of disease for a patient for whom the values of the IVs are zero
For example, the odds for a patient who has been exposed to none of the risk factors |
|
exponent of a slope coefficient
|
odds ratio (OR)
|
|
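A minimal Python sketch of converting logistic regression output back to odds (the fitted values are hypothetical):

    import math

    intercept, slope = -2.0, 0.8                 # hypothetical logit = intercept + slope * exposure

    baseline_odds = math.exp(intercept)          # odds of disease when all IVs are zero
    odds_ratio = math.exp(slope)                 # OR for a 1-unit increase in the IV (about 2.23)

    odds_exposed = math.exp(intercept + slope * 1)
    prob_exposed = odds_exposed / (1 + odds_exposed)   # odds converted back to a probability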
Wald statistic
|
tests the null hypothesis that a slope coefficient equals zero against a two-tailed alternative hypothesis
|
|
Adjustment
|
a useful way of controlling for the effects of potential confounding variables
|
|
Kaplan-Meier method
|
used to generate a cumulative survival function
This shows the cumulative proportion of patients alive at a given point in time
The function can take the form of a survival table or a graph |
|
Survival Fxn
|
survival function generates median and mean survival times
Median: Survival time at which 50% of patients have survived
Mean: Area under the survival curve |
|
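A minimal sketch using the lifelines package (follow-up times and censoring indicators are hypothetical):

    import numpy as np
    from lifelines import KaplanMeierFitter

    durations = np.array([5, 8, 12, 12, 20, 24, 30, 36])   # hypothetical follow-up times
    events = np.array([1, 1, 0, 1, 1, 0, 1, 0])            # 1 = death observed, 0 = censored

    kmf = KaplanMeierFitter()
    kmf.fit(durations, event_observed=events)
    survival_table = kmf.survival_function_        # cumulative proportion surviving over time
    median_survival = kmf.median_survival_time_    # time at which 50% have survived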
log rank test
|
generates a chi-square test statistic for comparing the survival functions of two or more groups
|
|
hazard function
|
plots the instantaneous risk of death at a given point in time
|
|
cumulative hazard function
|
plots the total risk up to a given point in time
|
|
Cox regression
|
used to fit the slope coefficient of the covariate
The slope coefficient is fitted to the data according to the principle of maximum likelihood
The regression assumes that the ratio of the two groups’ hazards is constant over time |
|
hazard ratio (HR)
|
The exponent of the slope coefficient
This is interpreted as a relative risk estimate (RR) |
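A minimal Python sketch of interpreting a fitted Cox coefficient (the coefficient is hypothetical):

    import math

    coef = -0.35                    # hypothetical slope coefficient for treatment (0 = control, 1 = treatment)
    hazard_ratio = math.exp(coef)   # about 0.70: roughly 30% lower instantaneous risk, assuming proportional hazards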