48 Cards in this Set

  • Front
  • Back
statistic
an estimate of a population parameter, calculated from data in a sample drawn from the population
Parameter
the term used to describe a characteristic of a population
selecting a statistical technique, based on...
1. how many variables you're looking at
2. the type of data you have
3. whether or not the data is "normally" distributed
4. # and size of samples
5. whether or not the samples serve as their own controls
-within-subjects (paired samples) OR
between-subjects (independent samples)
6. Whether you’re trying to find a difference between samples or establish a relationship between variables
discrete
can only take on certain values
Nominal (Categorical)
Ordinal
continuous
can take on (almost) any value; often reported using specific values (e.g., BP)
Interval
Ratio
nominal (categorical) data
-non-orderable qualitative discrete categories
-ex. race, ethnicity
-subset is dichotomous (binary): (gender; presence or absence of something)
ordinal data
-reflects a “greater or lesser” degree of something, or it may reflect a “precedes” or “superior” concept
-In contrast to nominal data, the numbers or discrete categories can be placed in a meaningful order
-However, the interval between the categories cannot be assumed to be equal
ordinal data: types
1. Ordered-Categories Data
Example: Cancer Stage
“I” is “better” than “II”, but not necessarily the same amount of difference between them as between “III” and “IV”
2. Rank-Ordered Data:
e.g., a committee may rank-order a series of program proposals and assign 1st, 2nd, 3rd, 4th, 5th, 6th, and 7th place to each. The top-ranked proposal may be considerably better than the ones ranked 2nd or 3rd, whereas the 4th ranked might not be so different from the 5th, 6th or 7th
continuous data
- Some continuous data, such as temperature, can take on any value, whereas other continuous data can only be whole numbers, such as # of children in a family.
interval data
-data is in a logical sequence
-the intervals are equal and represent actual amounts
-there isn't necessarily an absolute (meaningful) zero point
ex. IQ, quality of life
ratio data
-numbers that are continuous with equal intervals between them
- have a meaningful zero point
zero = total absence of whatever is being measured e.g., BP, income
descriptive statistics
-used to describe, simplify and organize data
1.Frequency distribution tables/graphs
2. Measures of Central Tendency (mean, mode, median)
3.Measures of Variability (range, standard deviation)
4.z-Scores
z-scores describe individual scores within a distribution
frequency distributions
-systematic arrangement of numeric values of a variable from lowest to highest, and a count of the number of times each value was obtained
-displayed in tabular or graphic form
advantages: presents entire set of scores; does not condense into single descriptive value
disadvantages: can be complex with large sets of data
shapes of distributions
1. symmetric vs. skewed
2. peakedness
3. modality (# of peaks)
-unimodal, bimodal, multimodal
measures of central tendency
1. mean: equals the sum of all scores divided by the total number of scores (most stable and widely used)
2. median: the point in a distribution above which and below which 50% of cases fall (descriptor of typical value when distribution is skewed)
3. mode: the most frequently occurring score in a distribution (useful as a gross descriptor, esp with nominal data)
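A minimal sketch of the three measures above using Python's standard-library `statistics` module; the scores are invented example data.

```python
# Computing the three measures of central tendency with the stdlib.
import statistics

scores = [2, 3, 3, 4, 5, 5, 5, 7]  # made-up sample data

mean = statistics.mean(scores)      # sum of scores / number of scores
median = statistics.median(scores)  # point with 50% of cases above and below
mode = statistics.mode(scores)      # most frequently occurring score

print(mean, median, mode)  # 4.25 4.5 5
```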
variability
-The degree to which scores in a distribution are spread out or dispersed
Homogeneity—little variability
Heterogeneity—great variability
indexes of variability
1. range
2. deviation score: how far any element in a distribution is from the mean
3. variance: mean of the squares of all the deviation scores
4. standard deviation: square root of the variance
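The four indexes above can be sketched directly; note that, matching the definition on this card ("mean of the squares of all the deviation scores"), the variance here divides by n. The data are invented.

```python
# The four indexes of variability on made-up data.
import math

scores = [4, 6, 8, 10, 12]
n = len(scores)
mean = sum(scores) / n                        # 8.0

rng = max(scores) - min(scores)               # 1. range
deviations = [x - mean for x in scores]       # 2. deviation scores
variance = sum(d**2 for d in deviations) / n  # 3. mean of squared deviations
sd = math.sqrt(variance)                      # 4. square root of the variance
```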
standard deviation
-the % of elements in a normal distribution is constant for a given # of SDs above or below the mean (e.g., ~68% within 1 SD, ~95% within 2 SDs)
z-scores
-to identify the precise location of a specific value within a distribution by using a single number
-The sign (+ or -) of the score indicates whether an individual is above or below the mean
-Indicates how many standard deviations there are between the score and the mean (SAT and ACTs)
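A one-line sketch of the z-score idea: the number of standard deviations between a score and the mean. The SAT-style mean and SD below are illustrative, not official values.

```python
# z-score: locate a value within a distribution with a single number.
def z_score(x, mean, sd):
    return (x - mean) / sd

# Hypothetical SAT-style scale: mean 500, SD 100.
z = z_score(650, 500, 100)   # positive sign -> above the mean
print(z)  # 1.5
```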
inferential stats
-used to make inferences about the population based on sample data
1. Parametric tests
2. nonparametric tests
3. tests of relationship b/t variables
-always preferable to analyze your data using parametric statistics
-However, when ANY of the parametric statistical conditions are not met, NON-parametric statistics are used
parametric tests
-tests that involve use of specific parameters and assumptions about a population (e.g., normal distribution and homogeneity of variance)
-Goal: to determine whether the observed Mean differences (either within groups or between groups) are larger than expected by chance (a significant result)
basic computations for parametric tests
1. a set of values (a sample) is obtained for each treatment condition
2. the mean and some measure of variability (SD) are computed
3. the difference b/t the sample means provides a measure of how much difference exists b/t groups or treatment
sampling distribution of the means
-If we took an infinite # of samples, and plotted all their means, we would get a normal curve
-the SD of this distribution is called the standard error of the mean (SEM)
-The formula to calculate this for a sample is based on the sample size (SEM decreases in proportion to the square root of the sample size)
-Use this to state how “confident” we are that the mean for the sample we have chosen is close to the mean for the whole population
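The SEM formula described above (sample SD divided by the square root of the sample size) is short enough to sketch directly; the SD and sample sizes are made up.

```python
# SEM = SD / sqrt(n): the SEM shrinks as the sample size grows.
import math

def sem(sd, n):
    return sd / math.sqrt(n)

# Same SD, quadruple the sample size -> SEM is halved.
a = sem(10.0, 25)    # 2.0
b = sem(10.0, 100)   # 1.0
```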
95% confidence interval
-95% of the random sample means will fall within 1.96 SEMs (SDs) of the true population mean OR
-95% of the time, a random sample mean will fall within 1.96 SEMs (SDs) of the true population mean
confidence interval
-useful in telling us how large (or small) the effect size might actually be
-The upper and lower values of this range are called the “confidence limits”
-larger sample sizes give narrower ranges
-If the CI includes the value that indicates no difference between groups, the effect is not considered to be statistically significant
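A sketch of a 95% CI built from a sample mean and the 1.96-SEM margin described on the cards above; the BP-style numbers are invented.

```python
# 95% CI: mean +/- 1.96 SEMs; the endpoints are the confidence limits.
import math

def ci95(mean, sd, n):
    margin = 1.96 * sd / math.sqrt(n)  # 1.96 SEMs
    return (mean - margin, mean + margin)

low, high = ci95(120.0, 15.0, 36)
# SEM = 15/6 = 2.5, margin = 4.9 -> limits (115.1, 124.9)
# Note: larger n -> smaller margin -> narrower range, as stated above.
```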
t-test
-uses data from sample(s) to test a hypothesis about a population in situations where the population SD is unknown (we have to use an estimated SEM)
-Make inferences about population means and mean differences from the sample mean(s) or mean differences
-1-tailed for directional hypotheses
-2-tailed for nondirectional hypotheses
T score values vary with sample size
as sample size gets smaller, t and z scores become increasingly different, and so
tables include degrees of freedom (basically n-1)
single-sample t-test
-uses data from a single sample to test a hypothesis about a population
the independent sample t-test
-uses data from two separate samples to test a hypothesis about the difference between two population means
the related sample (paired) t-test
(parametric test) for higher-order (interval/ratio) data
evaluates the mean difference between 2 tx conditions using data from a repeated-measures or matched-subjects experiment. A difference score (D) is obtained for each subject by subtracting the score in tx 1 from the score in tx 2
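The difference-score procedure above can be sketched as follows; the pre/post treatment scores are invented, and the t statistic is mean(D) over the estimated SEM of D.

```python
# Related-samples (paired) t statistic from difference scores.
import math
import statistics

tx1 = [10, 12, 9, 14, 11]   # made-up scores in treatment 1
tx2 = [13, 15, 10, 16, 14]  # made-up scores in treatment 2

D = [b - a for a, b in zip(tx1, tx2)]  # difference score per subject
n = len(D)
# t = mean difference / estimated SEM of the differences
t = statistics.mean(D) / (statistics.stdev(D) / math.sqrt(n))
# degrees of freedom = n - 1 = 4
```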
analysis of variance (ANOVA)
-used when more than one comparison is being made
-Attempt to determine if variability in results is just due to ordinary random variation or to known differences between the groups (IV)
F = variance between groups / variance within groups
1. 1-way: used if subjects in groups differ in only one factor
2. 2-way: used if subjects in groups differ in more than one factor
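A minimal one-way ANOVA sketch of the F ratio (variance between groups over variance within groups) on three invented equal-sized treatment groups.

```python
# One-way ANOVA F ratio on made-up data.
import statistics

groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
k = len(groups)                    # number of groups
n = sum(len(g) for g in groups)    # total observations

grand_mean = sum(sum(g) for g in groups) / n
# Between-groups sum of squares: group means vs. the grand mean.
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
# Within-groups sum of squares: scores vs. their own group mean.
ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

ms_between = ss_between / (k - 1)  # variance between groups
ms_within = ss_within / (n - k)    # variance within groups
F = ms_between / ms_within
```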
nonparametric tests: situations where parametric tests cannot be used
data is on nominal or ordinal scales; means and variances cannot be computed (an essential part of parametric tests)
the data do not satisfy the assumptions underlying parametric tests (normal distribution)
Hypotheses do not refer to population parameters
chi-square or goodness of fit
(nonparametric test)
-used when frequency data classifies individuals into distinct categories. Tests for differences between the expected and observed proportion of individuals with a given outcome (null hypothesis would be that there is no difference)
-the larger the value, the more likely the difference is not due to chance
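The goodness-of-fit statistic described above is a sum over categories of (observed − expected)² / expected; the counts below are invented, with a null hypothesis of an even split.

```python
# Chi-square goodness-of-fit statistic on made-up frequency data.
observed = [30, 20, 28, 22]      # observed counts per category
expected = [25, 25, 25, 25]      # null hypothesis: even 25/25/25/25 split

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
# (25 + 25 + 9 + 9) / 25 = 2.72; df = number of categories - 1 = 3
```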
the mann-whitney U test (nonparametric test)
-uses ordinal data (rank orders) from two separate samples to test a hypothesis about the difference between 2 populations or 2 treatment conditions
Wilcoxon T test (nonparametric test)
-uses the data from a repeated measures or matched-samples design to evaluate the difference between 2 tx conditions (used as an alternative to the related-sample t test in situations where the data can be rank ordered but do not satisfy the more stringent t test requirements)
relationship between variables
1. Correlation: used to establish and quantify the strength and direction of relationship between two variables
2. Regression: used to express the functional relationship between two variables (value of 1 variable is used to predict the value of the other)
Pearson correlation
-measures the degree of linear relation between 2 variables. The sign (+ or -) indicates the direction of the relationship. The magnitude (0 to 1) indicates the degree to which the data points fit on a straight line (interval or ratio data)
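A sketch of Pearson's r from deviation scores on invented paired interval/ratio measurements; a perfect positive linear relation gives r = +1.

```python
# Pearson correlation coefficient r on made-up paired data.
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]   # perfectly linear, positive direction

mx = sum(x) / len(x)
my = sum(y) / len(y)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # co-deviation
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)
r = sxy / math.sqrt(sxx * syy)   # sign = direction, magnitude = linear fit
```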
Spearman Correlation
-measures the degree to which the relationship between two variables is monotonic (consistently one-directional). Used when both variables (X and Y) are ranks measured on an ordinal scale
Linear regression
-purpose is to find the equation for the best fitting straight line for predicting Y scores from X scores
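A least-squares sketch of the best-fitting line Y = a + bX for predicting Y from X; the data are invented and lie exactly on Y = 1 + 2X, so the fit recovers that line.

```python
# Least-squares linear regression: slope b and intercept a on made-up data.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]   # lies exactly on Y = 1 + 2X

mx = sum(x) / len(x)
my = sum(y) / len(y)
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) \
    / sum((xi - mx) ** 2 for xi in x)   # slope
a = my - b * mx                          # intercept

def predict(xi):
    return a + b * xi   # predicted Y for a given X
```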
Chi-square test for Independence
-uses frequency data to determine whether or not there is a significant relationship between two variables. It can be used with nominal, ordinal, interval or ratio scales
measure of effect
a measure that provides an interpretable value for the subject under study
Example – difference between sample means
measure of significance
a measure that indicates whether the observed effect is likely due to chance or a true difference
Example – p-value
clinical hypothesis
-states the expected results of the research
null hypothesis (Ho)
-which usually states that there is NO difference between two (or more) populations or that two (or more) variables will NOT be correlated
(alternate hypothesis (Ha) is usually based on the assumptions of the research hypothesis)
Type 1 error **
-rejection of a null hypothesis when it should not be rejected—a false positive study result
-risk is controlled by the level of significance (alpha-tails on the curve), usually set at .05 or .01 (likelihood a result has occurred by chance)
P-value
-Probability of making a Type I error if H0 is rejected
type II error
-failure to reject a null hypothesis when it should be rejected—a false negative study result (results were, in fact, NOT due to chance, but were related to the study variable)
-Defined as β, and is usually set at 0.20
Power
-The ability of a study to detect an outcome of interest; the probability that a false null hypothesis will actually be rejected
-If a study does not have enough Power, it may not be able to detect an outcome of interest even if the study results are positive
-defined as 1-beta and usually set at .80
-do power calculations ahead of time
-best way to increase power is to increase the sample size