82 Cards in this Set

  • Front
  • Back
Hypothesis testing
The formal process of testing whether an individual measurement or statistic (such as a mean) estimates some characteristic of a population.
Alpha level
The point in a sampling distribution at which one rejects Ho. Alpha is the probability of a Type I error.
Null hypothesis
In hypothesis testing, the assertion that an individual measurement or statistic (such as a mean) estimates (or points to) a reference population. The null hypothesis is often symbolized as Ho.
Confidence interval
The interval in which some population parameter (such as a mean) exists with a given level of confidence, such as 95%.
Sampling distribution
A distribution of a statistic (such as a mean) computed from a very large number of equally sized samples.
Standard error
The standard deviation of a statistic (such as a mean) from a sampling distribution.
Rival hypothesis
The logical alternative to the null hypothesis.
T-distribution
A bell-shaped sampling distribution used when the sample is small and the population standard deviation is unknown.
Degrees of freedom
In statistical analysis, the number of numbers that are free to take on a value without restriction.
One sample z test
In hypothesis testing, a z calculation where one tests the hypothesis that a sample mean points to a reference population mean. For this test, the sample size needs to be at least 30 and the population standard deviation is known.
One sample t test
In hypothesis testing, a t calculation where one tests the hypothesis that a sample mean points to a reference population mean. This test is calculated when the sample size falls below 30 and/or the population standard deviation is unknown.
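As a quick sketch of how the two test statistics above are computed (Python; the sample numbers below are made up for illustration):

```python
import math

def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    # z = (x-bar - mu) / (sigma / sqrt(n)); assumes sigma is known and n >= 30
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))

def one_sample_t(sample_mean, pop_mean, sample_sd, n):
    # t = (x-bar - mu) / (s / sqrt(n)); uses the sample SD, with df = n - 1
    return (sample_mean - pop_mean) / (sample_sd / math.sqrt(n))

# Hypothetical example: sample mean 103 vs. reference mean 100
z = one_sample_z(103, 100, 15, 36)   # 3 / (15/6) = 1.2
t = one_sample_t(103, 100, 12, 16)   # 3 / (12/4) = 1.0
```

The resulting z or t value is then compared against the chosen alpha level's critical value.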
Binary variable
A variable that takes on only two values (e.g., 0 = No, 1 = Yes).
EDA
A method and philosophy of data analysis, begun by John Tukey, designed to uncover information in data without interference from outlying values.
Resistance
An EDA property in which a calculation is not highly affected by outlying data values.
Re-expression
An EDA principle in which the display of data is aided by the use of nonlinear transformations, such as a logarithm or square root.
Residuals
The difference between a measurement and the value of the measurement that is predicted by some mathematical model.
Revelation
The primary goal of EDA in which one can see information carried by one's data.
Glyph
An image that communicates information without words.
Median
An average that is the middle number in an ordered set of data. The median has half the data below it and half above it.
Upper and lower hinges
An EDA term for the median of the upper/lower half of a batch of data.
Hinge spread
An EDA term that is the difference between the upper and lower hinges. The hinge spread is often called the fourth spread.
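A small sketch of the hinge and hinge-spread calculations (Python; following Tukey's convention that the median is included in both halves when the count is odd):

```python
def hinges(data):
    # Tukey's hinges: the medians of the lower and upper halves of the
    # ordered batch (the median itself belongs to both halves for odd n).
    xs = sorted(data)
    n = len(xs)
    half = (n + 1) // 2
    def med(v):
        m = len(v)
        return v[m // 2] if m % 2 else (v[m // 2 - 1] + v[m // 2]) / 2
    return med(xs[:half]), med(xs[n - half:])

low, high = hinges(range(1, 10))   # batch 1..9 -> hinges 3 and 7
spread = high - low                # hinge spread (fourth spread) = 4
```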
Stem and leaf diagram
An EDA figure that displays a distribution of data.
One-line display
A stem-and-leaf diagram in which the leaves of each stem are shown on one line.
Two-line display
A stem-and-leaf diagram in which the leaves of each stem are shown on two lines. The symbols * and . are used for leaves 0-4 and 5-9, respectively.
Five-line display
A stem-and-leaf diagram in which the leaves of each stem are shown on five lines. The symbols *, t, f, s, and . are used for leaves 0-1, 2-3, 4-5, 6-7, and 8-9, respectively.
Box plot
An EDA schematic diagram composed of a box and two lines that show the distribution of data.
Depth of a number
An EDA term to denote how far a number is in from the highest or lowest number in a batch of data. The greatest depth is the median.
Outlier
An observation that is numerically distant from the rest of the data.
Side by side stem and leaf diagram
Two stem-and-leaf diagrams placed next to each other that use a common set of stems.
Location, central tendency
Another name for average.
Spread, variation
The degree to which numbers differ from central tendency.
Frequency histogram and polygon
A frequency graph. A histogram is composed of vertical or horizontal rectangles that touch each other; a polygon connects the interval midpoints with line segments.
Bin width
The width of an interval that is used in constructing a frequency table or histogram.
Letter value display
A display in which letters (such as M for the median and H for the hinges) stand in for the summary values used to build a box plot.
Grouped data
Frequency data that are displayed in bins or intervals.
Variability
The degree to which numbers differ from central tendency.
Standard deviation
A measure of variability in which squared deviations from the mean are averaged.
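The definition above can be sketched directly (Python; this is the population form, which divides by n):

```python
import math

def standard_deviation(xs):
    # Average the squared deviations from the mean, then take the
    # square root (population form: divide by n).
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))

sd = standard_deviation([2, 4, 4, 4, 5, 5, 7, 9])   # mean is 5; SD is 2.0
```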
Trimmed mean
An average in which a certain portion of numbers are deleted from the highest and lowest ends of an ordered batch of data.
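A minimal sketch of the trimmed mean (Python; the trimming proportion applies to each end of the ordered batch):

```python
def trimmed_mean(data, proportion=0.1):
    # Drop the same share of values from each end, then average the rest.
    xs = sorted(data)
    k = int(len(xs) * proportion)
    kept = xs[k:len(xs) - k] if k else xs
    return sum(kept) / len(kept)

# One outlier (100) no longer dominates the average:
tm = trimmed_mean([1, 2, 3, 4, 100], 0.2)   # keeps [2, 3, 4] -> 3.0
```

This illustrates the EDA property of resistance: the trimmed mean is not highly affected by outlying values.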
Coded table
Table with coded values based on letter value display.
Arithmetic mean
An average in which the numbers are added and divided by the number of numbers.
Harmonic mean
A type of average that is calculated by way of the reciprocals of numbers. Specifically, the harmonic mean is defined as the reciprocal of the arithmetic mean of the reciprocals of a specified set of numbers.
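The reciprocal-of-reciprocals definition above translates directly into code (Python; the speeds in the example are made up):

```python
def harmonic_mean(xs):
    # Reciprocal of the arithmetic mean of the reciprocals.
    return len(xs) / sum(1 / x for x in xs)

# Classic use: average rate. Driving equal distances at 40 and 60 mph
# gives an average speed of 48 mph, not 50.
avg_speed = harmonic_mean([40, 60])
```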
Quadratic mean (RMS)
An average computed as the square root of the arithmetic mean of the squared numbers. It is used primarily for batches of positive and negative numbers in which zero is a reference point.
Tukey trimean
A weighted average that uses the median and hinges.
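A sketch of the trimean formula, (lower hinge + 2 x median + upper hinge) / 4 (Python; hinges follow Tukey's convention of sharing the median when the count is odd):

```python
def trimean(data):
    # Tukey trimean = (lower hinge + 2 * median + upper hinge) / 4.
    xs = sorted(data)
    n = len(xs)
    def med(v):
        m = len(v)
        return v[m // 2] if m % 2 else (v[m // 2 - 1] + v[m // 2]) / 2
    half = (n + 1) // 2
    return (med(xs[:half]) + 2 * med(xs) + med(xs[n - half:])) / 4

tm = trimean(range(1, 10))   # hinges 3 and 7, median 5 -> (3 + 10 + 7)/4 = 5.0
```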
Weighted arithmetic mean
An arithmetic mean in which all the numbers are weighted differentially.
Smoothing
An analytic method in which an underlying relationship between two variables is estimated by smoothing over the noise in the data.
Z score
A standardized score in which the mean of a data set is subtracted from a number and the difference is then divided by the standard deviation. The calculation tells one how far a number is above or below the mean in terms of standard deviations.
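The z-score calculation in code (Python; the IQ-style numbers are illustrative only):

```python
def z_score(x, mean, sd):
    # How many standard deviations x lies above (+) or below (-) the mean.
    return (x - mean) / sd

# A score of 130 on a scale with mean 100 and SD 15 is 2 SDs above the mean:
z = z_score(130, 100, 15)
```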
Sample space
The total possible outcomes in calculating a probability.
Event
Some specified occurrence for calculating a probability.
Normal distribution
In probability theory, the normal (or Gaussian) distribution is a continuous probability distribution, defined on the entire real line. It has a bell-shaped probability density function.
Poisson distribution
A discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time, space, distance, area or volume if these events occur independently with a known average rate.
Standard normal deviate
A z score of a normally distributed variable in a population.
Addition rule
When two events, A and B, are mutually exclusive, the probability that A or B will occur is the sum of the probability of each event such that P(A or B) = P(A) + P(B).
Multiplication rule
When two events, A and B, are independent, the probability that A AND B will occur is the product of the probability of each event such that P(A and B) = P(A) x P(B).
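Both rules as simple arithmetic (Python; note the addition rule in this form requires mutually exclusive events, and the multiplication rule in this form requires independent events):

```python
# Addition rule, mutually exclusive events: P(A or B) = P(A) + P(B).
# Drawing one card: it cannot be both an ace and a king.
p_ace = 4 / 52
p_king = 4 / 52
p_ace_or_king = p_ace + p_king        # 8/52

# Multiplication rule, independent events: P(A and B) = P(A) * P(B).
# Two fair coin flips do not influence each other.
p_heads = 1 / 2
p_two_heads = p_heads * p_heads       # 1/4
```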
Sensitivity
The probability of testing positive given that a subject has some condition.
Specificity
The probability of testing negative given that a subject does not have some condition.
Law of large numbers
In probability theory, the average of the results obtained from a large number of trials will converge on the expected value, and will tend to become closer as more trials are performed.
Prior probability
The probability of an event before any other relevant information is known (e.g., the probability of having TB before any other information is taken into account).
Posterior probability
A conditional probability. The probability of an event after knowledge of some relevant information is taken into account (e.g., the probability of having TB given a positive Mantoux test).
Standard normal curve
A normal distribution with a mean of 0 and a standard deviation of 1.
Parameters
A characteristic of a distribution (such as the mean) in a population.
Statistics
A characteristic of a distribution (such as the mean) in a sample.
T-scores
Used in testing, a score that reflects one's relative standing in a reference group with a particular mean and standard deviation.
Conditional probability
The probability of an event given that some condition has been met.
Positive predictive value
The probability of having a condition given that a subject tests positive.
Negative predictive value
The probability of not having a condition given that a subject tests negative.
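Sensitivity, specificity, PPV, and NPV all come from the same 2x2 table of test results against true condition status (Python; the counts below are hypothetical, made up for illustration):

```python
# Hypothetical screening-test counts:
#                 condition+   condition-
# test positive       90           30
# test negative       10          870
tp, fp, fn, tn = 90, 30, 10, 870

sensitivity = tp / (tp + fn)   # P(test+ | condition+)  = 90/100
specificity = tn / (tn + fp)   # P(test- | condition-)  = 870/900
ppv = tp / (tp + fp)           # P(condition+ | test+)  = 90/120
npv = tn / (tn + fn)           # P(condition- | test-)  = 870/880
```

Sensitivity and specificity condition on the true state; the predictive values condition on the test result, which is why they depend on how common the condition is in the sample.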
Contingency table
A table constructed with at least two factors that reveals the intersection of all their levels. A contingency table is used in factorial ANOVA and with the chi-square test for association.
Nominal scale
A classification of data in which the numbers represent categories (e.g., 1 = Male, 2 = Female, etc.).
Ordinal scale
A classification of data in which the numbers represent a variable only in terms of order (e.g., 1 = Low, 2 = Medium, 3 = High).
Interval scale
A classification of data in which the numbers represent a variable that is understood to have equal intervals of amount. In an interval scale, "0" is relative in that it does not mean the absence of quantity. The most common interval scales are temperature in Fahrenheit and Celsius.
Ratio scale
A classification of data in which the numbers represent a variable that is understood to have equal intervals of amount and a true zero point. Examples of ratio scales are time, distance, and density.
Student's t-test for independent samples
A calculation that tests the hypothesis that two sample means point to (or estimate) the same population mean. The term "independent" indicates that each subject is in one and only one sample.
Student's t-test for related samples
A calculation that tests the hypothesis that a sample mean of measurement differences points to (or estimates) a population mean of zero. The term "related" indicates that subjects act as their own control or that two different subjects have been matched or linked in some way. Two measurements are collected per subject or pair and the sample mean is computed by subtracting one from the other.
Sampling distribution for the difference of two sample means
A sampling distribution which is formed by selecting two random samples of equal size from the same population and then constructing a distribution of the differences of the means in each sample. This sampling distribution reveals what the difference between two sample means is expected to be when no treatment effects are present.
Standard error for the difference of means
The standard deviation of the differences in sample means used to form a Sampling Distribution for the Difference of Two Means.
Pooled variance
The average of the variances of two independent samples for estimating the variance of a measurement in a population.
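A sketch of the pooled variance and the standard error it feeds into (Python; weighting by degrees of freedom reduces to a simple average when the two samples are the same size):

```python
import math

def pooled_variance(s1_sq, n1, s2_sq, n2):
    # Weighted average of two sample variances, weighted by df = n - 1.
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

def se_difference(s1_sq, n1, s2_sq, n2):
    # Standard error for the difference of two independent sample means.
    sp2 = pooled_variance(s1_sq, n1, s2_sq, n2)
    return math.sqrt(sp2 * (1 / n1 + 1 / n2))

sp2 = pooled_variance(4, 10, 6, 10)   # equal n: simple average -> 5.0
se = se_difference(4, 10, 6, 10)      # sqrt(5 * (0.1 + 0.1)) = 1.0
```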
Homogeneous variance
The assumption that the variability of measurements is similar among all study groups.
Satterthwaite adjustment
The calculation that is done for an independent-samples Student's t-test when the homogeneity of variance assumption is violated.
One-tail test
A test of statistical significance in which the rival hypothesis is stated in one direction.
Two-tail test
A test of statistical significance in which the rival hypothesis is not stated in any particular direction.
Non-parametric (rank sum) tests
A class of tests of hypotheses that make no or few assumptions about the nature of a population distribution.
Mann-Whitney U test
The non-parametric analog of the Student t test for independent samples.
Rank sum tests
An alternate name for those non-parametric tests that rank the data and then analyze the ranks.