Probability of a Type I error
also called significance level
Alternative Hypothesis
The hypothesis stating what the research is seeking evidence of
a statement of inequality
it can be written looking for the difference or change in one direction from the null or in both (<, >, ≠)
Bar Chart
a graphical display used with categorical variables
frequencies are shown in vertical bars NOT touching
relationship between or among variables
used to describe the normal distribution
resembles a hill or mound
distribution that is symmetric & unimodal
*the probability of a Type II error
*in the linreg it is the slope
closely related to power
Biased Statistic
A sampling method is biased if it tends to produce samples that do not represent the population
systematic deviation from the parameter caused by systematically favoring some outcomes over others
a distribution of data with two clear peaks
Binomial Distribution
the probability distribution of a binomial random variable
Binomial Random Variable
a random variable X that counts the successes in a setting that meets the 4 binomial assumptions
Simple Random Sample
a sample of n individuals that are selected from a population in a way that every possible combination of n individuals is equally likely
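A minimal sketch of drawing a simple random sample, assuming a hypothetical population of 20 individuals labelled 1 to 20. The function name `simple_random_sample` is made up for illustration; `random.sample` selects without replacement, so every group of n individuals is equally likely.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n individuals without replacement; every subset of size n is equally likely."""
    rng = random.Random(seed)  # seed is optional, used here only for repeatability
    return rng.sample(population, n)

population = list(range(1, 21))  # hypothetical population labelled 1..20
sample = simple_random_sample(population, 5, seed=1)
print(sample)
```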
Influential Outlier
an extreme value whose removal would drastically change the slope of the least-squares regression line.
Binomial Distribution Assumptions
1. A fixed number of trials of a random phenomenon, n
2. Each trial has only two possible outcomes, success and failure
3. Probability of success is constant for each trial
4. Each trial is independent
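When the four assumptions above hold, the probability of exactly k successes is C(n, k) p^k (1 - p)^(n - k). The sketch below assumes a hypothetical setting of 10 fair coin flips (n = 10, p = 0.5) and also checks the standard result that the mean of a binomial random variable is n * p.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable X with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5  # hypothetical: 10 flips of a fair coin
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)                                 # probabilities over k = 0..n should sum to 1
mean = sum(k * pk for k, pk in enumerate(pmf))   # should match n * p

print(round(total, 6), round(mean, 6))
```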
Bivariate Data
consists of two variables, an explanatory and a response variable
usually quantitative
practice of denying subjects knowledge of which treatment is imposed upon them; cuts down on bias
subgroups of the experimental units that are separated by some characteristic before treatments are assigned because they may respond differently to treatments [men and women]
Boxplot/Box & Whisker Plot
graphical display of the five number summary of a set of data, shows outliers
Categorical Variable
recorded as labels, names, or other non-numerical outcomes
a study that observes, or attempts to observe, every individual in a population; the ideal when it is feasible
Central Limit Theorem
as the size n of an SRS increases, the shape of the sampling distribution of x-bar tends toward a normal distribution
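A rough simulation of this idea, under stated assumptions: the population is taken to be right-skewed (exponential with mean 1), draws are independent rather than an SRS from a finite population, and skewness near 0 is used as a stand-in for "looks normal". The helper names are made up for illustration.

```python
import random
import statistics

def sample_means(n, reps, rng):
    """Return `reps` sample means, each from n draws of a right-skewed population."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

def skewness(data):
    """Population skewness: about 0 for a symmetric, normal-looking shape."""
    m = statistics.fmean(data)
    s = statistics.pstdev(data)
    return statistics.fmean([((x - m) / s) ** 3 for x in data])

rng = random.Random(0)  # fixed seed so the run is repeatable
skew_by_n = {n: skewness(sample_means(n, 5000, rng)) for n in (2, 30)}

for n, skew in skew_by_n.items():
    print(f"n={n}: skewness of sample means = {skew:.2f}")
```

The skewness of the sample means should be noticeably smaller at n = 30 than at n = 2, which is the sampling distribution moving toward a normal shape.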
Chance Device
a mechanism used to determine random outcomes [dice, coins, spinner, etc.]
Cluster Sample
a sample in which an SRS of heterogeneous subgroups of a population is selected
heterogeneous subgroups of a population
Coefficient of Determination
r^2: the percent of variation in the y variable that can be explained by its linear relationship with the x variable
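A small worked sketch of r^2, assuming five made-up (x, y) points. It fits the least-squares line, takes each residual (observed y minus predicted y-hat), and compares the leftover variation with the total variation in y.

```python
def r_squared(xs, ys):
    """Coefficient of determination for the least-squares line of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)   # total variation in y
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # sum of squared residuals: variation the line does NOT explain
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - sse / syy

xs = [1, 2, 3, 4, 5]   # hypothetical explanatory values
ys = [2, 4, 5, 4, 5]   # hypothetical response values
r2 = r_squared(xs, ys)
print(round(r2, 3))
```

For these points r^2 works out to 0.6, so 60% of the variation in y is explained by the linear relationship with x.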
1 - P(A), 1 - P(B): the probability that an event does not occur
Complementary Events
two events whose probabilities add up to one
Conditional Frequencies
relative frequencies for each cell in a two way table relative to one variable
Conditional Probability
the probability of an event occurring given that another has occurred (A given B)
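A short sketch of a conditional probability read from a two-way table, using P(A given B) = P(A and B) / P(B). The counts and category names below are made up for illustration.

```python
# hypothetical two-way table of counts: (row category, column category) -> count
table = {
    ("female", "tea"): 30, ("female", "coffee"): 20,
    ("male", "tea"): 10, ("male", "coffee"): 40,
}

def conditional_probability(table, a, b):
    """P(column outcome a | row outcome b), computed from cell counts."""
    joint = table[(b, a)]                                                 # count in the (b, a) cell
    given_total = sum(c for (row, _), c in table.items() if row == b)     # total for row b
    return joint / given_total

p_tea_given_female = conditional_probability(table, "tea", "female")  # 30 / 50
print(p_tea_given_female)
```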
Confidence Intervals
give an estimated range that is likely to contain an unknown population parameter
Confidence Level
the level of certainty that the method used produces an interval containing the population parameter
the situation where the effects of two or more explanatory variables on the response variable cannot be separated
Confounding Variable
a variable whose effect on the response variable cannot be untangled from the effects of the treatment
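One common case, sketched under assumptions: a large-sample z-interval for a population proportion, p-hat ± z* sqrt(p-hat(1 - p-hat)/n). The data (120 successes in 200 trials) and the function name are hypothetical, and the usual large-sample conditions are assumed to hold.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Large-sample z-interval for a population proportion."""
    p_hat = successes / n
    # critical value z* for the chosen confidence level (about 1.96 for 95%)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_interval(successes=120, n=200)  # hypothetical sample
print(round(low, 3), round(high, 3))
```

Raising the confidence level widens the interval, and a larger n narrows it.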
Continuous Random Variables
found by measuring [heights]
Control Group
a baseline group that may be given no treatment, a faux treatment [placebo] or an accepted treatment that is to be compared to another treatment
Five Number Summary
used with median values: minimum value, first quartile, median, third quartile, and maximum value of data sets
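A minimal sketch of computing the five number summary, assuming the quartile convention in which Q1 and Q3 are the medians of the lower and upper halves (leaving out the overall median when n is odd). Calculators and software may use slightly different quartile rules. The data set is made up, and the function expects at least two values.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for two or more values."""
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]
    # skip the middle value itself when the count is odd
    upper = xs[half + 1:] if len(xs) % 2 else xs[half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

summary = five_number_summary([1, 3, 5, 7, 9, 11, 13])  # hypothetical data
print(summary)
```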
Frequency Table
a display organizing categorical or numerical data and how often each occurs
the middle value of a data set, the equal areas point, where 50% of the data are at or below this value & 50% are at or above it. Resistant to outliers and works with the five number summary
the smallest number in the data set
Observed value minus the expected value of the response variable (y-yhat)
Null Hypothesis
Hypothesis of no difference, no change, and no association: a statement of equality usually written in the form of: parameter = hypothesized value
Normal Distribution
a continuous probability distribution that appears in many situations, both natural and man made [bell-shaped, area under the curve is always 1]
Nonresponse Bias
the situation where an individual selected to be in the sample is unwilling or unable to provide data
Mean of a Binomial Random Variable X
Correlation Coefficient
r: a measure of the strength of a linear relationship [LinReg, diagnostics on]