68 Cards in this Set
Descriptive Statistics
|
deals with methods of organizing, summarizing, and presenting data in a convenient and informative way.
|
|
Inferential statistics
|
is a body of methods used to draw conclusions or inferences about characteristics of populations based on sample data.
|
|
population
|
the group of all items of interest to a statistics practitioner
|
|
Parameter
|
A descriptive measure of a population. In most applications of inferential statistics the parameter represents information we need.
|
|
Sample
|
a set of data drawn from the studied population.
|
|
Statistic
|
A descriptive measure of a sample.
|
|
Statistical inference
|
the process of making an estimate, prediction, or decision about a population based on sample data.
|
|
confidence level
|
the proportion of times that an estimating procedure will be correct.
|
|
Significance level
|
measures how frequently the conclusion will be wrong.
|
|
Mean
|
the average: the sum of the observations divided by the number of observations. If you don't know this, go back to middle school.
|
|
Sample mean is denoted...
Population mean is denoted... |
"X" with a bar over it.
"mu" (greek letter) |
|
Median
|
Middle number
|
|
Mode
|
most common observation
|
|
Range=
|
Largest observation - Smallest observation
|
|
Variance and standard deviation
|
both used to measure variability
|
|
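Both measures are in Python's standard library; this is a quick sketch on made-up data. Note that `variance`/`stdev` use the sample divisor (n - 1), while `pvariance`/`pstdev` use the population divisor (n).

```python
import statistics

data = [4, 7, 9, 12, 13]  # hypothetical sample

# Sample variance and standard deviation (divide by n - 1)
s2 = statistics.variance(data)
s = statistics.stdev(data)

# Population variance and standard deviation (divide by n)
sigma2 = statistics.pvariance(data)
sigma = statistics.pstdev(data)
```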
Population variance
Sample variance |
is represented by lower case "sigma" squared.
is represented by "s" squared. |
|
Population standard deviation
Sample standard deviation |
is represented by lower case "sigma"
is represented by "s" |
|
Empirical rule
|
1. Approximately 68% of all observations fall within one standard deviation of the mean.
2. Approximately 95% of all observations fall within two standard deviations of the mean.
3. Approximately 99.7% of all observations fall within three standard deviations of the mean. |
|
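For a perfectly normal distribution these percentages can be verified exactly with `statistics.NormalDist` from the Python standard library; a sketch, not part of the card:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

def within(k):
    """Probability that a normal observation falls within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

print(round(within(1), 4))  # approximately 68%
print(round(within(2), 4))  # approximately 95%
print(round(within(3), 4))  # approximately 99.7%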
Chebysheff's Theorem
|
The proportion of observations in any sample or population that lie within k standard deviations of the mean is at least...
1 - (1/(k^2)) for k > 1. With k = 2, the theorem states that at least three-quarters (75%) of all observations lie within two standard deviations of the mean. With k = 3, it states that at least eight-ninths (88.9%) of all observations lie within three standard deviations of the mean. |
|
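The bound is a one-liner; a sketch reproducing the two worked cases above:

```python
def chebysheff_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean (k > 1)."""
    return 1 - 1 / k**2

print(chebysheff_bound(2))  # at least 0.75
print(chebysheff_bound(3))  # at least 8/9, about 0.889
```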
Coefficient of Variation
|
of a set of observations is the standard deviation of the observations divided by their mean.
|
|
Population coefficient of Variation...
Sample coefficient of variation... |
CV= (lower case 'sigma')/'mu'
cv= s/(bar x) |
|
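A minimal sketch of the sample version, on hypothetical data:

```python
import statistics

data = [12, 15, 14, 10, 19]  # hypothetical sample

xbar = statistics.mean(data)
s = statistics.stdev(data)  # sample standard deviation
cv = s / xbar               # sample coefficient of variation
```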
Percentile
|
The Pth percentile is the value for which P percent of the observations are less than that value and (100 - P)% are greater than that value.
|
|
Location of a Percentile
|
=(n+1)P/100
|
|
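The location formula as a sketch; it gives a 1-based position in the sorted data, and a fractional result means the percentile lies between two observations.

```python
def percentile_location(n, p):
    """1-based location of the pth percentile in a sorted sample of n observations."""
    return (n + 1) * p / 100

# e.g. with n = 10 observations, the 50th percentile sits at position 5.5,
# i.e. halfway between the 5th and 6th sorted values.
print(percentile_location(10, 50))
```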
Quartiles
|
25th, 50th, and 75th percentiles
|
|
Interquartile Range
|
=Q3-Q1
|
|
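A sketch on made-up data, using the (n + 1)P/100 location rule from the earlier card with linear interpolation between neighbouring observations (textbooks and libraries differ on the exact interpolation method, so treat this as one convention):

```python
def percentile(sorted_data, p):
    """pth percentile via the (n + 1)p/100 location rule with linear interpolation."""
    n = len(sorted_data)
    loc = (n + 1) * p / 100
    lo = int(loc)          # whole part of the location
    frac = loc - lo        # fractional part, used to interpolate
    lo = max(1, min(lo, n))
    hi = min(lo + 1, n)
    return sorted_data[lo - 1] + frac * (sorted_data[hi - 1] - sorted_data[lo - 1])

data = sorted([5, 8, 2, 9, 11, 7, 3, 12])  # hypothetical observations
q1 = percentile(data, 25)
q3 = percentile(data, 75)
iqr = q3 - q1
```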
Box plot
|
This technique graphs five statistics: the minimum and maximum observations, and the first, second, and third quartiles.
|
|
Outliers
|
unusually large or small observations for which the validity of the datum is suspect.
|
|
Covariance
|
measures how two variables vary together. Sample covariance: s(sub)xy = [sum of (x_i - x bar)(y_i - y bar)]/(n - 1). See page 127 in the book for the formula and the graphs on page 130.
|
|
Coefficient of correlation
|
the covariance divided by the standard deviations of the variables. Check out the graphs and other info on page 130.
|
|
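A sketch of both measures on made-up data (here y moves in lockstep with x, so the correlation comes out at exactly 1):

```python
import statistics

x = [2, 4, 6, 8]  # hypothetical data
y = [1, 3, 5, 7]

n = len(x)
xbar = statistics.mean(x)
ybar = statistics.mean(y)

# Sample covariance: s_xy = sum((x_i - xbar)(y_i - ybar)) / (n - 1)
s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)

# Coefficient of correlation: covariance divided by both standard deviations
r = s_xy / (statistics.stdev(x) * statistics.stdev(y))
```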
Least squares method
|
produces a straight line drawn through the points so that the sum of squared deviations between the points and the line is minimized. The line is represented by... y (with a hat on it) = b(sub)0 + b(sub)1 x
see page 132 |
|
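The coefficients have closed forms: the slope is the sample covariance divided by the sample variance of x, and the intercept follows from the means. A sketch on hypothetical data:

```python
import statistics

x = [1, 2, 3, 4, 5]   # hypothetical data
y = [2, 5, 4, 8, 11]

n = len(x)
xbar, ybar = statistics.mean(x), statistics.mean(y)

# Sample covariance of x and y, and sample variance of x
s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)
s_x2 = statistics.variance(x)

b1 = s_xy / s_x2        # slope
b0 = ybar - b1 * xbar   # intercept

def predict(x_new):
    """y-hat = b0 + b1 * x"""
    return b0 + b1 * x_new
```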
Coefficient of determination
|
calculated by squaring the coefficient of correlation. This measures the amount of variation in the dependent variable that is explained by the variation in the independent variable. More info on page 139.
|
|
Don't forget that Correlation is NOT the same as Causation.
|
Just don't do it.
|
|
Direct Observation
|
Simplest way to collect data... relatively inexpensive
|
|
Experimental...
|
a more expensive, but better way to produce data through experiments.
|
|
Surveys
|
we know what this is.
|
|
Response rate
|
the proportion of all selected people who complete the survey.
|
|
target population
|
the population about which we want to draw inferences
|
|
sampled population
|
the actual population from which the sample has been taken.
|
|
simple random sample
|
is a sample selected in such a way that every possible sample with the same number of observations is equally likely to be chosen.
|
|
Stratified random sample
|
is obtained by separating the population into mutually exclusive sets, or strata, and then drawing simple random samples from each stratum.
|
|
Cluster sample
|
is a simple random sample of groups or clusters of elements.
|
|
Sampling error
|
refers to differences between the sample and the population that exist only because of the observations that happened to be selected for the sample.
|
|
Non-sampling error
|
results from mistakes made in the acquisition of data or from the sample observations being selected improperly (errors in data acquisition, nonresponse error, selection bias).
|
|
Random experiment
|
is an action or process that leads to one of several possible outcomes.
|
|
Sample space
|
of a random experiment is a list of all possible outcomes of the experiment. The outcomes must be exhaustive and mutually exclusive.
|
|
event
|
is a collection or set of one or more simple events in a sample space.
|
|
Probability of an event
|
is the sum of the probabilities of the simple events that constitute the event.
|
|
Intersection of Events A and B
|
is the event that occurs when both A and B occur. It is denoted as
A and B. The probability of the intersection is called the 'joint probability'. |
|
Marginal probability
|
computed by adding across rows or down columns; so named because they are calculated in the margins of the table.
|
|
Conditional probability
|
The probability of event A given event B is...
P(A | B) = P(A and B)/P(B) |
|
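The formula is a single division; a sketch with hypothetical probabilities:

```python
# Hypothetical probabilities for events A and B
p_a_and_b = 0.12  # joint probability P(A and B)
p_b = 0.40        # P(B)

# Conditional probability: P(A | B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b
```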
Union of Events A and B
|
Union of events A and B is the event that occurs when either A or B or both
|
|
Complement
|
of event A is the event that occurs when event A does not occur. Denoted by A^C.
|
|
Complement Rule
|
P(A^C)=1-P(A)
for any event A. |
|
Multiplication Rule
|
is used to calculate the joint probability of two events. It is based on the formula for conditional probability. That is, from the formula...
P(A | B) = P(A and B)/P(B), we derive the multiplication formula simply by multiplying both sides by P(B). So the joint probability of any two events A and B is P(A and B) = P(B)P(A | B) or, altering the notation, P(A and B) = P(A)P(B | A) |
|
Multiplication rule for Independent Events
|
The joint probability of any two independent events A and B is
P(A and B)=P(A)P(B). |
|
Addition Rule
|
enables us to calculate the probability of the union of two events.
The probability that event A, or event B, or both occur is P(A or B) = P(A) + P(B) - P(A and B). Check out pages 193 and 194 to see why we subtract the joint probability from the sum of the probabilities of A and B. |
|
Addition Rule for mutually exclusive Events
|
The probability of the union of two mutually exclusive events A and B is P(A or B) = P(A) + P(B)
|
|
Bayes's Law
|
used when we witness a particular event and we need to compute the probability of one of its possible causes. see pages 199-201.
|
|
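A sketch of the reasoning with hypothetical numbers (all three input probabilities are made up): we observe an effect E and want the probability of a possible cause C, combining the total probability of E with the conditional-probability formula.

```python
# Hypothetical probabilities
p_cause = 0.01                 # P(C): prior probability of the cause
p_effect_given_cause = 0.95    # P(E | C)
p_effect_given_not = 0.05      # P(E | not C)

# Total probability of observing the effect
p_effect = (p_effect_given_cause * p_cause
            + p_effect_given_not * (1 - p_cause))

# Bayes's Law: P(C | E) = P(E | C) P(C) / P(E)
p_cause_given_effect = p_effect_given_cause * p_cause / p_effect
```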
Random variable
|
is a function or rule that assigns a number to each outcome of an experiment
|
|
discrete random variable
|
is one that can take on a countable number of values. For example, if we define X as the number of heads observed in an experiment that flips a coin 10 times, then the values of X are 0, 1, 2, ..., 10. The variable X can assume a total of 11 values. Obviously, we counted the number of values; hence, X is discrete.
|
|
Continuous random variable
|
is one whose values are uncountable... for example, the amount of time needed to complete a task.
|
|
Probability distribution
|
is a table, formula or graph that describes the values of a random variable and the probability associated with these values. page 219
|
|
For expected value, Population Mean, Population Variance,
|
see page 223
|
|
Laws of expected value and variance
|
we often create new variables that are functions of other random variables. We use the laws of expected value and variance to quickly determine the expected value and variance of these new variables. Check out page 225.
|
|
bivariate distributions, covariance
|
provides probabilities of combinations of two variables... see pages 231-233.
|
|
Binomial experiment
|
consists of a fixed number of trials.
1. We represent the number of trials by n.
2. Each trial has two possible outcomes. We label one outcome a success and the other a failure.
3. The probability of success is p. The probability of failure is 1 - p.
4. The trials are independent, which means that the outcome of one trial does not affect the outcome of any other trial.
If properties 2, 3, and 4 are satisfied, we say that each trial is a "Bernoulli process." The random variable of a binomial experiment is defined as the number of successes in n trials. It is called the binomial random variable. |
|
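The binomial probability P(X = x) = C(n, x) p^x (1 - p)^(n - x) is easy to sketch with the standard library's `math.comb`:

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x): probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# e.g. the probability of exactly 5 heads in 10 fair coin flips
print(binomial_pmf(5, 10, 0.5))
```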
Poisson distribution
|
the Poisson random variable is the number of occurrences of events, which we'll continue to call successes. The difference between the binomial and Poisson random variables is that a binomial random variable is the number of successes in a set of trials, whereas a Poisson random variable is the number of successes in an interval of time or specific region of space.
Examples of Poisson random variables:
1. The number of cars arriving at a service station in 1 hour (the interval of time is 1 hour).
2. The number of flaws in a bolt of cloth (the specific region is a bolt of cloth).
3. The number of accidents in 1 day on a particular stretch of highway (the interval is defined both by time and by space). |
|
Poisson probability distribution
|
the probability that a Poisson random variable assumes a value of x in a specific interval is...
P(X = x) = (e^(-mu) mu^x)/x!, where mu is the mean number of successes in the interval. See page 252. |
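The standard Poisson probability P(X = x) = e^(-mu) mu^x / x! can be sketched directly from math-module primitives:

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) = e**(-mu) * mu**x / x!, where mu is the mean number of successes."""
    return exp(-mu) * mu**x / factorial(x)

# e.g. probability of exactly 3 cars arriving in 1 hour when the mean is 2 per hour
print(poisson_pmf(3, 2))
```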