124 Cards in this Set
- Front
- Back
What is a probability distribution for a discrete random variable?
|
A mutually exclusive listing of all possible numerical outcomes for that variable such that a particular probability of occurrence is associated with each outcome.
|
|
What is the mean of a probability distribution?
|
the expected value of its random variable
|
|
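As a small illustration of the two cards above, the sketch below computes the mean (expected value) of a discrete probability distribution. The distribution itself is a made-up example (interruptions per day), not data from the card set.

```python
# Hypothetical distribution: number of interruptions per day -> probability.
# The probabilities are invented for illustration and must sum to 1.
distribution = {0: 0.35, 1: 0.25, 2: 0.20, 3: 0.10, 4: 0.05, 5: 0.05}


def expected_value(dist):
    """Mean of a discrete distribution: sum of outcome * probability."""
    return sum(x * p for x, p in dist.items())


total_probability = sum(distribution.values())
mean = expected_value(distribution)
print(f"total probability = {total_probability:.2f}, mean = {mean:.2f}")
```

Each outcome is weighted by its probability, so rare outcomes contribute little to the mean.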
What is the covariance?
|
Measure of the strength of the linear relationship between two discrete random variables, X and Y.
|
|
What is the expected sum of two random variables?
|
it is equal to the sum of the expected values
|
|
What is the variance of the sum of two random variables?
|
equal to the sum of the variances plus twice the covariance
|
|
What is the standard deviation of the sum of two random variables?
|
the square root of the variance of the sum of two random variables
|
|
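The rules on the cards above (expected sum, variance of the sum, standard deviation of the sum) can be checked numerically. This is a sketch that assumes a small, invented joint distribution of X and Y; it compares the rule-based results with a direct calculation on X + Y.

```python
import math

# Hypothetical joint distribution: (x, y, probability). Values are invented.
joint = [(-100, 200, 0.2), (100, 50, 0.5), (250, -100, 0.3)]

mean_x = sum(x * p for x, _, p in joint)
mean_y = sum(y * p for _, y, p in joint)
var_x = sum((x - mean_x) ** 2 * p for x, _, p in joint)
var_y = sum((y - mean_y) ** 2 * p for _, y, p in joint)
cov_xy = sum((x - mean_x) * (y - mean_y) * p for x, y, p in joint)

# Rules from the cards.
mean_sum = mean_x + mean_y                # E(X + Y) = E(X) + E(Y)
var_sum = var_x + var_y + 2 * cov_xy      # Var(X + Y)
sd_sum = math.sqrt(var_sum)               # standard deviation of X + Y

# Direct calculation on the summed outcomes, for comparison.
direct_mean = sum((x + y) * p for x, y, p in joint)
direct_var = sum((x + y - direct_mean) ** 2 * p for x, y, p in joint)

print(mean_sum, direct_mean)
print(var_sum, direct_var)
```

In this example the covariance is negative, so the variance of the sum is smaller than the sum of the two variances.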
What is the rule of combinations?
|
the number of ways of selecting X objects from n objects, irrespective of sequence
|
|
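A minimal sketch of the rule of combinations, n! / (X! (n - X)!), checked against Python's built-in `math.comb`. The function name `combinations_count` is just an illustrative choice.

```python
import math


def combinations_count(n, x):
    """Number of ways to select x objects from n, irrespective of order."""
    return math.factorial(n) // (math.factorial(x) * math.factorial(n - x))


# Selecting 2 objects out of 4: AB, AC, AD, BC, BD, CD.
ways = combinations_count(4, 2)
print(ways, math.comb(4, 2))
```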
What is the mean of the binomial distribution?
|
equal to the sample size, n, multiplied by the probability of success
|
|
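To see why the binomial mean equals n times the probability of success, the sketch below builds the full binomial distribution and computes its mean directly. The values n = 4 and p = 0.1 are arbitrary assumptions for the example.

```python
import math


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial variable with n observations and success probability p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 4, 0.1  # hypothetical sample size and probability of success
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean_from_pmf = sum(x * prob for x, prob in enumerate(pmf))
print(f"mean from distribution = {mean_from_pmf:.4f}, n * p = {n * p:.4f}")
```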
What is the area of opportunity?
|
a continuous unit or interval of time, volume, or such area in which more than one occurrence of an event can occur
|
|
What is the Poisson distribution?
|
a discrete probability distribution that expresses the probability of a number of events occurring in a fixed period of time if these events occur with a known average rate and independently of the time since the last event.
|
|
When can you use the Poisson distribution?
|
(1) you are counting the number of times a particular event occurs in a given area of opportunity; (2) the probability that an event occurs in a given area of opportunity is the same for all areas of opportunity; (3) the number of events in one area of opportunity is independent of the number of events in any other area of opportunity; (4) the probability that two or more events occur in an area of opportunity approaches zero as the area of opportunity becomes smaller
|
|
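A short sketch of the Poisson probability formula, P(X = x) = e^(-λ) λ^x / x!, where λ is the average number of events per area of opportunity. The rate of 3 events per interval is an assumed value for illustration.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) when events occur at an average rate of lam per area of opportunity."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


lam = 3.0  # hypothetical average number of events per interval
p_exactly_two = poisson_pmf(2, lam)
p_at_most_two = sum(poisson_pmf(x, lam) for x in range(3))
print(f"P(X = 2) = {p_exactly_two:.4f}, P(X <= 2) = {p_at_most_two:.4f}")
```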
What is the hypergeometric distribution?
|
discrete probability distribution that describes the number of successes in a sequence of n draws from a finite population without replacement
|
|
What is the difference between a binomial and a hypergeometric distribution?
|
Binomial: sample data are selected with replacement from a finite population or without replacement from an infinite population. Hypergeometric: sample data are selected without replacement from a finite population.
|
|
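The difference between the two distributions shows up in the numbers. The sketch below compares the hypergeometric probability (no replacement) with the binomial probability (replacement) for the same population; the population of 30 items with 20 successes and the sample of 4 are assumed values.

```python
import math


def hypergeom_pmf(x, n, successes, population):
    """P(x successes in n draws without replacement from a finite population)."""
    return (math.comb(successes, x)
            * math.comb(population - successes, n - x)
            / math.comb(population, n))


def binomial_pmf(x, n, p):
    """P(x successes in n draws with replacement, success probability p)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


population, successes, n = 30, 20, 4  # hypothetical values
hyper = hypergeom_pmf(2, n, successes, population)
binom = binomial_pmf(2, n, successes / population)
print(f"without replacement: {hyper:.4f}, with replacement: {binom:.4f}")
```

The two results differ because removing an item changes the probability of success on later draws; the gap shrinks as the population grows relative to the sample.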
What is a continuous probability density function?
|
mathematical expression that defines the distribution of the values for a continuous random variable
|
|
What is a normal distribution?
|
(Gaussian distribution) most common continuous distribution used in stats
|
|
3 reasons normal distribution is important?
|
(1) numerous continuous variables common in business have distributions that closely resemble the normal distribution; (2) the normal distribution can be used to approximate various discrete probability distributions; (3) the normal distribution provides the basis for classical statistical inference because of its relationship to the central limit theorem
|
|
What is the normal probability density function?
|
mathematical expression representing a continuous probability density function - denoted f(X)
|
|
What is the transformation formula?
|
the Z value is equal to the difference between X and the mean, divided by the standard deviation: Z = (X - mean) / standard deviation
|
|
What is the cumulative standardized normal distribution?
|
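gives the cumulative probability P(Z <= z), the area under the standardized normal curve (mean 0, standard deviation 1) to the left of a given Z value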
|
|
What is an X value?
|
equal to the mean plus the product of the Z value and the standard deviation.
|
|
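The transformation formula, the X value formula, and the cumulative standardized normal distribution fit together as in the sketch below. The mean of 7 and standard deviation of 2 are assumed values; `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist


def z_value(x, mean, sd):
    """Transformation formula: Z = (X - mean) / standard deviation."""
    return (x - mean) / sd


def x_value(z, mean, sd):
    """Inverse of the transformation: X = mean + Z * standard deviation."""
    return mean + z * sd


mean, sd = 7.0, 2.0                # hypothetical population mean and standard deviation
z = z_value(9.0, mean, sd)         # one standard deviation above the mean
x_back = x_value(z, mean, sd)      # recovers the original X value
p_below = NormalDist().cdf(z)      # cumulative standardized normal probability P(Z <= z)
print(z, x_back, round(p_below, 4))
```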
What is a normal probability plot?
|
graphical approach for evaluating whether data are normally distributed
|
|
What is quantile-quantile plot?
|
transform each ordered data value to a Z value, then plot the data values against the Z values
|
|
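A sketch of how the points of a quantile-quantile plot could be computed. It assumes the common convention of assigning the i-th ordered value (out of n) a cumulative area of i / (n + 1); other conventions exist, and the data values are invented.

```python
from statistics import NormalDist


def normal_quantiles(n):
    """Z values for ordered positions 1..n, assuming cumulative areas i / (n + 1)."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]


data = [12.1, 9.8, 10.5, 11.2, 10.9]   # hypothetical sample
ordered = sorted(data)
z_values = normal_quantiles(len(ordered))

# Each pair is one point on the plot: (Z value, data value).
pairs = list(zip(z_values, ordered))
for z, value in pairs:
    print(f"{z:+.3f}  {value}")
```

If the data are approximately normally distributed, these points fall close to a straight line when plotted.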
What is a uniform distribution?
|
a distribution in which a value has the same probability of occurrence anywhere in the range between the smallest value, a, and the largest value, b. Sometimes called the rectangular distribution.
|
|
What is the exponential distribution?
|
continuous distribution that is right-skewed and ranges from zero to positive infinity
|
|
What is a frame?
|
a listing of items that make up the population; they are data sources such as population lists, directories, or maps (samples are drawn from frames)
|
|
What is a nonprobability sample?
|
when you select the items or individuals without knowing their probabilities of selection
|
|
What is convenience sampling?
|
Type of nonprobability sampling - items are selected based only on the fact that they are easy, inexpensive, or convenient to sample
|
|
What is a judgment sample?
|
you get the opinions of preselected experts in the subject matter
|
|
What is a probability sample?
|
you select the items based on known probabilities
|
|
What is a simple random sample?
|
every item from the frame has the same chance of selection as every other item
|
|
What is sampling with replacement?
|
After you select an item, you return it to the frame where it has the same probability of being selected again
|
|
What is sampling without replacement?
|
Once you select an item, you cannot select it again
|
|
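The two sampling schemes above map onto two functions in Python's standard `random` module. This is a sketch with an invented frame of ten item IDs; the seed is fixed only so the run is repeatable.

```python
import random

frame = list(range(1, 11))   # hypothetical frame of 10 item IDs
rng = random.Random(42)      # seeded generator, for a repeatable illustration

# With replacement: each selected item is returned to the frame, so repeats are possible.
with_replacement = rng.choices(frame, k=5)

# Without replacement: an item can be selected at most once.
without_replacement = rng.sample(frame, k=5)

print(with_replacement)
print(without_replacement)
```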
What is a table of random numbers?
|
consists of a series of digits listed in a randomly generated sequence
|
|
What does a table of random numbers do?
|
Because the numeric system uses 10 digits, the chance that you will randomly generate any particular digit is equal to the probability of generating any other digit.
|
|
Systematic sample
|
partition the N items in the frame into n groups of k items, where k = N/n
|
|
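One way to sketch a systematic sample: compute k = N/n, pick a random starting item within the first k items, then take every k-th item after it. Rounding k down and the frame of 40 items are assumptions made for this example.

```python
import random


def systematic_sample(frame, n, rng=random):
    """Select n items by taking every k-th item after a random start, k = N // n."""
    k = len(frame) // n          # group size (N / n, rounded down)
    start = rng.randrange(k)     # random position within the first group
    return [frame[start + i * k] for i in range(n)]


frame = list(range(1, 41))       # hypothetical frame of N = 40 items
sample = systematic_sample(frame, 4, random.Random(7))
print(sample)
```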
Least efficient sampling methods
|
simple random sampling and systematic sampling
|
|
Stratified sample
|
subdivide the N items in the frame into separate subpopulations, or strata
|
|
What is a stratum?
|
a subpopulation defined by some common characteristic, such as gender or year in school
|
|
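A rough sketch of stratified sampling: group the frame into strata by a common characteristic, then draw a simple random sample from each stratum. Sampling the same fraction from every stratum (proportional allocation) is an assumption of this example, as are the student records.

```python
import random
from collections import defaultdict


def stratified_sample(frame, stratum_of, fraction, rng):
    """Draw the same fraction from each stratum, without replacement."""
    strata = defaultdict(list)
    for item in frame:
        strata[stratum_of(item)].append(item)
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical frame: (student ID, year in school).
frame = [(i, "freshman") for i in range(60)] + [(i, "senior") for i in range(60, 100)]
sample = stratified_sample(frame, lambda student: student[1], 0.1, random.Random(3))
counts = {year: sum(1 for _, y in sample if y == year) for year in ("freshman", "senior")}
print(counts)
```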
What is a cluster sample?
|
divide N items in the frame into several clusters so that each cluster is representative of the entire population - more cost effective but requires larger sample size
|
|
What is a cluster?
|
naturally occurring designations, such as counties, election districts, city blocks, households, or sales territories
|
|
4 types of survey errors
|
coverage; nonresponse; sampling; measurement
|
|
What is a coverage error?
|
occurs if certain groups of items are excluded from the frame so that they have no chance of being selected in the sample
|
|
What is a selection bias?
|
if the frame is inadequate because certain groups of items in the population were not properly included
|
|
What is a cluster sample?
|
divide N items in the frame into several clusters so each cluster is representative of the entire population
|
|
What is a cluster?
|
naturally occurring designations, such as counties, election districts, city blocks, households, etc.
|
|
Pros/cons with cluster samples?
|
more cost effective but require larger sample size
|
|
4 types of survey errors
|
coverage; nonresponse; sampling; measurement
|
|
What is a coverage error?
|
certain groups of items are excluded from the frame, so they have no chance of being selected in the sample
|
|
What is a selection bias?
|
frame is inadequate because certain groups of items in the population were not properly included.
|
|
What is a nonresponse error?
|
failure to collect data on all items in the sample
|
|
What is a nonresponse bias?
|
bias that results when certain groups in the sample do not respond and are therefore left out of the results
|
|
How to avoid nonresponse error
|
follow up with people who did not respond; use the best mode of response (e.g., telephone rather than mail)
|
|
What is a sampling error?
|
reflects the variation from sample to sample
|
|
3 sources of measurement error
|
ambiguous wording of questions
the halo effect
respondent error (you can call back individuals with unusual responses and establish a program of random callbacks)
|
|
Ethical issues
|
coverage: purposeful exclusion of particular individuals
nonresponse: sponsor knowingly designs the survey so that particular groups are less likely to respond
sampling error: findings are purposely presented without reference to sample size and margin of error
measurement: sponsor uses leading questions, creates a halo effect, or provides false information
|
|
Ethical issues in nonprobability sampling methods
|
arise when a nonprobability sample is used to form conclusions about an entire population - the report must state that the results cannot be generalized beyond the sample
|
|
What is a sampling distribution?
|
distribution of the results if you actually selected all possible samples
|
|
What is the sampling distribution of the mean?
|
distribution of all possible sample means if you select all possible samples of a certain size
|
|
What is the population mean?
|
sum of the values in the population divided by the population size, N.
|
|
What is the standard error of the mean?
|
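the standard deviation of the sampling distribution of the mean, equal to the population standard deviation divided by the square root of the sample size; it measures how the sample means vary from sample to sample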
|
|
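A minimal sketch of the standard error of the mean, computed as the population standard deviation divided by the square root of the sample size. The values 15 and 25 are assumed for illustration.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: population standard deviation / sqrt(sample size)."""
    return sigma / math.sqrt(n)


se_n25 = standard_error(15, 25)     # hypothetical sigma = 15, n = 25
se_n100 = standard_error(15, 100)   # four times the sample size
print(se_n25, se_n100)
```

Quadrupling the sample size halves the standard error, which is why sample means vary less as n grows.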
What is the Central Limit Theorem?
|
states that as the sample size (# of values in each sample) gets large enough, the sampling distribution of the mean is approximately normally distributed. True regardless of the shape of the distribution of the individual values in the population
|
|
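The Central Limit Theorem can be illustrated with a small simulation. The sketch below draws repeated samples of size 30 from a right-skewed (exponential) population with mean 1 and looks at the distribution of the sample means; the population, sample size, and number of repetitions are all assumptions chosen for the demonstration.

```python
import random
import statistics

rng = random.Random(1)  # seeded so the run is repeatable


def sample_means(n, reps=2000):
    """Means of `reps` samples of size n from an exponential population (mean 1)."""
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]


means_n30 = sample_means(30)
center = statistics.fmean(means_n30)    # should be near the population mean, 1
spread = statistics.stdev(means_n30)    # should be near 1 / sqrt(30)
print(f"mean of sample means = {center:.3f}, std. dev. of sample means = {spread:.3f}")
```

Even though individual values are strongly skewed, the sample means cluster symmetrically around the population mean, with a spread close to the standard error.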
What is a point estimate?
|
value of a single sample statistic
|
|
What is a confidence interval estimate?
|
range of numbers, called an interval, constructed around the point estimate
|
|
What reasoning is used in confidence interval estimation for the mean (standard deviation known)?
|
inductive reasoning - use results of a single sample to draw conclusions about a population
|
|
What is the critical value?
|
value of Z needed for constructing a confidence interval.
|
|
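Putting the point estimate, critical value, and standard error together, a confidence interval for the mean with a known population standard deviation might be computed as below. The sample mean, standard deviation, and sample size are assumed values; `NormalDist().inv_cdf` supplies the critical Z value.

```python
import math
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Interval: sample mean +/- critical Z * sigma / sqrt(n), with sigma known."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # e.g. about 1.96 for 95%
    margin = z_crit * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical sample: mean 362.3, known sigma 15, n = 25.
low, high = z_confidence_interval(362.3, 15, 25)
print(f"95% confidence interval: ({low:.2f}, {high:.2f})")
```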
Student's t distribution
|
used when standard deviation is not known
|
|
What is a null hypothesis?
|
one of status quo and is identified by the symbol Ho
|
|
What is the alternative hypothesis, H1?
|
opposite of the null hypothesis
|
|
If you reject the null hypothesis.....
|
you have statistical evidence that the alternative hypothesis is correct
|
|
If you do not reject the null hypothesis.....
|
you have failed to prove the alternative hypothesis. Failure to prove the alternative hypothesis, however, does not mean that you have proven the null hypothesis.
|
|
The null hypothesis always....
|
refers to a specified value of the population parameter such as the mean of the population, not a sample statistic such as the mean of the sample
|
|
The statement of the null hypothesis always.....
|
contains an equal sign regarding the specified value of the population parameter (Ho: population mean = 368 grams)
|
|
The statement of the alternative hypothesis never....
|
contains an equal sign regarding the specified value of the population parameter
|
|
What is a test statistic?
|
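a value computed from the sample data (for example, a Z or t value) that is used to decide whether to reject the Ho; the sample statistic it is based on is an estimate of the corresponding population parameter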
|
|
What is a region of rejection (critical region)?
|
consists of values of test statistic that are unlikely to occur if the Ho is true.
|
|
What are the two regions of the sampling distribution of the test statistic?
|
region of rejection and region of nonrejection
|
|
What is the critical value?
|
divides the nonrejection region from the rejection region.
|
|
How to determine critical value
|
depends on size of the rejection region
|
|
What is a Type I error?
|
occurs if you reject the Ho when it is true and should not be rejected. P(Type I error) = alpha
|
|
What is a Type II error?
|
occurs if you do not reject the Ho when it is false and should be rejected. P(Type II error) = beta
|
|
What is the level of significance?
|
Probability of committing a Type I error
|
|
What is the confidence coefficient?
|
1 - alpha: probability that you will not reject the Ho when it is true and should not be rejected.
|
|
What is the confidence level?
|
(1-alpha) x100%
|
|
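The hypothesis-testing cards above (test statistic, critical value, region of rejection, level of significance) can be tied together in a two-tail Z test for the mean. This sketch reuses the Ho: population mean = 368 grams example from the cards; the sample mean, standard deviation, and sample size are assumed values, and a known population standard deviation is assumed.

```python
import math
from statistics import NormalDist


def z_test_two_tail(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (test statistic, critical value, reject Ho?) for a two-tail Z test."""
    z_stat = (sample_mean - mu0) / (sigma / math.sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # boundary of the rejection region
    reject = abs(z_stat) > z_crit                  # test statistic in rejection region?
    return z_stat, z_crit, reject


# Hypothetical sample: mean 372.5 grams, known sigma 15, n = 25.
z_stat, z_crit, reject = z_test_two_tail(372.5, 368, 15, 25)
print(f"Z = {z_stat:.2f}, critical value = +/-{z_crit:.2f}, reject Ho: {reject}")
```

Here the test statistic falls in the region of nonrejection, so the Ho is not rejected at the 0.05 level of significance.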
Dependent variable
|
the variable you wish to predict
|
|
Independent variable
|
variables used to make the predictions
|
|
Simple linear regression
|
single numerical independent variable, X, is used to predict the numerical dependent variable, Y
|
|
Scatter plot (scatter diagram)
|
examines relationship between X on horizontal axis and Y on vertical axis - nature of relationship between these two can range from simple to extremely complicated
|
|
Simple linear regression model
|
consists of a straight-line
|
|
What is a slope?
|
the expected change in Y per unit change in X
|
|
What is the Y intercept?
|
the mean value of Y when X equals 0
|
|
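To make the slope and Y intercept concrete, the sketch below fits a straight line using the least-squares method. Least squares is not named on the cards; it is assumed here as the usual way to estimate the line, and the data points are invented.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares straight line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    slope = ss_xy / ss_x                    # expected change in Y per unit change in X
    intercept = mean_y - slope * mean_x     # predicted mean of Y when X equals 0
    return intercept, slope


xs = [1, 2, 3, 4, 5]        # hypothetical independent variable values
ys = [3, 5, 7, 9, 11]       # hypothetical dependent values (exactly Y = 1 + 2X)
intercept, slope = least_squares_line(xs, ys)
print(f"Y = {intercept:.1f} + {slope:.1f} X")
```

A positive slope, as in this example, corresponds to a positive linear relationship.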
What is a positive linear relationship?
|
Y increases as X increases
|
|
What is a negative linear relationship?
|
Value of Y decreases as X increases
|
|
What is a positive curvilinear relationship?
|
Value of Y increases as X increases, but this increase tapers off beyond certain values of X
|
|
What is a U-shaped curvilinear relationship?
|
As X increases, at first Y generally decreases; but as X continues to increase, Y not only stops decreasing but actually increases above its minimum value
|
|
What is a negative curvilinear relationship?
|
exponential relationship between X and Y - Y decreases rapidly as X first increases, but then it decreases much less rapidly as X increases further
|
|
What is no relationship?
|
High and low values of Y appear at each value of X
|
|
What is the relevant range?
|
includes all values from smallest to largest X used in developing the regression model
|
|
What is a mathematical model?
|
mathematical expression that represents a variable of interest
|
|
What is the binomial distribution?
|
the distribution used when the discrete variable of interest is the number of successes in a sample of n observations.
|
|
Essential properties of a binomial distribution
|
1. fixed number of observations, n, in the sample
2. each observation is classified into one of two mutually exclusive and collectively exhaustive categories (success or failure)
3. P(success) is constant from observation to observation
4. the outcome of any observation is independent of the outcome of any other observation
|
|
Name 3 discrete distributions
|
binomial, Poisson, hypergeometric
|
|
Name 3 continuous distributions
|
normal, uniform, and exponential
|
|
What is the Poisson distribution?
|
the probability of X events in an area of opportunity
|
|
When to use Poisson distribution
|
(1) the area of opportunity is defined by time, length, or surface area; (2) the probability of an event is the same for all areas of opportunity; (3) the number of events in one area is independent of the number of events that occur in any other area; (4) the probability of two or more events occurring approaches 0 as the area of opportunity becomes smaller
|
|
What is a discrete variable?
|
produces outcomes that come from a counting process
|
|
What is a continuous variable?
|
produces outcomes that come from a measuring process
|
|
Examples of discrete random variable
|
number of mortgages approved each week; how many credit cards in wallet; number of interruptions per day
|
|
What is a normal distribution?
|
most common continuous distribution used in statistics (Gaussian distribution)
|
|
What are characteristics of a normal distribution?
|
(1) continuous variables common in business have distributions closely resembling the normal distribution; (2) it is used to approximate various discrete probability distributions; (3) it provides the basis for classical statistical inference because of its relationship to the Central Limit Theorem
|
|
What is the normal probability density function?
|
f(X) - mathematical expression representing a continuous probability density function
|
|
What does the transformation formula do?
|
converts any normal random variable, X, to a standardized normal random variable, Z.
|
|
What is a uniform distribution?
|
a distribution in which a value has the same probability of occurrence anywhere in the range between the smallest value, a, and the largest value, b. Because of its shape, it is sometimes called the rectangular distribution
|
|
What is the exponential distribution?
|
continuous distribution that is right-skewed and ranges from zero to positive infinity
|
|
When is exponential distribution used?
|
used in waiting-line (queuing) theory to model the length of time between arrivals in processes such as customers at a bank's ATM, patients entering an ER, and hits on a website.
|
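For the waiting-line use described above, the exponential distribution gives the probability that the next arrival occurs within a given time: P(arrival time <= x) = 1 - e^(-λx), where λ is the mean number of arrivals per unit of time. The rate of 20 customers per hour is an assumed value.

```python
import math


def exponential_cdf(x, rate):
    """P(time until next arrival <= x) for a mean arrival rate per unit of time."""
    return 1 - math.exp(-rate * x)


rate = 20                                    # hypothetical: 20 customers per hour
p_within_6_min = exponential_cdf(0.1, rate)  # 6 minutes = 0.1 hour
print(f"P(next arrival within 6 minutes) = {p_within_6_min:.4f}")
```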