186 Cards in this Set
- Front
- Back
Experiment
|
you impose the treatment
-can influence others opinion of product |
|
blind experiment
|
only the researchers know which treatment is given, not the subjects
|
|
double blind experiment
|
no one knows which treatment is given, neither the subjects nor the researchers working with them
|
|
observation
|
examines individuals and measures variables but does not influence the responses
|
|
purpose of experiment is to
|
observe whether a treatment causes a change
|
|
confounding variables: when the effects of the variables
|
cannot be distinguished from one another
|
|
confounded
|
other variables that interfere with outcome
|
|
subjects
|
people who are examined in an experiment
|
|
experimental units
|
the individuals or objects on which the experiment is done; called subjects when they are people
|
|
factors
|
explanatory variables that impact the response
|
|
treatment
|
any specific condition applied to the subjects
|
|
if more than one treatment
|
each treatment is one of the various combinations of the factors
|
|
completely randomized design
|
all subjects are assigned to the treatment groups at random (ex. half/half), without first accounting for any factors
|
|
must satisfy 3 conditions to have a good design
|
1. control the effects of lurking variables by comparing two or more treatments
2. must randomize
3. use enough subjects to reduce chance variation (big enough sample) |
|
statistically significant
|
so large it would rarely occur by chance
|
|
block
|
known prior to the experiment that the subjects are similar
-separate the groups -ex. females separated from males -first block, then randomly assign within each block |
|
survey is
|
observational study
|
|
experiments
|
actively impose treatment
|
|
variables
|
confounded when their effects on the response cannot be distinguished from each other (ex. when treatments are composed of several factors)
|
|
3 traits of good experiment
|
1. control
2. randomization
3. using enough subjects to reduce chance variation |
|
placebo
|
creates control
|
|
randomization
|
uses chance to assign subjects to the groups being compared, which prevents bias
|
|
bias
|
systematically favors certain outcomes
-ex. people who respond are systematically different from those who didn't respond |
|
good experiment requires
|
good attention to detail
-lack of realism can keep the results from being generalized |
|
to get random numbers
|
1. number subjects or experimental units in any way
2. pick random numbers and assign the matching subjects to group A; on the calculator: MATH, PRB, 5 (randInt, random integer) |
|
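The two steps above can also be sketched in Python instead of on the calculator. This is only an illustration and not part of the original cards: the function name `assign_groups`, the 20 numbered subjects, and the seed are assumptions made for the example.

```python
import random

def assign_groups(subjects, seed=None):
    """Randomly split numbered subjects into two treatment groups."""
    rng = random.Random(seed)      # the seed only makes the example repeatable
    shuffled = list(subjects)
    rng.shuffle(shuffled)          # chance, not the researcher, decides the order
    half = len(shuffled) // 2
    return sorted(shuffled[:half]), sorted(shuffled[half:])

# Step 1: number the subjects 1..20. Step 2: let chance pick group A.
group_a, group_b = assign_groups(range(1, 21), seed=1)
print("Group A:", group_a)
print("Group B:", group_b)
```

Every subject lands in exactly one group, and each group gets half of the subjects.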
to see if any two random numbers are alike
|
generate the random integers
with MATH, PRB, 5 (randInt) and store them in L1, then put them in order with STAT, SortD(L1), and check whether any number appears at least twice |
|
simple random sample SRS
|
any sample of size n has the same chance of being selected as any other sample of size n
ex. any 6 people have the same chance of any other 6 people |
|
stratified random sample
|
1st divide population into groups called strata
2nd randomly select from each stratum; those selected are the sample ex. freshman, soph = strata -good because diverse (at least one from each group) -usually has proportional representation |
|
this type of random sample is good because it is diverse, includes at least one representative from each group, and has proportional representation
|
stratified random sample
|
|
cluster random sample
|
1st divide population into groups called clusters
2nd randomly select some of the clusters. we collect data on all members of clusters chosen |
|
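The three sampling methods on the cards above (SRS, stratified, cluster) can be compared side by side in a short Python sketch. The school data, the group names, and the helper function names are all hypothetical and chosen only for illustration.

```python
import random

def simple_random_sample(population, n, rng):
    """SRS: every group of n members is equally likely to be chosen."""
    return rng.sample(population, n)

def stratified_sample(strata, per_stratum, rng):
    """Take an SRS inside every stratum, so each stratum is represented."""
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every member of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

rng = random.Random(0)
# Hypothetical school: four class years with 50 students each.
school = {year: [f"{year}-{i}" for i in range(50)]
          for year in ("freshman", "sophomore", "junior", "senior")}
everyone = [s for members in school.values() for s in members]

srs = simple_random_sample(everyone, 6, rng)   # any 6 students
strat = stratified_sample(school, 5, rng)      # 5 from every class year
clus = cluster_sample(school, 2, rng)          # all students from 2 class years
```

The stratified sample always contains every class year, while the cluster sample contains only the chosen years but all of their members.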
x bar
|
mean height
|
|
mxbar = mx = m
|
true
|
|
sxbar=
|
sx/squareroot n
|
|
if a variable x is normally distributed with mean mx=m and standard deviation sx=s then for any sample size n
|
the distribution of the sample means will be exactly NORMAL with mx=m and sxbar= sx/squareroot n
|
|
central limit theorem
|
regardless of the shape of the distribution of x, if x has mean mx=m and standard deviation sx=s and n is larger than or equal to 30, then the shape of the distribution of x bar is approximately NORMAL with mxbar=mx=m and sxbar= sx/squareroot of n
|
|
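The central limit theorem card above can be checked with a small simulation. This sketch assumes a skewed population (exponential with mean 1 and standard deviation 1), which is not from the original cards; the function name and the number of repetitions are also arbitrary choices.

```python
import random
import statistics
from math import sqrt

def simulate_sample_means(n, reps, rng):
    """Return `reps` sample means, each from n draws of a skewed population."""
    # expovariate(1.0) is right-skewed with mean 1 and standard deviation 1.
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

rng = random.Random(0)
n = 30
means = simulate_sample_means(n, reps=5000, rng=rng)

# The mean of x bar should be close to m, and its spread close to sx/sqrt(n).
print("mean of x bar:", round(statistics.fmean(means), 3), "vs m =", 1.0)
print("sd of x bar:  ", round(statistics.stdev(means), 3),
      "vs sx/sqrt(n) =", round(1.0 / sqrt(n), 3))
```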
when the population (x) is not normally distributed
|
sample size matters
- must be equal to or larger than 30 |
|
when the population (x) is normally distributed, sample size
|
does not matter
|
|
find the probability that the mean weight of 34 randomly selected boxes is less than 5 lbs
|
normcdf(minimum,max,mean,sx/squareroot n)
|
|
random selected whatever =
|
x
|
|
find p that a randomly selected box (x) weighs between 4.8 lbs and 5.0 lbs in a normal distribution
|
normcdf(min,max,mean,sx)
|
|
mean whatever=
|
xbar
|
|
when x just use
|
sx
|
|
when xbar use
|
sx/squareroot n
|
|
p that the mean weight (xbar) of 15 randomly selected boxes is between 4.8lbs and 5.0 lbs
|
normcdf(min,max,mean, sx/squareroot n)
|
|
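The difference between using sx (one box, x) and sx/squareroot n (mean of n boxes, xbar) can be seen in a short Python sketch. The helper `normalcdf` only mimics the calculator's normcdf and is a name made up here; the mean of 4.9 and standard deviation of 1.2 are borrowed from the invnorm card in this set and are assumed to describe the same boxes.

```python
from math import sqrt
from statistics import NormalDist

def normalcdf(lower, upper, mean, sd):
    """Area under a normal curve between lower and upper (like normcdf)."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

mean, sd = 4.9, 1.2   # assumed population mean and standard deviation (lbs)

# One randomly selected box (x): use sx.
p_single = normalcdf(4.8, 5.0, mean, sd)

# Mean weight of 15 boxes (xbar): use sx / squareroot n.
p_mean15 = normalcdf(4.8, 5.0, mean, sd / sqrt(15))

print(round(p_single, 4), round(p_mean15, 4))
```

The probability for the mean of 15 boxes comes out larger, because sample means vary less than single boxes.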
when randomly selected (x)
|
use sx
|
|
when mean (xbar)
|
use s/squareroot n
|
|
random sample of 40 boxes which mean weight corresponds to the 80th percentile of the mean weights
|
invnorm(.8,4.9,1.2/squareroot 40)
invnorm(area, mean, sx or sx/squareroot n) |
|
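The same invnorm calculation can be sketched in Python with the standard library's `NormalDist.inv_cdf`, which plays the role of invnorm. The numbers (mean 4.9, standard deviation 1.2, n = 40, 80th percentile) are taken from the card above; the variable names are just for illustration.

```python
from math import sqrt
from statistics import NormalDist

mean, sd, n = 4.9, 1.2, 40

# Sampling distribution of xbar: same mean, standard deviation sx / squareroot n.
xbar_dist = NormalDist(mean, sd / sqrt(n))

# invnorm(.8, 4.9, 1.2/squareroot 40)
percentile_80 = xbar_dist.inv_cdf(0.80)
print(round(percentile_80, 2))
```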
z (for xbar) =
|
(xbar - mxbar) / sxbar
|
|
confidence interval formula
|
xbar - zstar(sxbar) < m < xbar + zstar(sxbar)
or xbar +/- ME, where ME = zstar(sxbar) |
|
99% confidence interval zstar number is
|
invnorm(.995,0,1) = 2.576
|
|
84% CI zstar number is
|
invnorm (.92,0,1)= 1.41
|
|
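The "area to the left" used in these zstar cards is the confidence level plus half of what is left over: (1 + C) / 2. A minimal Python sketch of that rule follows; the function name `z_star` is an assumption for this example.

```python
from statistics import NormalDist

def z_star(confidence):
    """Critical value z* for a two-sided confidence level such as 0.99."""
    area_to_left = (1 + confidence) / 2   # 0.99 -> 0.995, 0.84 -> 0.92
    return NormalDist(0, 1).inv_cdf(area_to_left)

for level in (0.84, 0.95, 0.99):
    print(level, round(z_star(level), 3))
```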
line perpendicular and touching main line are
|
capturing true mean aka part of the confidence interval
|
|
if more than one treatment |
each treatment is one of the various combinations of the factors |
|
block |
first block, then random assignment |
|
variables are confounded when |
their effects on the response cannot be distinguished from each other (ex. when treatments are composed of several factors) |
|
the advantage of a stratified random sample over that of a cluster sample is that |
stratified sampling has better chances of diversity because it guarantees at least some representation of every stratum |
|
when multiple treatments and describing a completely randomized design |
list every treatment (each combination of the factors), then randomly assign the same number of subjects to each treatment so that every treatment is represented; random assignment balances out variability among the groups |
|
one statistical advantage to having a controlled sample tested is that |
reduces variability - since there is less variation in one type of species alone than across all types of species, using only one type of species in our experiment eliminates many possible confounding variables that might be related to the type of species. |
|
one statistical disadvantage to having a controlled sample tested is that |
having only one type of species may not represent all types of species. not diverse enough of a sample, cannot generalize |
|
randomization |
hopes to minimize variation among response variable even more |
|
make blocks represent all confounding factors ex north, south, nw, sw, ne, se/ northern outer corners, northern inner, southern outer, southern inner |
then randomly assign treatments within each block, then compare response variable levels between the 2 treatments within each block |
|
rainboot experiment should be |
a completely randomized design that is double-blind, to eliminate bias and confounding factors |
|
factor = |
x |
|
response variable = |
y |
|
s(xbar) = |
sx/ square root n |
|
if x is normal then sample means will be |
exactly normal with sxbar= sx/ squareroot n |
|
when p(x<#) AND IS NORMAL BY EXACT OR APPROXIMATE |
normcdf(-100000,#, mx, sx)
|
|
when find probability of A or ONE(x) something |
make sure x itself is normally distributed (the CLT only helps for x bar), then normcdf(small number, bigger number, mean, sx) |
|
when find probability of #(x bar) of something occurring |
make sure normal then normcdf ( small number, bigger number, mean, sx/square root n) |
|
when randomly selected x use |
S |
|
when mean weight xbar use |
s/square root n |
|
when asks what weight corresponds with # percentile |
make sure it is normal then use invnorm |
|
Z= |
(xbar - mx) / (s / square root n) |
|
to get z number |
get the area to the left, then go to invnorm(left area, 0, 1) |
|
m = |
population mean |
|
s |
population standard deviation
|
|
c |
population linear correlation |
|
intervals that are perpendicular capture |
true mean |
|
Z* confidence interval formula |
x bar +/- Z*(s/squareroot n) |
|
x bar - Z*(s/squareroot n) < M< x bar + Z*(s/squareroot n) |
z confidence interval formula |
|
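The z confidence interval formula above can be evaluated with a few lines of Python. This is a sketch under assumed inputs: the function name `z_interval` is made up, and the sample mean 4.9, population standard deviation 1.2, n = 40, and 95% level are example values only.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Return (low, high) for xbar +/- z* (sigma / sqrt(n))."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin_of_error = z_star * sigma / sqrt(n)
    return xbar - margin_of_error, xbar + margin_of_error

# Pass the plain population standard deviation; the function divides by sqrt(n).
low, high = z_interval(xbar=4.9, sigma=1.2, n=40, confidence=0.95)
print(round(low, 3), round(high, 3))
```

As with the calculator's z interval, the standard deviation goes in undivided and the division by the square root of n happens inside.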
Z * value |
get it from invnorm(area to the left, 0, 1) |
|
z* of 84% CI |
invnorm ( .92, 0, 1) |
|
3 steps for finding a confidence interval estimate |
1. check to make sure all conditions are met and state them 2. show the confidence interval formula and evaluate the confidence interval estimate 3. interpret the confidence interval in context |
|
confidence interval estimate for M, and S is known |
z confidence interval |
|
z confidence interval conditions |
1. sample must be SRS of the population 2. x is normally distributed or n>/= 30 3. s is known 4. population is at least 10x the sample size |
|
when putting z interval straight into calculator |
for sigma put just S not s/square root n - IT DOES IT FOR YOU |
|
to do z interval on calculator do |
STAT, TESTS, ZInterval |
|
when you have interval and need to find xbar or mean |
(#1 + #2)/ 2 = Xbar |
|
when you have interval and need to find ME |
(#2-#1)/2 =ME |
|
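The two "work backwards from the interval" cards above amount to a midpoint and a half-width. A tiny sketch, with a made-up function name and a made-up interval (4.53, 5.27):

```python
def center_and_margin(low, high):
    """Recover xbar (midpoint) and ME (half the width) from an interval."""
    xbar = (low + high) / 2
    margin_of_error = (high - low) / 2
    return xbar, margin_of_error

xbar, me = center_and_margin(4.53, 5.27)
print(round(xbar, 2), round(me, 2))
```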
when asks for minimum sample size needed |
n= (z*)^2(S)^2 / (me)^2 -ROUND UP end number |
|
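The minimum sample size formula above, including the "always round up" rule, can be sketched as follows. The function name and the example inputs (standard deviation 1.2, margin of error 0.25, 95% confidence) are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size_mean(sigma, margin_of_error, confidence):
    """Smallest n with z* (sigma / sqrt(n)) <= margin_of_error."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    n = (z_star ** 2) * (sigma ** 2) / (margin_of_error ** 2)
    return ceil(n)   # round UP, even if the decimal part is small

n_needed = min_sample_size_mean(sigma=1.2, margin_of_error=0.25, confidence=0.95)
print(n_needed)
```

Rounding down would give a margin of error slightly larger than requested, which is why the result is always rounded up.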
mean is |
unbiased |
|
when have interval and asks for X bar and S |
get x bar (add both divide by 2) and ME, then ME= (z*)(s/square root n) |
|
when s (the population standard deviation) is known use |
Z* |
|
z confidence intervals are about |
M not xbar |
|
hypothesis test is never about |
the sample statistic xbar |
|
hypothesis test is always about |
the population mean m (mu) |
|
T* if |
dont know s |
|
z* and t* |
estimate M |
|
to get t* |
invt(area to left, n-1) |
|
z* |
invnorm |
|
T* |
invT |
|
if given population standard deviation then use |
Z* |
|
T* |
just group |
|
when asks for the best point estimate of the mean |
it is the sample mean, xbar |
|
T procedure is robust ( can handle) against mild skew |
true |
|
T over z b/c |
T does not need sigma; it uses the sample standard deviation instead |
|
when an interval includes 0 |
it is ineffective |
|
an interval is ineffective if |
it includes 0 |
|
to check whether the info came from a normal distribution |
- no outliers - do a stemplot to see if it is roughly normal - enter the data into a list and run a normal probability plot on the data - if the plot is roughly linear, the data are approximately normal. TO DO THIS enter the data into a list, then 2nd, Y= (STAT PLOT), choose the last graph type, then ZOOM |
|
when the population is not known to be normal and n is smaller than 30 but we have the sample data |
whether the data are roughly normal can still be checked by doing a stemplot on the calculator |
|
P= |
true unknown population proportion
|
|
P hat = |
sample proportion |
|
x = |
number of successes |
|
Phat= |
x/n |
|
P interval conditions |
1. sample must be SRS 2. n(p hat) >/= 10 and n(1-p hat) >/= 10 3. population is at least 10 x sample size |
|
P confidence interval equation |
phat +/- Z* (square root( phat (1 - phat) / n )) |
|
phat +/- Z* (square root( phat (1 - phat) / n )) |
P interval equation |
|
ME of p interval |
Z* (square root( phat (1 - phat) / n )) |
|
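The P interval conditions and formula above can be put together in one short Python sketch. The function name `p_interval` and the example counts (120 successes out of 300) are hypothetical; the population-size condition is not checked because no population size is given here.

```python
from math import sqrt
from statistics import NormalDist

def p_interval(x, n, confidence):
    """Return (low, high) for phat +/- z* sqrt(phat (1 - phat) / n)."""
    p_hat = x / n
    # Condition from the cards: n(p hat) >= 10 and n(1 - p hat) >= 10.
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("need at least 10 successes and 10 failures")
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin_of_error = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin_of_error, p_hat + margin_of_error

low, high = p_interval(x=120, n=300, confidence=0.95)
print(round(low, 3), round(high, 3))
```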
interpretation of P confidence interval |
we are #% confident that the true unknown population proportion P is between # and #. |
|
p hat is unbiased estimate of |
p |
|
P 's unbiased estimate is |
p hat |
|
x bar is unbiased estimate of |
M |
|
M meu 's unbiased estimate is |
x bar |
|
Standard deviation of p hat (S p hat) is approximately equal to |
SE p hat, which is the same formula but with p hat in place of P |
|
to do P interval on calculator |
STAT, TESTS, A (1-PropZInt) |
|
when you know SAMPLE standard deviation use |
t* |
|
when you know the population standard deviation use |
z* |
|
when entering confidence level into a calculator for the test calcluations |
enter the confidence level exactly as it is stated in the question |
|
when we need to estimate how large a sample should be for a proportion but don't have p hat |
assume it equals .5 (half), so n = z* squared times .5 squared over ME squared, rounded up |
|
area is different from |
confidence interval |
|
when want to find n or minimum people in survey |
use z* |
|
when we want to estimate what the minimum size n should be for a proportion and there is no p hat |
ASSUME IT IS .5 SO SQUARE .5 |
|
when asked for minimal sample size of p hat |
n = (z*)^2 (p hat)(1 - p hat) / (ME)^2 -ROUND UP |
|
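The sample size formula for a proportion, with the "use .5 when no p hat is given" rule, might look like this in Python. The function name, the default of 0.5, and the example margin of error of 0.03 are assumptions for this sketch.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size_proportion(margin_of_error, confidence, p_hat=0.5):
    """Smallest n with z* sqrt(p_hat (1 - p_hat) / n) <= margin_of_error."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    n = (z_star ** 2) * p_hat * (1 - p_hat) / (margin_of_error ** 2)
    return ceil(n)   # always round up

# No p hat given, so the worst case p hat = .5 is used.
n_worst_case = min_sample_size_proportion(margin_of_error=0.03, confidence=0.95)
print(n_worst_case)
```

Using .5 is the worst case because p hat (1 - p hat) is largest there, so any other p hat needs a sample no bigger than this one.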
if no p hat is stated, to find minimum sample size assume it is |
.5 |
|
WHEN WE HAVE POPULATION STANDARD DEVIATION |
USE Z* |
|
an estimate is plausible if it |
is within the interval you received |
|
estimate is not plausible if |
it is NOT within the #% confidence interval |
|
when we know A (X) mean and standard deviation and asks for mean and standard deviation of sample distribution |
mean is same and standard deviation of sample distribution is s/ square root n |
|
remember |
when labeling: if it is a mean use x bar; if it is just one item (A or ONE) use x |
|
whenever an interval make sure |
do conditions, do interval, explain interval |
|
to find margin of error of interval |
subtract smaller from larger number then divide by 2 |
|
we reduce the margin of error by |
using a bigger sample size and/or a lower confidence level |
|
dont round up |
z or t numbers |
|
if n is greater than or equal to 30 then the shape of the distribution of x bar is approximately normal with the mean being the same and s x bar = s x over square root n |
Central Limit Theorem |
|
if no p hat is given, assume the worst case scenario for p hat |
.5 |
|
meaning of a 90% confidence interval for mean M |
if we constructed all possible 90% confidence intervals for all possible samples of size n from this population, 90% of these confidence intervals would contain the true mean M |
|
to find S from a t or z confidence interval |
set ME = z* or t* (s/ square root n), then solve for s: s = ME (square root n) / (z* or t*) |
|
confidence interval does not mean |
90 % chance that m or p is in confidence interval |
|
if we did confidence intervals for all possible samples, #% of those intervals would contain M |
explanation of confidence interval |
|
when you know sample standard deviation S |
use T int |
|
population proportion interval |
is obtained from one random sample |
|
for sure the sample proportion is in a |
population proportion interval |
|
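in a population proportion interval |
the sample proportion is in this interval for sure; the population proportion is probably, but not for sure, in it |
|
the sample proportion is always in, and the population proportion may or may not be in, |
the interval for a population proportion |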
|
minimum sample size problems always use |
z* |
|
the shape of the sampling distribution of p hat is approximately normal if |
n(p hat) is greater than or equal to 10 and n(1 - p hat) is greater than or equal to 10 |
|
the shape of the sampling distribution of x bar is exactly normal if |
it is clearly stated that x is normally distributed |
|
the shape of the sampling distribution of x bar is approximately normal if |
n is greater than or equal to 30 |
|
x= |
number of successes |
|
n= |
sample size
|
|
p hat = |
sample proportion= x/n |
|
P= |
true unknown population proportion |
|
x bar= |
sample mean |
|
S= |
sample standard deviation |
|
sigma = |
population standard deviation |
|
Meu= |
unknown true population mean |
|
mean = |
mx |
|
Meu p hat = |
P |
|
Sigma P hat= |
square root( P(1-P) / n ), which is approximately equal to SE p hat = the same formula with p hat in place of P |
|
sigma x bar= |
sx / square root n, which is approximately equal to S / square root n (using the sample standard deviation) |
|
SE p hat= |
standard error of the sample proportion which is the approximate of Sigma P hat |
|
SE |
standard error |
|
SE x bar= |
standard error of the sample means which is the approximate of sigma x bar |
|
when have #% confidence interval for T interval and it asks for S |
ME= T* ( s/sr n) |
|
ME= T* ( s/sr n) |
when have #% confidence interval for T interval and it asks for S |
|
as sample size increases, what happens to the margin of error |
the margin of error decreases |
|
as the confidence level increases, what happens to the margin of error |
the margin of error increases |
|
is the interval wider or narrower when we have more confidence |
wider |
|
a confidence interval for population mean always includes |
x bar, sample mean |
|
a confidence interval for a population proportion always includes |
p hat, sample proportion |
|
explain the meaning of a 99% confidence interval for a population mean |
if we constructed all possible 99% confidence intervals for samples of size n from the population, 99% of those confidence intervals would contain the true mean mu |
|
central limit theorem |
when n is greater than or equal to 30 x bar is approximately normal with mxbar= m and s xbar = s/sr n |
|
a parameter |
describes the whole population, not just a sample or a portion of the population |
|
law of large numbers |
if you do something more and more times at random, your average will get closer to the true mean |
|
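The law of large numbers can be watched in action with a simple simulation. This sketch assumes a fair six-sided die (true mean 3.5), which is an example chosen here and not from the original cards; the function name and checkpoints are also arbitrary.

```python
import random

def running_average_of_rolls(checkpoints, rng):
    """Roll a fair die repeatedly and record the average at each checkpoint."""
    total, rolls, averages = 0, 0, {}
    for target in sorted(checkpoints):
        while rolls < target:
            total += rng.randint(1, 6)   # one roll of a fair die
            rolls += 1
        averages[target] = total / rolls
    return averages

rng = random.Random(0)
averages = running_average_of_rolls([10, 1000, 100000], rng)
for rolls, avg in averages.items():
    print(rolls, "rolls: average =", round(avg, 3))
```

With only 10 rolls the average can be far from 3.5, but after 100000 rolls it sits very close to it.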
if it asks for the level that there is # probability |
find z * for probability EXACTLY then put mean + z* ( S/ SR n) |
|
we prefer the t procedures to the z procedures for inference about a population mean because |
z requires that you know the population standard deviation |