99 Cards in this Set

  • Front
  • Back
regression line
-a straight line that describes how a response variable y changes as an explanatory variable x changes.
-used as a tool to predict the value of y given x
fitting a line
drawing a line that comes as close as possible to the points
equation for a regression line
y=a+bx

a=intercept
b=slope
slope
rate of change in y as x changes
extrapolation
the use of the regression line for prediction far outside the range of values of the explanatory variable
least squares regression line
-the line that makes the sum of the squared vertical errors as small as possible
-error = observed y - predicted y
-it always passes through the point (xbar, ybar)
-errors are positive for points above the line, negative for points below
equation for least squares regression
if a = ybar - b(xbar)
and b = r(sy/sx)
then the least squares regression formula is:
yhat = a + bx

a change of one standard deviation in x corresponds to a change of r standard deviations in y
r squared
squares of the correlation-the fraction of the variation in the values of y that is explained by the least squares regression of y on x
residuals
the difference between an observed value and the value predicted by the regression line
-the mean of the least squares residuals is always 0
residuals formula
observed y-predicted y
y-yhat
outlier
an observation that lies outside the overall pattern of the other observations
influential
an observation is ____ for a statistical calculation if removing it would markedly change the result of the calculation
lurking variable
a variable that is not among the explanatory or response variables in the study and yet may influence the relationship between them
r square in regression
the square of the correlation r square, is the fraction of the variation in the values of y that is explained by the least squares regression of y on x
r=0
means that there is no linear relationship between x and y
ybar
average
(sum of the response variable values) ÷ (# of subjects)
x
-always on horizontal axis
-explanatory variable
explanatory variable
explains or causes changes in response variables
y
-always on the vertical axis
-response variable
response variable
measures an outcome
total sum of squares
-square the deviation for each subject and sum the squares around the mean ybar
-SST
formula for SST
SST = ∑(y - ybar)²
sum of squares of the errors
-SSE
-SSE = ∑(y - yhat)²
yhat
the predicted value of y from the regression line: yhat = a + bx
sum of errors
always equals zero
r
correlation coefficient
correlation
-r is the slope of the least squares regression line when we measure both x and y in standardized units
-makes no distinction between explanatory and response variables
-requires that both variables be quantitative
-always satisfies -1≤r≤1
regression
the square of the correlation r is the fraction of the variance of one variable that is explained by least squares regression on the other variable
SRS
simple random sample
-each unit has the same chance of being chosen-unbiased
-selection of one unit has no influence on the selection of another
sample
-A sample is any subset of elements selected from the population.
-Subset of the a population. Group of creatures from which one gathers data with the intention of making inferences to all organisms that fit those criteria (i.e., the population).
stratified sampling
the population is divided into homogeneous groups
cluster sampling
the population is grouped into small clusters and an srs of clusters is drawn

all objects in the selected cluster are observed
systematic sampling
randomly choose a unit from a population, then select every kth unit thereafter
multistage sampling
the sample is chosen in stages. most opinion polls and other national samples use this method
experiment
-deliberately imposes some treatment on individuals in order to observe their responses
-an act or process that leads to a single result or outcome that cannot be predicted
observational study
observes individuals and measures variables of interest but does not attempt to influence the responses
explanatory variable
attempts to explain or is purported to cause differences in a response variable
experimental units
the individuals on which the experiment is performed
subjects
if human, this is what an experimental unit is called
treatment
a specific experimental condition applied to the units
three basic principles of experimental design
replication
control
randomization
replication
the same treatments are assigned to different experimental units
control
any method that accounts for and reduces natural variability
confounding variable
one whose effects on the response are indistinguishable from those of the explanatory variable
comparison
a form of control where 2 or more treatments are compared to prevent confounding the effect of a treatment with other variables
blocks
a form of control where similar experimental units are placed into groups
randomization
the use of chance to divide experimental units into groups
double blind
neither the subjects nor those who measure and read the results know which treatment each subject received
calculator key strokes to generate a simple random sample
clear lists
highlight list 1
f4:calc
4:probability
a:random seed __
enter
highlight list 1
f4:calc
4:probability
5:randint(1, population size, number of samples needed)
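The calculator steps above can be mirrored in Python. Note one difference: `random.sample` draws without replacement (a true SRS), while `randInt` samples with replacement, so duplicates would have to be discarded by hand. The population size and sample size below are hypothetical:

```python
import random

random.seed(42)  # analogous to setting the random seed on the calculator

population_size = 99
n_needed = 5

# draw an SRS of 5 unit labels from 1..99, no duplicates
srs = random.sample(range(1, population_size + 1), n_needed)
print(srs)
```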
the logic of randomization
-produces two groups of similar subjects before treatment
-comparative design ensures that outside variables influence equally
-therefore the only systematic difference between the groups is the treatment, so differences in the response must be due to the treatment
calculator key strokes for least squares regression
-enter data for list1 and list2
-f4:calc
-3:regressions
-1:linreg(a+bx)
-x:list1
-y:list2
-store regeqn: y1(x)
-enter
-enter
statistically significant
an observed effect so large that it would rarely occur by chance
r values
-does not change when we change the units of measure of either x or y or both
- -1≤r≤1
- +r indicates a positive association between the variables
- -r indicates a negative association
- near 0 indicate a weak linear relationship
-strongly affected by outliers
sentence for the interpretation of a regression line
"in our example, the slope means that the mean reaction time is increasing at a rate of .7 seconds per percent over the sampled range of drug amounts from 1% to 5%
voluntary response sample
people who choose themselves by responding to a general appeal-these are usually biased
random
we call a phenomenon random if individual outcomes are uncertain but there is nonetheless a regular distribution of outcomes in a large # of repetitions
-a kind of order that only occurs in the long run
probability
the proportion of times the outcome of a random phenomenon would occur in a very long series of repetitions
event
a specific collection of one or more outcomes
sample space
the collection of all possible outcomes of an experiment
equally likely outcomes
if a random phenomenon has k possible outcomes, all equally likely, then each individual outcome has probability 1/k. the probability of any event A is:
P(A) = (outcomes in A) / (total outcomes in the sample space)
= (outcomes in A) / k
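The counting rule above amounts to dividing the size of the event by the size of the sample space. A minimal sketch using a fair six-sided die (my own example, not from the cards):

```python
# rolling a fair six-sided die: k = 6 equally likely outcomes
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # event "roll an even number"

p_A = len(A) / len(sample_space)  # outcomes in A / k
print(p_A)  # 0.5
```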
probability rule #1
the probability P(A) of any event A satisfies 0≤P(A)≤1
probability rule #2
if S is the sample space in a probability model, then P(S)=1
probability rule #3
the addition rule for disjoint events
two events A and B are disjoint if they have no outcomes in common and so can never occur together. if A and B are disjoint
P(A or B) = P(A) + P(B)
probability rule #4
the complement rule
the complement of any event A is the event that A does not occur, written as Ac. the complement rule states
P(Ac)=1-P(A)
probability rule #5
the multiplication rule for independent events
two events A and B are independent if knowing that one occurs does not change the probability that the other occurs. if A and B are independent then:
P(A and B)=P(A)P(B)
probability rule #6
the general addition rule for unions of two events
for any two events A and B
P(A or B) = P(A) +P(B) - P(A and B)
probability rule #7
multiplication rule
the probability that both of two events A and B happen together can be found by P(A and B) = P(A)P(B | A)
here P(B | A) is the conditional probability that B occurs given the information that A occurs
venn diagram
a picture that shows the sample space S as a rectangular area and events as areas within S
contingency table
cross tabulation tables
present frequency counts for combinations of 2 or more variables
marginal probabilities
probabilities of single events
joint probabilities
P(A and B)
conditional probability
when P(A)>0, the conditional probability of B given A is
P(B | A) = P(A and B) ÷ P(A)
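The definition of conditional probability connects directly to a contingency table: divide the joint count by the marginal count. The counts below are hypothetical:

```python
# hypothetical contingency table of counts:
#              B     not B
#   A         30      20
#   not A     10      40
n_A_and_B = 30
n_A = 30 + 20
n_total = 100

p_A = n_A / n_total              # marginal probability P(A)
p_A_and_B = n_A_and_B / n_total  # joint probability P(A and B)

p_B_given_A = p_A_and_B / p_A    # P(B | A) = P(A and B) / P(A)
print(p_B_given_A)  # 0.6
```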
union probabilities
P(A or B)
intersection
the intersection of events is the event that all of the events occur
tree diagram
use a table instead
bayes' rule
if A1, A2, ..., Ak are disjoint events whose probabilities are not 0 and add to 1, and C is another event whose probability is not 0 or 1, then

P(Ai | C) = [P(C | Ai)P(Ai)] / [P(C | A1)P(A1) + ... + P(C | Ak)P(Ak)]
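Bayes' rule is a one-liner once the pieces are written out. The probabilities below are made up for illustration (two disjoint events A1, A2 with P(A1) + P(A2) = 1):

```python
# hypothetical prior probabilities and conditional probabilities
p_A = [0.3, 0.7]           # P(A1), P(A2); disjoint, sum to 1
p_C_given_A = [0.9, 0.2]   # P(C | A1), P(C | A2)

# denominator: P(C | A1)P(A1) + P(C | A2)P(A2)
denom = sum(pc * pa for pc, pa in zip(p_C_given_A, p_A))

# P(A1 | C) = P(C | A1)P(A1) / denominator
p_A1_given_C = (p_C_given_A[0] * p_A[0]) / denom
print(round(p_A1_given_C, 4))  # 0.6585
```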
independent events
two events A and B that both have positive probability are independent if
P(B | A) = P(B)
random variable
a variable that assumes numerical values associated with the random outcomes of an experiment, where one (and only one) numerical value is assigned to each outcome
discrete random variable
-a countable number of possible values

-expressed with a table

-a random variable that may assume either a finite number of values or an infinite sequence of values (number of units sold, customers who enter a bank in one day)
continuous random variable
random variables such as weight and time, which may take on all values in a certain interval or collection of intervals
-uses density curves
probability distribution
of a discrete random variable lists the possible values and their probabilities
two requirements for probabilities of discrete random variable
-every probability pi is a number between 0 and 1
-p1+p2+p3...+pk = 1
(also no numbers can be negative)
≤ vs < in a discrete random variable
they are not the same: P(X≤k) includes the outcome X=k, while P(X<k) does not
to find the mean of a discrete random variable
E(X) = µx = ∑ x·p(x)
to find the variance of a discrete random variable
Var(X) = σ²x = ∑ (x - µx)²·p(x)
to find the standard deviation of a discrete random variable
SD(X)=sigmax=square root of the variance
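The three formulas above can be checked with a short sketch. The distribution here is hypothetical:

```python
import math

# hypothetical probability distribution of a discrete random variable X
x_values = [0, 1, 2, 3]
probs = [0.1, 0.2, 0.4, 0.3]  # must be between 0 and 1 and sum to 1

mean = sum(x * p for x, p in zip(x_values, probs))               # E(X) = sum of x*p(x)
var = sum((x - mean) ** 2 * p for x, p in zip(x_values, probs))  # Var(X)
sd = math.sqrt(var)                                              # SD(X) = sqrt(Var(X))

print(round(mean, 2), round(var, 4), round(sd, 4))  # E(X)=1.9, Var=0.89
```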
to find mean, variance and standard deviation of a discrete random variable on the calculator
enter values into list1, list2
f4:calc
1:1var stats
list: list1
freq: list2
enter

on the calc
µx=xbar
∑(x - xbar)² = σ²x = variance
continuous random variable
takes on all values in an interval of numbers
probability distribution of a continuous random variable
-described by a density curve
-is always on or above the horizontal axis
-has an area exactly 1 underneath it
uniform distribution
-shape of a box
-find the height by making (width of the interval along the x axis) × (height) equal 1
(interval between 2 and 6 = width 4
4 × __ = 1
__ = 1/4 = .25)
the probability density function
f(x) = 1/4, 2≤x≤6
0, elsewhere
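The uniform density above is easy to work with directly: probabilities are just rectangle areas. A sketch using the same [2, 6] interval:

```python
# uniform density on [2, 6]: width 4, so height = 1/4 = 0.25
a, b = 2, 6
height = 1 / (b - a)

def f(x):
    """density: 1/4 on [2, 6], 0 elsewhere"""
    return height if a <= x <= b else 0.0

# P(3 <= X <= 5) is the area of the rectangle over [3, 5]
p = (5 - 3) * height
print(height, p)  # 0.25 0.5
```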
≤ vs<
for continuous random variables these are =
µx on a uniform distribution
(lowest x + highest x) ÷ 2 = mean
percentage from a normal distribution
from the question find:
x=what is varying
standard deviation
mean
example:
P(x≥80) = normalcdf(80, ∞, 75, 7.5)
specific number from a percentage on a normal distribution
P(x≤k)=.98
k = invNorm(.98, 75, 7.5)
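Both calculator functions have direct Python equivalents in `statistics.NormalDist` (Python 3.8+), using the same mean 75 and standard deviation 7.5 as the examples:

```python
from statistics import NormalDist

dist = NormalDist(mu=75, sigma=7.5)

# normalcdf(80, infinity, 75, 7.5): P(X >= 80)
p_ge_80 = 1 - dist.cdf(80)

# invNorm(.98, 75, 7.5): the value k with P(X <= k) = .98
k = dist.inv_cdf(0.98)

print(round(p_ge_80, 4), round(k, 2))
```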
rules for means
rule 1: if X is a random variable and a and b are fixed numbers, then
µ(a+bX) = a + b·µX

rule 2: if X and Y are random variables, then
µ(X+Y) = µX + µY
µ(X-Y) = µX - µY
rules for variances
rule 1: if X is a random variable and a and b are fixed numbers, then
σ²(a+bX) = b²·σ²X
or
σ(a+bX) = |b|·σX

rule 2: if X and Y are independent random variables, then
σ²(X+Y) = σ²X + σ²Y
σ²(X-Y) = σ²X + σ²Y
(variances add even when subtracting)
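The fact that variances add for both X + Y and X - Y (when X and Y are independent) can be checked by simulation. The distributions below are my own choices: X has variance 2² = 4 and Y has variance 3² = 9, so both sums should have variance near 13:

```python
import random

random.seed(0)
n = 100_000

# independent simulated random variables
X = [random.gauss(10, 2) for _ in range(n)]  # Var(X) = 4
Y = [random.gauss(5, 3) for _ in range(n)]   # Var(Y) = 9

def var(data):
    """sample variance"""
    m = sum(data) / len(data)
    return sum((d - m) ** 2 for d in data) / (len(data) - 1)

# both should come out near 4 + 9 = 13
print(round(var([x + y for x, y in zip(X, Y)]), 1))
print(round(var([x - y for x, y in zip(X, Y)]), 1))
```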
the law of large numbers
draw independent observations at random from any population with finite mean. as the number of observations drawn increases, the mean of the observed values eventually approaches the mean of the population as closely as you specify and then stays that close