39 Cards in this Set

  • Front
  • Back
The response variable
the variable whose value can be explained by the value of the explanatory or predictor variable.
scatter diagram
is a graph that shows the relationship between two quantitative variables measured on the same individual. Each individual in the data set is represented by a point in the scatter diagram. The explanatory variable is plotted on the horizontal axis, and the response variable is plotted on the vertical axis.
linear correlation coefficient OR Pearson product moment correlation coefficient
is a measure of the strength and direction of the linear relation between two quantitative variables. The Greek letter ρ (rho) represents the population correlation coefficient, and r represents the sample correlation coefficient. We present only the formula of the sample correlation coefficient.
The linear correlation coefficient is always between
-1 and 1, inclusive. That is, -1 ≤ r ≤ 1.
if r = +1
then a perfect positive linear relation exists between the two variables
if r = -1
then a perfect negative linear relation exists between the two variables
if r is close to 0
then little or no evidence exists of a linear relation between the two variables. SO r CLOSE TO 0 DOES NOT IMPLY NO RELATION, JUST NO LINEAR RELATION.
The linear correlation coefficient is a unitless measure of association,
so the unit of measure for x and y plays no role in the interpretation of r.
The correlation coefficient is
not resistant. Therefore, an observation that does not follow the overall pattern of the data could affect the value of the linear correlation coefficient.
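A minimal sketch of what "not resistant" means in practice. The data are made up for illustration, and the helper `sample_r` is a hypothetical name; it computes r from sums of deviations, which is algebraically the same as the standardized-product form of the sample formula.

```python
from math import sqrt
from statistics import mean

def sample_r(xs, ys):
    """Sample linear correlation coefficient r from sums of deviations."""
    xbar, ybar = mean(xs), mean(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data: five points lying exactly on the line y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
r_clean = sample_r(xs, ys)

# Add one made-up observation, (6, 0), that does not follow the pattern.
r_with_outlier = sample_r(xs + [6], ys + [0])

print(round(r_clean, 3), round(r_with_outlier, 3))  # 1.0 0.143
```

A single unusual observation pulls r from 1 down to about 0.14 here, which is why a scatter diagram should be checked alongside r.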
positively associated
two linearly related variables are positively associated if, whenever the value of one variable increases, the value of the other variable also increases.
negatively associated
two variables are negatively associated if, whenever the value of one variable increases, the value of the other variable decreases.
correlation matrix
Excel provides it. For every pair of columns in the spreadsheet, it computes the correlation and displays it in the bottom triangle of the matrix.
How do you test for a linear relation?
1. Determine the absolute value of the correlation coefficient.
2. Find the critical value in Table 2 from Appendix A.
3. If the absolute value of the correlation coefficient is greater than the critical value, we say a linear relation exists between the two variables. Otherwise, no linear relation exists.

* If the correlation coefficient is positive and greater than the critical value, the variables are positively associated. If the correlation coefficient is negative and less than the opposite of the critical value, the variables are negatively associated.
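The comparison in the steps above can be sketched as a small function. The function name is hypothetical, and the critical value is passed in as an argument because it has to be looked up in the table; the 0.5 used below is only a placeholder, not a value taken from that table.

```python
def classify_linear_relation(r, critical_value):
    """Compare |r| with a critical value and report the conclusion."""
    if abs(r) <= critical_value:
        return "no linear relation"
    # |r| exceeds the critical value, so the sign of r gives the direction.
    return "positively associated" if r > 0 else "negatively associated"

# 0.5 is a placeholder critical value for illustration only.
placeholder_critical_value = 0.5
for r in (0.8, -0.8, 0.3):
    print(r, "->", classify_linear_relation(r, placeholder_critical_value))
```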
A linear correlation coefficient that implies strong positive or negative association does not imply causation if
it was computed using observational data. Why? Lurking variables.

e.g., as air-conditioning bills increase, so does the crime rate.
xbar
sample mean of the explanatory variable
sx
sample standard deviation of the explanatory variable
ybar
sample mean of the response variable
sy
sample standard deviation of the response variable
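With xbar, sx, ybar, and sy defined, the sample correlation coefficient can be sketched as the sum of the products of standardized values divided by n - 1. This is one common way to write the sample formula; the function name and the data below are made up for illustration.

```python
from statistics import mean, stdev

def sample_correlation(xs, ys):
    """r = sum(((x - xbar) / sx) * ((y - ybar) / sy)) / (n - 1)."""
    xbar, ybar = mean(xs), mean(ys)    # sample means
    sx, sy = stdev(xs), stdev(ys)      # sample standard deviations
    n = len(xs)
    z_products = [((x - xbar) / sx) * ((y - ybar) / sy)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (n - 1)

# Made-up paired observations (explanatory x, response y).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = sample_correlation(xs, ys)
print(round(r, 4))  # 0.7746
```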
the closer r is to +1
the stronger the evidence of positive association
The closer r is to -1
the stronger the evidence of negative association between two variables
residual
residual represents how close our prediction comes to the actual observation. The smaller the residual the better the prediction.
residual= observed y - predicted y
slope (m)
(y₂ - y₁) ÷ (x₂ - x₁), or rise/run, or change in y ÷ change in x
point-slope formula
y - y₁ = m (x - x₁)
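A short sketch of the slope and point-slope formulas, using two made-up points, (1, 3) and (4, 9). The function names are hypothetical.

```python
def slope(x1, y1, x2, y2):
    """Change in y divided by change in x."""
    return (y2 - y1) / (x2 - x1)

def point_slope_line(m, x1, y1):
    """Return a function for y, rearranged from y - y1 = m (x - x1)."""
    return lambda x: m * (x - x1) + y1

m = slope(1, 3, 4, 9)             # (9 - 3) / (4 - 1) = 2.0
line = point_slope_line(m, 1, 3)
print(m, line(4), line(0))        # 2.0 9.0 1.0
```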
least-squares regression line
the line that minimizes the sum of the squared errors (or residuals). This line minimizes the sum of the squared vertical distances between the observed values of y and those predicted by y-hat. We represent this as "minimize ∑residuals²"
The equation of the least-squares regression line is given by
yhat = b₁x + b₀
a good fit
means that the line drawn appears to describe the relation between two variables well
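the slope and y-intercept of the least-squares regression line
b₁ = r × (sy ÷ sx) is the slope AND b₀ = ybar - b₁xbar is the y-intercept of the least-squares regression line

note: xbar is the sample mean and sx is the sample standard deviation of the explanatory variable x; ybar is the sample mean and sy is the sample standard deviation of the response variable y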
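As an illustration, the slope and intercept formulas can be applied to a small made-up data set. The function name is hypothetical; r is computed here from standardized products so that the block stands on its own.

```python
from statistics import mean, stdev

def least_squares_line(xs, ys):
    """Return (b1, b0) using b1 = r * (sy / sx) and b0 = ybar - b1 * xbar."""
    xbar, ybar = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    n = len(xs)
    r = sum(((x - xbar) / sx) * ((y - ybar) / sy)
            for x, y in zip(xs, ys)) / (n - 1)
    b1 = r * (sy / sx)       # slope
    b0 = ybar - b1 * xbar    # y-intercept
    return b1, b0

# Made-up paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b1, b0 = least_squares_line(xs, ys)
print(round(b1, 4), round(b0, 4))   # 0.6 2.2

# The fitted line passes through (xbar, ybar) = (3, 4).
print(round(b1 * mean(xs) + b0, 4))  # 4.0
```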
in a TI-84 the slope is
a
in a TI-84 the y-intercept is
b
Be careful when using the least-squares regression line to make predictions for values of the explanatory variable that are
much larger or much smaller than those observed
if the linear correlation between two variables is negative
the slope of the regression line will also be negative
the least-squares regression line always passes through
the point (xbar, ybar)
residual =
observed y - predicted y = y - yhat
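A tiny worked example of a residual. The line yhat = 0.6x + 2.2 and the observation (2, 4) are made up for illustration.

```python
def residual(observed_y, predicted_y):
    """Residual = observed y - predicted y."""
    return observed_y - predicted_y

# Hypothetical fitted line yhat = b1 * x + b0 and one observation (x, y).
b1, b0 = 0.6, 2.2
x, y = 2, 4
yhat = b1 * x + b0        # predicted y, about 3.4
res = residual(y, yhat)   # about 0.6: the observation lies above the line
print(round(yhat, 2), round(res, 2))  # 3.4 0.6
```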
coefficient of determination (R²)
measures the proportion of total variation in the response variable that is explained by the least-squares regression line

in other words, the coefficient of determination is a measure of how well the least-squares regression line describes the relation between the explanatory and response variables. The closer R² is to 1, the better the line describes how changes in the explanatory variable affect the value of the response variable.
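A sketch of R² as explained variation divided by total variation, where each variation is a sum of squared deviations. The data are made up, and b1 = 0.6, b0 = 2.2 are the least-squares coefficients for exactly these five points; the ratio is only meaningful as R² when the line is the least-squares line.

```python
from statistics import mean

def r_squared(xs, ys, b1, b0):
    """Proportion of total variation in y explained by yhat = b1 * x + b0."""
    ybar = mean(ys)
    explained = sum((b1 * x + b0 - ybar) ** 2 for x in xs)  # sum (yhat - ybar)^2
    total = sum((y - ybar) ** 2 for y in ys)                # sum (y - ybar)^2
    return explained / total

# Made-up data and its least-squares line.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
R2 = r_squared(xs, ys, 0.6, 2.2)
print(round(R2, 4))  # 0.6
```

Here 60% of the variation in the response variable is explained by the line.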
total deviation
the deviation between the observed and mean values of the response variable is called the total deviation. So total deviation = y - ybar

total deviation = unexplained deviation + explained deviation
explained deviation
The deviation between predicted and mean values of the response variable is called the explained deviation.

the explained deviation = yhat- ybar
unexplained deviation
deviation between observed and predicted values is the unexplained deviation

unexplained deviation = y-yhat

unexplained variation is also found by summing the squares of the residuals
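The three deviations can be checked point by point on a small made-up data set. The coefficients b1 = 0.6 and b0 = 2.2 are the least-squares values for these five points, and the variable names are chosen for illustration only.

```python
from statistics import mean

# Made-up data and its least-squares line yhat = 0.6x + 2.2.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b1, b0 = 0.6, 2.2
ybar = mean(ys)

rows = []
for x, y in zip(xs, ys):
    yhat = b1 * x + b0
    total = y - ybar          # total deviation
    explained = yhat - ybar   # explained deviation
    unexplained = y - yhat    # unexplained deviation (the residual)
    rows.append((total, explained, unexplained))

# Unexplained variation: the sum of the squared residuals.
unexplained_variation = sum(u ** 2 for _, _, u in rows)
print(round(unexplained_variation, 4))  # 2.4
```

For every row, the total deviation equals the explained deviation plus the unexplained deviation, up to floating-point rounding.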
residual plot
a scatter diagram with the residuals on the vertical axis and the explanatory variable on the horizontal axis.
Constant error variance aka homoscedasticity
if a plot of the residuals against the explanatory variable shows the spread of the residuals increasing or decreasing as the explanatory variable increases, then a strict requirement of the linear model (constant error variance) is violated.