28 Cards in this Set

  • Front
  • Back

what is a linear regression line

the line that summarizes a scatterplot by, on average, passing through the center of the Y scores at each X



-- the best fitting straight line

what is the linear regression procedure used for

to predict Y scores based on the scores from a correlated X

what is Y' and how do you obtain it

Y' = Y prime = the predicted Y score for a given X; computed from the regression equation

general form of a linear regression equation

Y' = bX + a
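
As a rough illustration of the equation above, the Python sketch below fits b and a to a small made-up data set and then computes Y' for one X. The cards do not say how b and a are obtained; the usual least-squares formulas are assumed here, and the names `fit_line` and `predict` and the sample scores are hypothetical.

```python
def fit_line(xs, ys):
    """Return (b, a) for Y' = bX + a, assuming the usual least-squares formulas."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sum of squared X deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # Y intercept
    return b, a


def predict(x, b, a):
    """Return Y', the predicted Y score for a given X."""
    return b * x + a


# Made-up scores, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, a = fit_line(xs, ys)
y_prime = predict(3, b, a)
print(f"Y' = {b:.2f}X + {a:.2f}; predicted Y at X = 3 is {y_prime:.2f}")
```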

what does the Y intercept indicate

the value of Y' at the point where the regression line crosses the Y axis (that is, when X = 0)

what does the slope indicate

the direction and degree that the regression line is slanted

distinguish between the PREDICTOR variable and the CRITERION variable in linear regression

X = predictor variable = the given X variable



Y = criterion variable = the to-be-predicted Y variable

what is Sy'


the standard error of the estimate


what does Sy' tell you about the spread in the Y scores


it is a standard deviation, indicating the "average" amount that the Y scores deviate from their corresponding values of Y'

why does Sy' tell you about your errors in prediction

indicates the average amount that the actual scores differ from the predicted Y' scores, so it is the "average" error
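
A minimal sketch of how Sy' could be computed, assuming it is the square root of the average squared difference between Y and Y'. Dividing by N (a descriptive version) is an assumption; some texts divide by N - 2 when estimating a population value. The function names and scores are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope b and intercept a for Y' = bX + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return b, mean_y - b * mean_x


def standard_error_of_estimate(xs, ys):
    """Square root of the mean squared error in prediction, (Y - Y')^2."""
    b, a = fit_line(xs, ys)
    squared_errors = [(y - (b * x + a)) ** 2 for x, y in zip(xs, ys)]
    # Assumption: divide by N, treating the scores as the whole data set.
    return math.sqrt(sum(squared_errors) / len(ys))


# Made-up scores, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

sy_prime = standard_error_of_estimate(xs, ys)
print(f"Sy' = {sy_prime:.3f}")
```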

in order for the standard error of the estimate to be accurate, there are 2 assumptions:

assumes Y scores are:



1) homoscedastic -- scores are equally spread out around the regression line at each X



2) normally distributed -- forming a normal distribution around the regression line at each X

how does heteroscedasticity lead to an inaccurate description of the data?

heteroscedasticity means Y scores are not spread out from Y' to the same extent at all Xs, so the standard error of the estimate will over- or under-estimate the error, depending on the value of X
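
The sketch below illustrates this with made-up scores whose spread around the line grows as X increases. It compares the single overall Sy' with the spread of the errors at each X separately. The data and the helper names are hypothetical, and dividing by N is an assumption.

```python
import math
from collections import defaultdict


def fit_line(xs, ys):
    """Least-squares slope b and intercept a for Y' = bX + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return b, mean_y - b * mean_x


def rms(values):
    """Root mean square: the 'average' size of a list of errors."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Made-up heteroscedastic scores: Y is more spread out at larger X.
xs = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
ys = [1.9, 2.0, 2.1, 3.5, 4.0, 4.5, 5.0, 6.0, 7.0, 6.0, 8.0, 10.0]

b, a = fit_line(xs, ys)

# Group the prediction errors (Y - Y') by their X value.
errors_by_x = defaultdict(list)
for x, y in zip(xs, ys):
    errors_by_x[x].append(y - (b * x + a))

overall = rms([e for errors in errors_by_x.values() for e in errors])
spread_at_x = {x: rms(errors) for x, errors in errors_by_x.items()}

print(f"overall Sy' = {overall:.3f}")
for x in sorted(spread_at_x):
    print(f"spread of errors at X = {x}: {spread_at_x[x]:.3f}")
```

With these scores the overall Sy' is larger than the actual spread at X = 1 and smaller than the actual spread at X = 4, so one number misdescribes the error at both ends.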

how is the value of Sy' related to the size of r

Sy' is inversely related to the absolute value of r, because a smaller Sy' indicates that the Y scores are closer to the regression line (and Y') at each X, which is what happens with a stronger relationship (a larger absolute r)
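
One way to see the inverse relationship is the identity Sy' = Sy * sqrt(1 - r^2), which holds when Sy and Sy' are both computed with N in the denominator. The identity is not stated on the cards; the sketch below checks it numerically on two made-up data sets, one with a stronger r than the other. All names and scores are hypothetical.

```python
import math


def describe(xs, ys):
    """Return (r, Sy, Sy') for the least-squares line through the scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    b = sxy / sxx
    a = mean_y - b * mean_x
    sy = math.sqrt(syy / n)  # standard deviation of Y (N denominator)
    sy_prime = math.sqrt(sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys)) / n)
    return r, sy, sy_prime


# Made-up scores, for illustration only.
xs = [1, 2, 3, 4, 5]
stronger_ys = [2, 4, 5, 4, 5]
weaker_ys = [2, 5, 3, 5, 5]

r_strong, sy_strong, sy_prime_strong = describe(xs, stronger_ys)
r_weak, sy_weak, sy_prime_weak = describe(xs, weaker_ys)

# Sy' rebuilt from Sy and r, to compare with the directly computed value.
from_formula = sy_strong * math.sqrt(1 - r_strong ** 2)

print(f"stronger r = {r_strong:.3f}, Sy' = {sy_prime_strong:.3f}")
print(f"weaker   r = {r_weak:.3f}, Sy' = {sy_prime_weak:.3f}")
```

Under the same identity, r = 0 gives Sy' = Sy, so prediction from X is then no more accurate than predicting the mean of Y.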

when are multiple regression procedures used

when more than one predictor variable (X variable) is correlated with and used to predict the scores on one criterion variable (Y variable)
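
A minimal sketch of the two-predictor case, assuming the prediction takes the form Y' = b1*X1 + b2*X2 + a and that the weights are found by least squares. The cards give no formula for multiple regression, so the approach, the function name `fit_two_predictors`, and the scores are all assumptions made for illustration.

```python
def fit_two_predictors(x1s, x2s, ys):
    """Least-squares b1, b2, a for Y' = b1*X1 + b2*X2 + a."""
    n = len(ys)
    m1 = sum(x1s) / n
    m2 = sum(x2s) / n
    my = sum(ys) / n
    # Sums of squares and cross-products of the deviation scores.
    s11 = sum((u - m1) ** 2 for u in x1s)
    s22 = sum((v - m2) ** 2 for v in x2s)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1s, x2s))
    s1y = sum((u - m1) * (y - my) for u, y in zip(x1s, ys))
    s2y = sum((v - m2) * (y - my) for v, y in zip(x2s, ys))
    # Solve the two normal equations; det is zero if X1 and X2 are
    # perfectly correlated, in which case no unique solution exists.
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    a = my - b1 * m1 - b2 * m2
    return b1, b2, a


# Made-up scores built so that Y = 2*X1 + 3*X2 + 1 exactly.
x1s = [1, 2, 3, 4]
x2s = [2, 1, 4, 3]
ys = [9, 8, 19, 18]

b1, b2, a = fit_two_predictors(x1s, x2s, ys)
print(f"Y' = {b1:.2f}*X1 + {b2:.2f}*X2 + {a:.2f}")
```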

what are 2 statistical names for r^2

1) the coefficient of determination



2) the proportion of variance in Y that is accounted for by the relationship with X

how do you interpret r^2

indicates the proportional improvement in accuracy when using the relationship with X to predict Y scores, compared to using the overall mean of Y to predict Y scores
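
To make the "proportional improvement" concrete, the sketch below computes the average squared error when predicting the mean of Y for everyone, the average squared error when predicting Y', and the proportion by which the error shrinks. With a least-squares line this proportion matches r^2. The scores and variable names are hypothetical.

```python
import math

# Made-up scores, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Least-squares line and correlation coefficient.
b = sxy / sxx
a = mean_y - b * mean_x
r = sxy / math.sqrt(sxx * syy)

# Average squared error without the relationship: predict mean of Y.
error_using_mean = syy / n
# Average squared error with the relationship: predict Y'.
error_using_line = sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys)) / n

improvement = (error_using_mean - error_using_line) / error_using_mean
r_squared = r ** 2

print(f"proportional improvement = {improvement:.3f}, r^2 = {r_squared:.3f}")
```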

regression analysis is used when:

we have 2 variables and they are correlated



uses the correlation between 2 variables to predict unknown or future scores

the mean is the best predictor when...

we only have ONE variable and we want to predict an unknown score on that variable



-- average amount of error in prediction = the standard deviation of that variable
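
A small sketch of this idea, assuming "average error" means the square root of the mean squared error and that the standard deviation uses N in the denominator. It compares guessing the mean for every score with guessing a different constant. The scores and the name `rms_error` are hypothetical.

```python
import math


def rms_error(scores, guess):
    """'Average' error when the same guess is used for every score."""
    return math.sqrt(sum((s - guess) ** 2 for s in scores) / len(scores))


# Made-up scores on a single variable, for illustration only.
scores = [2, 4, 5, 4, 5]
mean = sum(scores) / len(scores)

# Guessing the mean: the average error equals the standard deviation.
error_using_mean = rms_error(scores, mean)
# Guessing any other constant gives a larger average error.
error_using_other = rms_error(scores, mean + 1)

print(f"error using the mean: {error_using_mean:.3f}")
print(f"error using mean + 1: {error_using_other:.3f}")
```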

accuracy of prediction ENTIRELY depends on

the correlation of your two variables



the closer r is to +1 or -1, the greater the accuracy of prediction



error in prediction

the distance from each real data point (y) to the predicted value (y')



best fitting regression line = smallest amount of total error
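
The sketch below illustrates "smallest amount of total error", assuming total error is measured as the sum of squared (Y - Y') values, which is the usual least-squares criterion but is not spelled out on the card. It compares the fitted line with a few nearby lines. The scores and names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope b and intercept a for Y' = bX + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return b, mean_y - b * mean_x


def total_squared_error(xs, ys, b, a):
    """Sum of squared distances from each Y to its predicted Y'."""
    return sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))


# Made-up scores, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, a = fit_line(xs, ys)
best = total_squared_error(xs, ys, b, a)

# Nudge the slope and/or intercept and recompute the total error.
alternatives = [
    total_squared_error(xs, ys, b + db, a + da)
    for db in (-0.2, 0.0, 0.2)
    for da in (-0.5, 0.0, 0.5)
    if (db, da) != (0.0, 0.0)
]

print(f"fitted line error: {best:.2f}")
print(f"smallest error among nudged lines: {min(alternatives):.2f}")
```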

the stronger the r (correlation), the _______ the error in prediction

smaller

2 elements describe the regression line

1) slope


2) y-intercept

if r = 0, your standard error of estimate = ?

the standard deviation of the Y scores (Sy)


(prediction is then no better than using the mean of Y)

assumptions of linear regression


(when it's OK to perform linear regression)

1) linearity -- there is a linear relationship between variables



2) homoscedasticity -- occurs when Y scores are spread out to the same degree at every X



3) Y scores at each X form an approximately normal distribution

when we do NOT use the relationship to predict scores

--use the overall mean of Y scores as everyone's predicted Y



--error is Y - (mean of Y)


average squared error is Sy^2 (the variance of Y)

when we DO use the relationship to predict scores

--use the corresponding Y', as determined by the linear regression equation, as our predicted Y value



--error is (Y-Y')



average squared error is Sy'^2 (the variance of Y around Y')

proportion of variance accounted for is....

the proportional improvement in accuracy when using the relationship with X to predict Y, compared to using (mean of Y) to predict Y

how is proportion of variance accounted for computed

find r^2


"the coefficient of determination"