26 Cards in this Set

  • Front
  • Back
Define regression
A method for determining the mathematical formula that relates the variables.
Define correlation
A method for determining the strength of the relationship between the variables.
What are applications for regression analysis?
Forecasting
Define cross-sectional data
The observations relate to different people at one point in time.
Define time-series data
Each observation relates to a different point in time.
When high values of one variable are associated with high values of the other, and low with low, what type of correlation is this?
Positive correlation
When high values of one variable are associated with low values of the other, and vice versa, what type of correlation is this?
Negative correlation
Define simple linear regression
The task of finding the values of a and b in y = a + bx that provide the best connection between the two variables.
Please explain the meaning of the constants in the equation:
y = a + bx
a is the intercept on the y axis (the value of y when x = 0)

b is the slope of the line
A positive slope vs a negative slope - what does that mean for correlation?
A positive slope means there is a positive correlation.

A negative slope means there is a negative correlation.
Define residual
The difference between the actual and fitted values for y.
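As a small illustration of this definition, the sketch below computes the residuals for a line y = a + bx. The data points and the values a = 2.2 and b = 0.6 are made up for the example and do not come from the cards.

```python
def residuals(xs, ys, a, b):
    """Residual = actual y minus fitted y, where fitted y = a + b*x."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Illustrative data and coefficients (assumptions, not from the cards).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
res = residuals(xs, ys, a=2.2, b=0.6)
print([round(e, 2) for e in res])
```

A positive residual means the actual point lies above the line; a negative one means it lies below.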
Simple linear regression is the process of
deciding which is the best straight line through a set of points, so that the residuals are as small as possible.
What are the possible approaches to deciding which is the best straight line through a set of points?
1. Choose the line for which the sum of the residuals is the least.
2. Make the sum of the absolute values of the residuals as small as possible.
3. (traditional) The least squares method.
Define the Least Squares method
The method that chooses the line for which the sum of the squared residuals is a minimum.
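For a straight line y = a + bx, the least-squares values of a and b have a standard closed form: b is the sum of the products of the x and y deviations from their means, divided by the sum of the squared x deviations, and a then follows from the two means. The sketch below is a minimal version with made-up data; the function name is hypothetical.

```python
def least_squares_fit(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sxy / sxx             # slope
    a = y_mean - b * x_mean   # intercept: the line passes through the means
    return a, b


# Illustrative data (an assumption, not from the cards).
a, b = least_squares_fit([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(a, 3), round(b, 3))
```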
How is correlation measured?
By calculating the correlation coefficient, r. r can take any value from -1 to +1. Values close to -1 or +1 indicate a strong correlation. Values close to 0 indicate a weak correlation.
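Assuming r here is the usual product-moment (Pearson) coefficient, which the cards do not name explicitly, it can be computed from the same deviation sums used in least squares. The data and the function name below are illustrative only.

```python
import math


def correlation_coefficient(xs, ys):
    """Product-moment correlation coefficient r, between -1 and +1."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Illustrative data (an assumption, not from the cards).
r = correlation_coefficient([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 3))
```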
Define the explained variation
The variation in y that is accounted for by its association with x.
How is explained variation measured?
From the difference between the fitted y value and the average y value.
What is the essence of correlation?
When regression analysis is carried out, the variation in y is split into two parts:
a. a part that is 'explained' by associating the y values with the x values, and
b. a part that is unexplained, since the relationship is an approximate one and there are residuals.
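The split described above can be checked numerically. The sketch below assumes the usual convention of measuring each kind of variation as a sum of squared differences: explained from fitted y minus average y, unexplained from the residuals. For a least-squares line the two parts add up to the total, and explained divided by total equals r squared. The data and the function name are made up for the example.

```python
def variation_split(xs, ys):
    """Return (total, explained, unexplained) variation in y
    for the least-squares line through the points."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    b = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))
    a = y_mean - b * x_mean
    fitted = [a + b * x for x in xs]
    total = sum((y - y_mean) ** 2 for y in ys)
    explained = sum((f - y_mean) ** 2 for f in fitted)
    unexplained = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return total, explained, unexplained


# Illustrative data (an assumption, not from the cards).
total, explained, unexplained = variation_split([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(total, 3), round(explained, 3), round(unexplained, 3))
print(round(explained / total, 3))  # proportion of variation explained
```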
What are the general limits for correlation?
r >= 0.75: highly satisfactory
r from 0.5 to 0.75: adequate
r < 0.5: serious doubts
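These bands can be written as a small lookup. The sketch assumes the limits apply to the size of r regardless of sign, so that a strong negative correlation is rated like a strong positive one; the cards do not say this explicitly. The function name is hypothetical.

```python
def rate_correlation(r):
    """Map a correlation coefficient to the rule-of-thumb bands above."""
    strength = abs(r)  # assumption: only the magnitude of r matters
    if strength >= 0.75:
        return "highly satisfactory"
    if strength >= 0.5:
        return "adequate"
    return "serious doubts"


print(rate_correlation(0.8))
print(rate_correlation(0.6))
print(rate_correlation(0.3))
```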
What further tests must be made to confirm that the linear equation sufficiently describes the relationship between the variables?
1. a visual check of the randomness of the residuals, as plotted on a graph of the linear equation.
2. a scatter diagram of the residuals against the fitted y values
List the 4 steps in regression and correlation.
1. inspecting the scatter diagram
2. calculating the regression coefficients
3. calculating the correlation coefficients
4. checking the residuals for randomness
What is serial correlation in the residuals?
A pattern in which successive residuals are related to one another. It occurs particularly in time-series data, where there may be some time-related cycle.
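One simple way to look for serial correlation, offered here as an illustration rather than as the cards' own method, is to correlate each residual with the one before it (the lag-1 autocorrelation). Values well away from 0 suggest the residuals are not random. Both residual series below are invented for the example.

```python
def lag1_autocorrelation(residuals):
    """Correlation between each residual and the one before it."""
    n = len(residuals)
    mean = sum(residuals) / n
    num = sum((residuals[t] - mean) * (residuals[t - 1] - mean)
              for t in range(1, n))
    den = sum((e - mean) ** 2 for e in residuals)
    return num / den


# Invented residuals: a slow cycle, and a strict up/down alternation.
cyclic = [1.0, 0.8, 0.3, -0.4, -0.9, -1.0, -0.6, 0.1, 0.7, 1.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(round(lag1_autocorrelation(cyclic), 3))       # clearly positive
print(round(lag1_autocorrelation(alternating), 3))  # clearly negative
```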
Define heteroscedasticity
A tendency for residuals to vary in size at different parts of the line. It is likely to occur in cross-sectional data when the size of the residuals is related to the x value.
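A crude way to see this in numbers, not a formal test, is to compare the typical size of the residuals in the lower half of the x values with that in the upper half. The data and the function name below are made up for illustration.

```python
def residual_spread_by_half(xs, residuals):
    """Mean absolute residual for the lower-x half and the upper-x half."""
    pairs = sorted(zip(xs, residuals))  # order the points by x value
    half = len(pairs) // 2
    lower = [abs(e) for _, e in pairs[:half]]
    upper = [abs(e) for _, e in pairs[half:]]
    return sum(lower) / len(lower), sum(upper) / len(upper)


# Invented residuals that grow in size as x increases.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
res = [0.1, -0.2, 0.1, -0.1, 0.9, -1.2, 1.5, -1.4]
low, high = residual_spread_by_half(xs, res)
print(round(low, 3), round(high, 3))
```

A large gap between the two figures suggests the residuals are not the same size along the whole line.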
Explain reservations about Regression and Correlation
1. Causality - the largest source of confusion and error. While association can be determined by analysis, causality cannot be confirmed.
2. Spurious regressions - the correlation coefficient is high, but there is no real relationship. This is an error in setting up the model.
3. Extrapolation (using the equation outside the range of the data) - should be avoided. It is done in forecasting, but one must be wary.
4. Fitting a single regression line to a set of data when perhaps two lines are more appropriate.
5. Least-squares criterion can be misleading by being too precise.
6. Least-squares has been applied to regressions of y on x.
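The extrapolation reservation in item 3 can be made concrete with a prediction helper that flags any x value outside the range of the data used to fit the line. This is only a sketch; the function name, the data, and the coefficients are hypothetical.

```python
def predict_with_range_check(x_new, xs, a, b):
    """Return (prediction, is_extrapolation) for the line y = a + b*x.

    is_extrapolation is True when x_new lies outside the x range
    of the data the line was fitted to.
    """
    is_extrapolation = x_new < min(xs) or x_new > max(xs)
    return a + b * x_new, is_extrapolation


# Illustrative data range and coefficients (assumptions).
xs = [1, 2, 3, 4, 5]
inside = predict_with_range_check(3, xs, a=2.2, b=0.6)
outside = predict_with_range_check(10, xs, a=2.2, b=0.6)
print(inside[1], outside[1])
```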
The term "regression analysis" is used ...
to include both regression and correlation
Forecasting is based on ...
regression analysis