Quantitative Methods - CFA Level II CLONED

by juniperkim, Jan. 2014

Subjects: methods quantitative

50 Cards in this Set

Front	Back	3rd side (hint)

Sample Covariance	the average value of the product of the deviations of observations on two random variables from their sample means
Sample Correlation	a measure of linear association; how closely related two data series are. r = +1: perfect positive correlation; r = 0: no linear correlation; r = -1: perfect negative correlation
Limitations of Correlation Analysis	a. Outliers = small numbers of observations at either extreme (small or large) b. Spurious correlation – correlation between two variables that reflects chance relationship; correlation induced by a calculation; correlation between two variables arising from their relation to a third variable
Dependent Variable	the "Y" in linear regression "the variable you are seeking to explain"
Independent Variable	the "X" in linear regression "the variable you are using to explain changes in the dependent variable"
Assumptions of Linear Regression	a. The relationship between X and Y is linear in the parameters b0 and b1 (each raised only to the first power) b. The independent variable, X, is not random c. The expected value of the error term, E, is zero d. The variance of the error term is the same for all observations e. The error term, E, is uncorrelated across observations f. The error term, E, is normally distributed
Standard Error of Estimate (SEE)	measures how well a linear regression model captures the relationship between the dependent and independent variables; indicates how certain we can be about a particular prediction of Y using the regression equation (the smaller the SEE, the better the fit)
Coefficient of determination	how well the independent variable explains the variation in the dependent variable; the fraction of the total variation in the dependent variable that is explained by the independent variable. R^2 = explained variation / total variation; with one independent variable (k = 1), R^2 equals the square of the sample correlation between X and Y
Confidence Interval	an interval of values that is believed to include the true parameter value with a given degree of confidence
Type 1 Error	rejecting the null hypothesis when, in fact, it is true
Type 2 error	failing to reject the null hypothesis when, in fact, it is false
Analysis of Variance (ANOVA) in regression analysis	a statistical procedure for dividing the total variability of a variable into components that can be attributed to different sources; in regression, it is used to determine the usefulness of the independent variable or variables in explaining variation in the dependent variable
Limitations of regression analysis	a. Regression relations can change over time (just like correlations) = called parameter instability b. Public knowledge of these regression relationships may negate future usefulness in the market c. If regression assumptions are violated, predictions based on linear regression may not be valid
Assumptions of multiple regression model	a. Linear relationship between the dependent variable, Y, and the independent variables X1, X2, ..., Xn b. Independent variables are not random c. The expected value of the error term is 0 d. The error term is uncorrelated across observations e. The error term is normally distributed
F-Test interpretation	tests the regression's overall significance
R squared vs. Adjusted R squared	R squared = the goodness of fit of the model; Adjusted R squared = needed with multiple independent variables, because it doesn't automatically increase upon the addition of another variable	Adjusted R squared = 1 - [(n-1)/(n-k-1)] * (1 - R squared), where n = number of observations and k = number of independent variables
Heteroskedasticity definition	when the assumption that the variance of the errors is constant is violated (the errors are homoskedastic when the assumption holds)	a plot of data is heteroskedastic if the spread of the observations around the line of fit changes across observations (for example, widening as X increases), versus a uniformly close fit to the line with homoskedasticity
Impact of Heteroskedasticity	will result in unreliable standard errors and therefore unreliable computed t-tests; often standard errors will be understated, resulting in inflated t-stats and suggesting significance when in fact there isn't any	Test with the Breusch-Pagan test: n * R squared (from regressing the squared residuals on the independent variables) compared to chi squared at the given significance level, with df equal to the number of independent variables. Correct with robust standard errors or generalized least squares
Serial Correlation (autocorrelated)	when regression errors are correlated across observations; positive serial correlation is when a positive error for one observation increases the chance of a positive error for another observation	Test for serial correlation with the Durbin-Watson test
Interpretation of DW (Durbin-Watson) test	If the regression has no serial correlation, DW stat = 2. If the regression residuals are positively serially correlated, DW is less than 2. If negatively serially correlated, then DW > 2. Inconclusive if the DW stat lies between dl and du	DW stat is approximately 2(1-r), where r is the sample correlation between consecutive residuals. Correct for serial correlation by adjusting the coefficient standard errors via Hansen's method (does not remove it completely, but diminishes its impact)
Multicollinearity definition	occurs when two or more independent variables are highly, but not perfectly correlated with each other making the interpretation of the regression output problematic – regression coefficients become imprecise and unreliable – cannot distinguish the individual impacts of the independent variables on the dependent variable
Detecting Multicollinearity	A high R squared and a significant F-stat when t-stats are NOT significant is an indication of multicollinearity	Correct by excluding one or more of the regression (independent) variables
Model specification	refers to the set of variables included in the regression and the regression equation’s functional form
Three types of model misspecification	1) misspecified functional form 2) regressors that are correlated with the error term 3) time-series misspecification: nonstationarity, random walks
Impact of misspecification	All misspecifications invalidate statistical inference, causing regression coefficients to be inconsistent
Time-Series misspecification: non-stationarity	when a variable’s properties (mean and variance) are not constant through time
Time-Series misspecification: random walks	time series for which the best predictor of next period’s value is this period’s value (when b1=1)
Probit model	is based on a normal distribution, estimating the probability that Y = 1	probit and logit models estimate the probability of a discrete outcome given the values of the independent variables used to explain the outcome
Logit model	is based on the logistic distribution	probit and logit models estimate the probability of a discrete outcome given the values of the independent variables used to explain the outcome
Functional form misspecification	omitting an important variable, may need to transform a variable, pooling data from samples that should not be pooled (can see this graphically)
Regressors that are correlated with the error term - misspecification	including a lagged dependent variable as an independent variable, including a function of a dependent variable as an independent variable, or independent variables measured with error
Limitation of trend models - time series, linear or log-linear	Regression error for one period must be uncorrelated with the regression error for all other periods; a violation of this assumption is serial correlation, which is the main limitation of trend models in time-series analysis
Covariance Stationary	a key assumption of time-series models; states that properties like the mean and variance do not change over time
Requirements for a time series to be covariance stationary	a. The expected value of the time series must be constant and finite in all periods b. The variance of the time series must be constant and finite in all periods c. The covariance of the time series with itself for a fixed number of periods in the past or future must be constant and finite in all periods	If a time series is not covariance stationary, the estimation results will have NO ECONOMIC MEANING
Autoregressive model (AR)	a time series regressed on its own past values; for one lag, AR(1): Xt = b0 + b1*Xt-1 + e
How autocorrelation of the residuals can test whether an AR model fits the time series	a. Can test whether the time-series model is correct by testing whether the autocorrelations of the error term differ significantly from zero (are the t-stats of the residual autocorrelations significant?) b. If they do differ from zero, the model is not specified correctly; the sample autocorrelations of the residuals and their standard error are used to estimate and test the error autocorrelation c. Standard error of each residual autocorrelation = 1/square root of T (T is the # of observations in the time series)
Mean Reversion	a time series shows mean reversion if it tends to fall when its level is above its mean and rise when its level is below its mean	Mean-reverting level for an AR(1) model: Xt = b0 / (1 - b1)
In-sample forecast errors	are the residuals from a fitted time-series model (within the time-frame specified)
Out-of-sample forecast errors	are the differences between the actual and predicted values – gives a sense for how well it will forecast in the future
Root mean squared error (RMSE)	the square root of the average squared error; the smaller the RMSE, the more accurate the model's forecasts
Why coefficients of time-series models are unstable	a. The sample period used is crucial for appropriate statistical inference and forecasting accuracy b. Models are only valid if the time series is covariance stationary
Random Walk Process	a. A random walk is a time series in which the value of the series in one period is the value of the series in the previous period plus an unpredictable random error i. Xt = Xt-1 + E (because b0 = 0 and b1 = 1) ii. A random walk with drift is when b1 = 1 and b0 is not equal to 0
Random Walk "Cure"	A random walk has an undefined mean-reverting level and no upper bound on its variance (it grows with t), so it has no finite variance and is not covariance stationary. This means you cannot use standard regression analysis with a random walk; instead, convert the data to a covariance stationary time series by first differencing (yt = xt - xt-1)
Unit Root	If b1 = 1, the time series has a unit root; it is then by definition a random walk and therefore not covariance stationary
Impact of a Unit Root	The presence of a unit root makes standard t-stats unreliable; instead use the Dickey-Fuller test (subtract xt-1 from each side of the AR(1) equation and test whether the coefficient on xt-1, which equals b1 - 1, is zero). If a unit root is present, first difference the series and reevaluate until the model is properly specified
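Dickey & Fuller Test	tests AR models for unit roots: subtract xt-1 from each side, giving xt - xt-1 = b0 + (b1 - 1)*xt-1 + e, and test whether (b1 - 1) = 0; if a unit root cannot be rejected, first difference the series and re-test until the t-stats of the residual autocorrelations are no longer significant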
How to test and adjust for seasonality in time-series	Adjust the equation to factor in a seasonal component (the lag, t, whose residual autocorrelation is significant when the others are not within the AR model), and re-estimate until the t-stats of the residual autocorrelations are no longer significant
Autoregressive conditional heteroskedasticity (ARCH)	the presence of heteroskedasticity within a time series, where the variance of the error term is NOT constant and depends on the size of the errors in previous periods; this will cause the standard errors to be unreliable, and therefore t-stats may indicate significance when in fact there is none. TO CORRECT: use generalized least squares, or adjust the time period used	detect by regressing the squared residuals on their own lagged values; a statistically significant slope coefficient indicates ARCH
Analysis of time-series variables prior to use in a linear regression	First test each of the time series for a unit root (Dickey-Fuller test): i. If neither has a unit root, can "safely" use a linear regression ii. If only one of the two time series has a unit root, should not use linear regression iii. If both have a unit root and the time series are cointegrated, can use linear regression; if not cointegrated, should not use linear regression. A Dickey-Fuller test on the regression residuals (the Engle-Granger approach) is used to determine cointegration
Cointegration definition	two time series are cointegrated if a long-term financial or economic relationship exists between them such that they do not diverge from each other without bound in the long run; they are cointegrated if they share a common trend
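
Several of the cards above state formulas that are easy to check numerically: sample covariance and correlation, adjusted R squared, the Durbin-Watson statistic, RMSE, the AR(1) mean-reverting level, and first differencing. The following is a minimal sketch in plain Python, not part of the original deck. The function names are hypothetical, the covariance assumes the usual n - 1 divisor for a sample, and the Durbin-Watson function assumes the standard definition (sum of squared changes in the residuals divided by the sum of squared residuals).

```python
import math


def sample_covariance(x, y):
    """Sum of products of deviations from the sample means, divided by n - 1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def sample_correlation(x, y):
    """Covariance scaled by both standard deviations; lies between -1 and +1."""
    var_x = sample_covariance(x, x)
    var_y = sample_covariance(y, y)
    return sample_covariance(x, y) / math.sqrt(var_x * var_y)


def adjusted_r_squared(r_squared, n, k):
    """1 - [(n-1)/(n-k-1)] * (1 - R^2), with n observations and k regressors."""
    return 1 - (n - 1) / (n - k - 1) * (1 - r_squared)


def durbin_watson(residuals):
    """Near 2: no serial correlation; below 2: positive; above 2: negative."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def rmse(actual, predicted):
    """Square root of the average squared forecast error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mean_reverting_level(b0, b1):
    """AR(1) mean-reverting level b0 / (1 - b1); undefined when b1 = 1."""
    if b1 == 1:
        raise ValueError("b1 = 1 is a unit root: no mean-reverting level")
    return b0 / (1 - b1)


def first_difference(x):
    """y_t = x_t - x_(t-1); the usual transformation for a random walk."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]


if __name__ == "__main__":
    x = [1, 2, 3, 4]
    y = [2, 4, 6, 8]
    print(sample_covariance(x, y), sample_correlation(x, y))
    print(adjusted_r_squared(0.8, n=21, k=4))
    print(durbin_watson([1, -1, 1, -1]))
    print(mean_reverting_level(1.0, 0.5))
    print(first_difference([1, 3, 6, 10]))
```

Residuals that alternate in sign, as in the Durbin-Watson call above, give a statistic above 2, which matches the card's reading of negative serial correlation. In practice a statistics library would normally be used for these calculations; the sketch only makes the arithmetic on the cards concrete.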