91 Cards in this Set
- Front
- Back
Special case: when the regressor is binary, i.e., it can take only one of two values
Where Di is a dummy variable: Di = 1 if STR < 20, Di = 0 if STR ≥ 20 (see the sketch below) |
Regression when X is a binary variable
|
|
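A minimal sketch of the binary-regressor case, assuming Python with numpy and statsmodels and simulated (hypothetical) data: the slope estimate is the difference in group means.

```python
# Sketch: regression on a binary regressor D (D = 1 if STR < 20),
# with simulated data; beta_1 estimates the difference in group means.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
str_ratio = rng.uniform(15, 25, 200)           # hypothetical STR values
d = (str_ratio < 20).astype(float)             # dummy regressor
score = 650 + 7 * d + rng.normal(0, 10, 200)   # hypothetical test scores

res = sm.OLS(score, sm.add_constant(d)).fit()
print(res.params)  # ~[650, 7]: intercept = mean of D=0 group, slope = gap
```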
Binary variables are also known as...
|
Indicator variables or dummy variables
|
|
Omitting variables generates a bias in OLS estimates if...
|
1) The omitted variable is a determinant of the dependent variable
2) The omitted variable is correlated with the included regressor |
|
The estimators of β0, β1, ..., βk that minimize the sum of squared prediction mistakes are called...
|
Ordinary Least Squares
|
|
Ideal randomized controlled experiments in economics are
|
Useful because they give a definition of a causal effect
|
|
When the estimated slope coefficient in the simple regression model, β1, is zero,
|
R2 =0
|
|
Which of the following statements is true?
|
TSS = ESS + SSR (verified numerically in the sketch below)
|
|
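A numerical check of the identity above, a sketch assuming simulated data and Python with numpy/statsmodels:

```python
# Sketch: verify TSS = ESS + SSR on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 2 + 3 * x + rng.normal(size=100)
res = sm.OLS(y, sm.add_constant(x)).fit()

tss = np.sum((y - y.mean()) ** 2)                 # total sum of squares
ess = np.sum((res.fittedvalues - y.mean()) ** 2)  # explained sum of squares
ssr = np.sum(res.resid ** 2)                      # sum of squared residuals
print(np.isclose(tss, ess + ssr))                 # True
```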
In the simple linear regression model, the regression slope
|
Indicates by how many units Y increases, given a one unit increase in X
|
|
The OLS estimator is derived by
|
Minimizing the sum of squared residuals
|
|
To obtain the slope estimator using the least squares principle, you divide the
|
Sample covariance of X and Y by the sample variance of X (see the sketch below)
|
|
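A sketch of the slope formula, checked against the true coefficients (simulated data; numpy assumed):

```python
# Sketch: beta_1_hat = sample cov(X, Y) / sample var(X).
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 1 + 2 * x + rng.normal(size=100)

beta1 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)  # both use the n-1 convention
beta0 = y.mean() - beta1 * x.mean()
print(beta0, beta1)                             # close to (1, 2)
```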
Interpreting the intercept in a sample regression function is
|
Reasonable if your sample contains values of X around the origin
|
|
In the simple linear regression model, Yi = β0 + β1Xi + ui,
|
β0 + β1Xi represents the population regression function
|
|
The t-statistic is calculated by
|
Dividing the estimator minus its hypothesized value by the standard error of the estimator (see the sketch below)
|
|
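A sketch of the t-statistic computed by hand and compared with the packaged value (simulated data; statsmodels assumed):

```python
# Sketch: t = (estimate - hypothesized value) / SE(estimate), for H0: beta_1 = 0.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.normal(size=100)
y = 0.5 + 1.5 * x + rng.normal(size=100)
res = sm.OLS(y, sm.add_constant(x)).fit()

t_by_hand = (res.params[1] - 0) / res.bse[1]
print(t_by_hand, res.tvalues[1])   # identical
```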
The construction of the t-statistic for a one-sided and a two-sided hypothesis
|
is the same
|
|
The 95% confidence interval for β0 is the interval that
|
Contains the true value of β0 in 95% of all possible randomly drawn samples
|
|
The OLS residuals, ûi, are defined as follows
|
ûi = Yi - Ŷi
(where Ŷi is the OLS predicted value) |
|
All of the following are desirable properties of the OLS estimators except
|
The OLS estimators are not efficient
|
|
To test the hypothesis that the intercept coefficient is zero against the alternative that it is positive at the 1% level, one has to compare the t-statistic against
|
The critical value of 2.33
|
|
Occurs when two conditions hold:
1) The omitted variable is a determinant of the dependent variable
2) The omitted variable is correlated with the included regressor
Note: OVB can occur in regressions with a low R2, a moderate R2, or a high R2; conversely, a low R2 does not imply that there necessarily is OVB (p. 191) |
Omitted Variable Bias
|
|
1) u has a positive effect on Y and ρXu > 0: positive bias (overestimate the true effect)
2) u has a positive effect on Y and ρXu < 0: negative bias (underestimate the relationship between Y and X) 3) u has a negative effect on Y and ρXu > 0: negative bias (underestimate the true effect of X on Y) 4) u has a negative effect on Y and ρXu < 0: positive bias (overestimate the true effect) |
The 4 possible cases of omitted variable bias (direction of the bias)
|
|
1) Estimate the regression for subsets of districts with similar fractions of English learners
- Divide into quartiles or deciles and run the regression for each subset 2) Include the % of English learners as an additional regressor in the model |
How do we hold other things constant?
|
|
Binary Variables...
|
Can take on only two values
|
|
Most economic data are obtained
|
By observing real-world data
|
|
Studying inflation in the US from 1970-2006 is an example of using
|
Time series data
|
|
In the simple linear regression model, the regression slope
|
Indicates by how many units Y increases, given a one unit increase in X
|
|
The regression R2 is a measure of
|
Goodness of fit of your regression line
|
|
to obtain the slope estimator using the least squares principle, you divide
|
Sample covariance of X and Y by the sample variance of X
|
|
In the linear regression model, Yi = β0 + β1Xi + ui, β0 + β1Xi is referred to as
|
The population regression function
|
|
If the absolute value of your calculated t-statistic exceeds the critical value from the standard normal distribution, you can...
|
Reject the null hypothesis
|
|
The error term is homoskedastic if
|
var(ui | Xi = x) is constant for i = 1, ..., n
|
|
Imagine you regressed earnings of individuals on a constant, a binary variable ("Male") which takes on the value 1 for males and is 0 otherwise, and another binary variable ("Female") which takes on the value 1 for females and is 0 otherwise. Because females typically earn less than males, you would expect
|
None of the OLS estimators to exist because there is perfect multicollinearity
|
|
When there are omitted variables in the regression, which are determinants of the dependent variable, then...
|
The OLS estimator is biased if the omitted variable is correlated with the included variable
|
|
For a single restriction (q = 1), the F-statistic...
|
Is the square of the t-statistic (see the check below)
|
|
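A quick numerical check of the q = 1 case (simulated data; with plain arrays statsmodels names the slope "x1"):

```python
# Sketch: for a single restriction, F = t^2.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.normal(size=100)
y = 1 + 0.8 * x + rng.normal(size=100)
res = sm.OLS(y, sm.add_constant(x)).fit()

f_res = res.f_test("x1 = 0")              # one restriction (q = 1)
print(f_res.fvalue, res.tvalues[1] ** 2)  # equal
```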
All of the following are examples of joint hypotheses on multiple regression coefficients, with the exception of...
|
H0: B1+B2=1
|
|
3 Goodness of fit measures
|
1) SER (standard error of the regression): SER = s_û, where s_û^2 = SSR/(n - k - 1). Estimates the spread of the distribution of Yi around the regression line
2) R2 = ESS/TSS = 1 - SSR/TSS. Estimates the fraction of the sample variance of Yi that is explained by the regressors 3) Adjusted R2 = 1 - [(n - 1)/(n - k - 1)] × SSR/TSS. Estimates the fraction of the variance of Yi explained by the regressors, correcting by a factor that accounts for the number of regressors included (see the sketch below) |
|
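A sketch computing the three fit measures from their formulas and comparing with the packaged values (simulated data; k is the number of regressors excluding the constant):

```python
# Sketch: SER, R2, and adjusted R2 from SSR and TSS.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n, k = 100, 2
X = rng.normal(size=(n, k))
y = 1 + X @ np.array([2.0, -1.0]) + rng.normal(size=n)
res = sm.OLS(y, sm.add_constant(X)).fit()

ssr = np.sum(res.resid ** 2)
tss = np.sum((y - y.mean()) ** 2)
ser = np.sqrt(ssr / (n - k - 1))                   # standard error of regression
r2 = 1 - ssr / tss
adj_r2 = 1 - (n - 1) / (n - k - 1) * ssr / tss
print(r2, res.rsquared, adj_r2, res.rsquared_adj)  # matching pairs
```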
Imperfect multicollinearity arises when...
|
Two or more regressors are highly correlated
The coefficients will be less precisely estimated, i.e., their standard errors are large, and it becomes difficult to do hypothesis tests and estimate confidence intervals |
|
Imperfect multicollinearity doesn't generate a problem for the theory of OLS estimators, but...
|
The coefficients will be imprecisely estimated
|
|
Hypothesis testing for a single coefficient in a multiple regression
|
(i) Calculate the standard error SE(βk)
(ii) Calculate the actual t-statistic: t = (βk - βk,0) / SE(βk), where βk,0 is the hypothesized value (iii) Compare the actual t with the critical value |
|
Learning methods to estimate non-linear functions
|
1) The first group of methods allows us to address the effect on Y of a change in one independent variable, X, when the effect depends on the value of X itself
2) The second group of methods is useful when the effect on Y of a unit change in X depends on another independent variable, e.g., X2 |
|
Quadratic Regression Model
|
TestScore = β0 + β1 Income_i + β2 Income_i^2 + ui
- Approximates a curve - β0, β1, and β2 are unknown and must be estimated from a sample of data - OLS and related methods are the same, since this is simply a variant of the multiple regression model |
|
Quadratic Regression Model...
|
TestScore = β0 + β1 Income_i + β2 Income_i^2 + ui
(see notes; a fitted example appears in the sketch below) |
|
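A sketch of fitting the quadratic model by OLS (hypothetical income data; numpy/statsmodels assumed): the squared term is simply added as a second regressor.

```python
# Sketch: TestScore = b0 + b1*Income + b2*Income^2 + u, estimated by OLS.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
income = rng.uniform(5, 55, 200)   # hypothetical district income (thousands)
score = 600 + 3.8 * income - 0.04 * income ** 2 + rng.normal(0, 9, 200)

X = sm.add_constant(np.column_stack([income, income ** 2]))
res = sm.OLS(score, X).fit()
print(res.params)                  # estimates of b0, b1, b2
```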
1) Identify possible non-linear relationships
2) Specify the non-linear function and estimate it using OLS 3) Determine whether the non-linear model improves on the linear model, using t-statistics or F-statistics 4) Plot the non-linear regression 5) Estimate the effect on Y of a change in X |
How do we go about modeling non-linearities?
|
|
1) Functions involving polynomials or polynomial regression
2) Functions involving logarithms |
Aside from a quadratic model, other possible non-linear functions
|
|
Yi = β0 + β1Xi + β2Xi^2 + β3Xi^3 + ... + βrXi^r + ui
where r denotes the highest power of X included in the regression. Use sequential hypothesis testing to decide how many polynomial terms to include. If r = 2: quadratic regression model; if r = 3: cubic regression model |
Polynomial Regression Model
|
|
How do we test the linear model against the polynomial model?
|
H0: β2 = 0, β3 = 0, ..., βr = 0
H1: at least one of these coefficients is different from 0 (see the sketch below) |
|
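A sketch of this test against a cubic alternative (simulated data; with plain arrays statsmodels names the regressors x1, x2, x3):

```python
# Sketch: F-test of H0: beta_2 = beta_3 = 0 against a cubic model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
x = rng.uniform(0, 10, 200)
y = 1 + 0.5 * x + 0.3 * x ** 2 + rng.normal(size=200)  # truly nonlinear

X = sm.add_constant(np.column_stack([x, x ** 2, x ** 3]))
res = sm.OLS(y, X).fit()
print(res.f_test("x2 = 0, x3 = 0"))  # small p-value: reject linearity
```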
Methods to identify effect on Y of a change in X1, when the effect depends on value of X1 itself
|
a) Polynomial functions
b) Functions that depend on logarithms |
|
Another way of specifying a nonlinear regression function is to use the natural logarithm of Y and/or X. Logarithms convert changes in variables into percentage changes
|
Logarithm functions
|
|
The natural logarithm is written as "ln" and it is the inverse of the....
|
Exponential function
|
|
3 Types of logarithmic regression models
|
(i) Linear-log model: a 1% change in X generates a change in Y of 0.01 × β1
(ii) Log-linear model: a 1-unit change in X generates a change in Y of 100 × β1 % (iii) Log-log model: a 1% change in X generates a change in Y of β1 % (see the sketch below) |
|
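A sketch of the three specifications side by side (simulated data with a true log-log relationship; numpy/statsmodels assumed):

```python
# Sketch: linear-log, log-linear, and log-log models.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
x = rng.uniform(1, 50, 300)
y = 10 * x ** 0.3 * np.exp(rng.normal(0, 0.1, 300))  # true elasticity 0.3

lin_log = sm.OLS(y, sm.add_constant(np.log(x))).fit()          # Y on ln(X)
log_lin = sm.OLS(np.log(y), sm.add_constant(x)).fit()          # ln(Y) on X
log_log = sm.OLS(np.log(y), sm.add_constant(np.log(x))).fit()  # ln(Y) on ln(X)
print(log_log.params[1])  # ~0.3: a 1% rise in X raises Y by about 0.3%
```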
Effect on Y of a change in X depends on the value of X1 itself:
a) Polynomial Regression Function b) Logarithmic Model |
Non-linear Regression Model
|
|
(i) Both independent variables are binary
(ii) One independent variable is binary and the other one is continuous (iii) Both independent variables are continuous |
Effects on Y of a change in X1, when the effect depends on the value of X2 or other X's
|
|
(i) Both independent variables are binary
(ii) One independent variable is binary and the other is continuous (iii) Both independent variables are continuous |
Methods that allow us to examine the effect on Y of a change in X1 when this effect depends on the value of X2 or other X's
|
|
A statistical analysis is internally valid if the statistical inferences about causal effects are valid for the population being studied
|
Internal validity
|
|
A statistical analysis is externally valid if its inferences and conclusions can be generalized from the population and setting being studied to other populations and settings
|
External Validity
|
|
1) The estimator of the causal effect is NOT unbiased and/or consistent, e.g., the coefficient on STR is biased and/or inconsistent
2) Hypothesis tests do not have the desired significance level and confidence intervals do not have the desired confidence level |
Threats to internal validity
|
|
Arises when a variable that is omitted from the regression determines Y and is correlated with one or more of the included regressors
Solutions: (i) Include the omitted variable (ii) If you don't observe the variable, rely on panel data (observations at different points in time) to control for unobserved omitted variables (iii) Use instrumental variables, i.e., variables that are correlated with the regressor but not with the omitted variables (iv) Design randomized controlled experiments |
Biased &/or Inconsistent OLS Estimator:
Omitted Variable Bias |
|
A type of omitted variable bias, in which the omitted variables are the non-linear terms
Solution: Include the non-linear terms |
Misspecification of functional form of the regression function
|
|
Regressors are measured with error (see notes, Chapter 9)
Solutions: (i) Use instrumental variables that are correlated with the regressor of interest but uncorrelated with the error term (ii) Adjust the estimates for measurement error, using a measurement-error correction formula |
Measurement error; Errors in variables
|
|
Arises when the selection process influences the availability of data and this same process is related to the dependent variable
Solutions: Include a selection correction term and use information on something that determines selection but is uncorrelated with the outcome |
Sample selection
|
|
So far we assumed that causality runs from X to Y, but in fact causality could run the other way, i.e., from Y to X
Solutions: (i) Use instrumental variables (ii) Design a randomized experiment |
Simultaneous/Reverse Causality
|
|
Omitted variable bias occurs when an omitted variable
|
1) Is correlated with an included regressor
2) Is a determinant of Y |
|
The coefficents in multiple regression can be estimated by OLS...
|
When the four least squares assumptions are satisfied, the OLS estimators are unbiased, consistent, and normally distributed in large samples
|
|
One or more regressors can be expressed as an exact linear combination of the other regressors
- The model cannot be estimated, so you need to drop one of the variables - Dummy variable trap: occurs when you include a constant and dummies for all possible categories (see the sketch below). Perfect multicollinearity usually arises from a mistake in choosing which regressors to include in a multiple regression; solving it requires changing the set of regressors |
Perfect Multicollinearity
|
|
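A sketch of the dummy variable trap (numpy/statsmodels assumed): with a constant plus dummies for every category, the regressor matrix loses full column rank.

```python
# Sketch: constant + Male + Female is perfectly multicollinear,
# since Male + Female = 1 = the constant column.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
male = rng.integers(0, 2, 100).astype(float)
female = 1.0 - male

X = sm.add_constant(np.column_stack([male, female]))
print(np.linalg.matrix_rank(X), X.shape[1])  # rank 2 < 3 columns
# Fix: drop the constant or drop one dummy category.
```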
The standard error of the regression,
|
together with the R2 and the adjusted R2, is a measure of fit for the multiple regression model (p. 211)
|
|
A regression model in which β1 represents the expected change in Y in response to a 1-unit increase in X1 is
|
Y = β0 + β1X1 + u.
|
|
A regression model in which .01β1 represents the expected change in Y in response to a 1% increase in X1 is
|
Y = β0 + β1 ln(X1) + u.
|
|
A regression model in which 100×β1 represents the expected percentage change in Y in response to a 1-unit increase in X1 is
|
ln(Y) = β0 + β1X1 + u.
|
|
A regression model in which β1 represents the expected percentage change in Y in response to a 1% increase in X1 is
|
ln(Y) = β0 + β1 ln(X1) + u.
|
|
A quadratic regression...
|
includes X and X^2 as regressors
|
|
a) Omitted Variable Bias
b) Misspecification of functional form c) Measurement error d) Sample Selection e) Simultaneous/Reverse Causality |
Biased and/or Inconsistent OLS estimator
|
|
If you use homoskedasticity-only standard errors when the errors are heteroskedastic, you get unreliable confidence intervals
Solution: Use the heteroskedasticity-robust standard error formula (see the sketch below) |
Inconsistent OLS standard errors: Heteroskedasticity
|
|
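A sketch of requesting heteroskedasticity-robust standard errors in statsmodels (simulated data with error variance growing in x; HC1 is one common robust estimator):

```python
# Sketch: homoskedasticity-only vs. heteroskedasticity-robust SEs.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
x = rng.uniform(0, 10, 200)
y = 1 + 2 * x + rng.normal(0, 0.5 + 0.3 * x, 200)  # variance grows with x

X = sm.add_constant(x)
plain = sm.OLS(y, X).fit()                 # homoskedasticity-only SEs
robust = sm.OLS(y, X).fit(cov_type="HC1")  # heteroskedasticity-robust SEs
print(plain.bse[1], robust.bse[1])         # same estimates, different SEs
```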
If there are repeated observations for entities over time, e.g., time series or panel data, then Yi and Xi are not independently distributed across observations and the usual standard error formula is incorrect
Solution: Fix the formula for the standard errors to account for serial correlation |
Inconsistent OLS standard errors: Serial Correlation of error term
|
|
Threats to external validity arise from differences between the population and setting being studied and the population and setting of interest
(i) Differences in populations, e.g., test scores in Virginia vs. another population (type of people in the sample) (ii) Differences in settings/institutions (regulations may be different) |
Threats to external validity
|
|
_______ allows us to obtain consistent estimates when the regressor Xi is correlated with the error term ui
To understand IV regression, it is useful to think of the variation in Xi as divided into two parts: 1) one part that, for whatever reason, is correlated with the error term, and 2) another part that is uncorrelated with ui. "Instrumental variables" or instruments are variables that allow us to isolate the second component of the variation in Xi. For instruments to work, they have to satisfy two conditions: (i) Instrument relevance: for an instrument Zi, Corr(Zi, Xi) ≠ 0 (ii) Instrument exogeneity: Corr(Zi, ui) = 0 |
Instrumental variables (IV) regression
|
|
_____________ works in two stages (two stage least squares, TSLS) to estimate Yi = β0 + β1Xi + ui
First stage: decompose Xi into two parts: 1) a problematic component correlated with ui, and 2) a problem-free component (uncorrelated with ui): Xi = π0 + π1Zi + vi, where vi is the problematic component. While π0 and π1 are population parameters, we can obtain their OLS estimates and get a predictor of the problem-free component: X̂i = π̂0 + π̂1Zi. Second stage: run a regression of Yi = β0 + β1X̂i + ui; then E(β̂1) = β1 (see the sketch below) |
Instrumental variables (IV) regression
|
|
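A sketch of TSLS done by hand on simulated data (a dedicated IV routine would also correct the second-stage standard errors):

```python
# Sketch: two-stage least squares with one instrument z.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 500
z = rng.normal(size=n)                          # instrument: relevant, exogenous
u = rng.normal(size=n)
x = 1 + 0.8 * z + 0.5 * u + rng.normal(size=n)  # x correlated with u
y = 2 + 3 * x + u                               # true beta_1 = 3

stage1 = sm.OLS(x, sm.add_constant(z)).fit()    # decompose x
x_hat = stage1.fittedvalues                     # problem-free component
stage2 = sm.OLS(y, sm.add_constant(x_hat)).fit()
print(stage2.params[1])  # close to 3; plain OLS of y on x would be biased
```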
Why is internal validity an important criterion for evaluating an econometric study?
|
If an econometric study is not internally valid, then it does not provide valid statistical inferences for causal effects for the population being studied.
|
|
Why is external validity an important criterion for evaluating an econometric study?
|
If an econometric study is not externally valid, then its conclusions cannot be generalized to
other populations. |
|
Key Concept 7.4, p. 238 lists four least squares assumptions for the multiple regression model.
Which of these assumptions is/are likely to fail in the following circumstances? a. There is misspecification of the regression's functional form. b. The regressors are measured with error. c. Data are not available when the dependent variable falls in a certain range. d. There is reverse causation flowing from Y to one of the regressors. e. The regression error terms are correlated across observations. |
a. Assumption 1 (the conditional mean of ui given the regressors is no longer zero)
b. Assumption 1 (the conditional mean of ui given the regressors is no longer zero) c. Assumption 1 (the conditional mean of ui given the regressors is no longer zero) d. Assumption 1 (the conditional mean of ui given the regressors is no longer zero) e. Assumption 2 (the observations are no longer i.i.d.) |
|
Data that are specifically collected when conducting experiments
|
Two Sources of Data:
Experimental Data |
|
Data that are obtained by observing behavior in the real world, using surveys, administrative records, or other sources
|
Two Sources of Data:
Observational Data |
|
The science of using economic theory combined with statistical techniques to analyze data
|
Econometrics (Definition)
|
|
Data on different entities for a single period of time
|
3 Types of Data:
- Cross Sectional - Time Series - Panel Data (this card: Cross Sectional Data) |
|
Data on a single entity collected over multiple periods of time
|
3 Types of Data:
- Cross Sectional - Time Series - Panel Data (this card: Time Series Data) |
|
Data for multiple entities in which each entity is observed for two or more periods
|
3 Types of Data:
- Cross Sectional - Time Series - Panel Data (this card: Panel Data) |
|
Yi = β0 + β1Xi + ui
|
Single Regression Framework
|
|
Occurs when two conditions hold:
1) The omitted variable is a determinant of the dependent variable, i.e., the omitted variable is part of ui 2) The omitted variable is correlated with the included regressor |
Omitted Variable Bias
|
|
E(Yi | X1i = x1, ..., Xki = xk) = β0 + β1x1 + β2x2 + ... + βkxk
β0: the expected value of Y when all X's are zero. βk: the expected change in Y from a unit change in Xk, holding all other X's constant |
Population regression line
|
|
Yi = β0 + β1X1i + β2X2i + β3(X1i × X2i) + ui (see the sketch below)
|
Interaction regression Models
|
|
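A sketch of an interaction model with a continuous X1 and a binary X2 (simulated data): the effect of X1 on Y is β1 + β3 × X2.

```python
# Sketch: Y = b0 + b1*X1 + b2*X2 + b3*(X1*X2) + u.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(12)
x1 = rng.normal(size=200)
x2 = rng.integers(0, 2, 200).astype(float)  # binary second regressor
y = 1 + 2 * x1 + 0.5 * x2 + 1.5 * x1 * x2 + rng.normal(size=200)

X = sm.add_constant(np.column_stack([x1, x2, x1 * x2]))
res = sm.OLS(y, X).fit()
b = res.params
print(b[1], b[1] + b[3])  # effect of X1 when X2 = 0 vs. X2 = 1
```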
Including an additional regressor in a regression
|
Always reduces the sum of squared residuals
|
|
Simultaneous causality bias
|
Arises because at least one of the regressors is correlated with the regression error term
|
|
If regression error terms are correlated with one another, then...
|
T-Statistics will not be distributed as standard normal variables in large samples
|