21 Cards in this Set

  • Front
  • Back

Four types of variables in empirical analysis

1. Ratio scale


2. Interval scale


3. Ordinal scale


4. Nominal scale

Nominal scale variables

Also known as indicator variables, categorical variables, qualitative variables or dummy variables.

The nature of dummy variables

-> Y is influenced not only by ratio scale variables (income, output, prices, costs, height, temperature) but also by variables that are essentially qualitative or nominal scale in nature, such as sex, religion, nationality, geographical region, political upheavals and party affiliation.


-> Qualitative nature

Qualitative

-> Such variables usually indicate the presence or absence of a "quality" or an attribute, such as male or female, black or white.


-> We could quantify such attributes by constructing artificial variables that take on values of 1 or 0.
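
As a small illustration of the 1/0 coding described on this card, the sketch below builds an artificial variable from a qualitative attribute. The sample values and the helper name `make_dummy` are hypothetical and chosen only for illustration.

```python
# Hypothetical sample of a qualitative attribute (values are made up).
sex = ["male", "female", "female", "male", "female"]

def make_dummy(values, category):
    """Return a 0/1 list: 1 where the attribute equals `category`, else 0."""
    return [1 if v == category else 0 for v in values]

# D = 1 marks the presence of the attribute "female", D = 0 its absence.
d_female = make_dummy(sex, "female")
print(d_female)
```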

Caution in the Use of Dummy Variables (1)

1. Danger of perfect collinearity: if a qualitative variable has m categories, introduce only (m-1) dummy variables.


-> Since the intercept implicitly takes the value 1 for each observation, the data matrix already contains a column of 51 ones (one per observation). If a dummy is included for each of the three categories, the sum of the three D columns simply reproduces the intercept column, leading to perfect collinearity.
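
The collinearity argument can be checked numerically. The sketch below uses a made-up sample of 6 observations in 3 categories (not the 51-observation example from the card) and compares the column rank of the design matrix with m dummies against the one with m-1 dummies. All variable names are assumptions.

```python
import numpy as np

# Hypothetical data: 6 observations spread over m = 3 categories.
category = np.array([0, 0, 1, 1, 2, 2])

intercept = np.ones(6)
d1 = (category == 0).astype(float)
d2 = (category == 1).astype(float)
d3 = (category == 2).astype(float)

# Intercept plus all 3 dummies: the dummy columns sum to the intercept column.
X_trap = np.column_stack([intercept, d1, d2, d3])
# Intercept plus m - 1 = 2 dummies: no exact linear dependence.
X_ok = np.column_stack([intercept, d2, d3])

rank_trap = np.linalg.matrix_rank(X_trap)
rank_ok = np.linalg.matrix_rank(X_ok)
print(rank_trap, "of", X_trap.shape[1], "columns")  # rank deficient
print(rank_ok, "of", X_ok.shape[1], "columns")      # full column rank
```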

Caution in the Use of Dummy Variables (2)

2. The category for which no dummy variable is assigned is known as the base, benchmark, control, comparison, reference or omitted category. All comparisons are made in relation to the benchmark category.

Caution in the Use of Dummy Variables (3)

3. The intercept value (b1) represents the mean value of the benchmark category. In Example 9.1 the benchmark category is the Western region. Hence, in the regression the intercept value of about 48,015 represents the mean salary of teachers in the Western states.

Caution in the Use of Dummy Variables (4)

4. The coefficients attached to the dummy variables are known as the differential intercept coefficients: they tell by how much the mean value of the category that receives the value of 1 differs from the mean value of the benchmark category, which is given by the intercept. (See pp. 281-282 for the interpretation.)
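
A minimal sketch of how the intercept and the differential intercept coefficients relate to group means. The salary figures and region labels are invented for illustration (they are not the Example 9.1 data), and "West" is assumed to be the benchmark category.

```python
import numpy as np

# Hypothetical salaries (in thousands) for three regions; "West" is the benchmark.
salary = np.array([50.0, 46.0, 52.0, 48.0, 44.0, 40.0])
region = np.array(["West", "West", "Northeast", "Northeast", "South", "South"])

# m = 3 categories, so only m - 1 = 2 dummies enter alongside the intercept.
d_ne = (region == "Northeast").astype(float)
d_south = (region == "South").astype(float)
X = np.column_stack([np.ones(len(salary)), d_ne, d_south])

coef, *_ = np.linalg.lstsq(X, salary, rcond=None)
b1, b2, b3 = coef

print("intercept (mean of West):", b1)
print("Northeast minus West:", b2)
print("South minus West:", b3)
```

Here b1 equals the mean of the benchmark group, and b1 + b2 and b1 + b3 recover the means of the other two groups.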

Caution in the Use of Dummy Variables (5)

5. About the dummy variable trap: the trap can be circumvented by introducing as many dummy variables as there are categories of the variable, provided the intercept is not included in the model.

Which is the better method of introducing dummy variables: omit the intercept term and introduce a dummy for each category, or include the intercept term and introduce only (m-1) dummies?

Most researchers find the equation with an intercept more convenient because it allows them to address more easily the questions in which they usually have the most interest, namely whether a category differs from the benchmark and, if so, by how much.
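
To compare the two parameterizations, the sketch below fits both on the same made-up data. Without an intercept, each dummy coefficient is simply a group mean; with an intercept, the dummy coefficients are differences from the benchmark. Data and names are hypothetical.

```python
import numpy as np

# Hypothetical data: one outcome, one qualitative variable with 3 categories.
salary = np.array([50.0, 46.0, 52.0, 48.0, 44.0, 40.0])
region = np.array(["West", "West", "Northeast", "Northeast", "South", "South"])
categories = ["West", "Northeast", "South"]

# (a) No intercept, one dummy per category: coefficients are the group means.
D = np.column_stack([(region == c).astype(float) for c in categories])
group_means, *_ = np.linalg.lstsq(D, salary, rcond=None)

# (b) Intercept plus m - 1 dummies: coefficients are differences from "West".
X = np.column_stack([np.ones(len(salary)), D[:, 1:]])
with_intercept, *_ = np.linalg.lstsq(X, salary, rcond=None)

print("group means:", group_means)
print("intercept and differences:", with_intercept)
```

Both fits describe the same group means; form (b) reports the between-group differences directly, which is the convenience the card refers to.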

What is an ANOVA model?

Analysis of variance (ANOVA) is a collection of statistical models, together with their associated procedures (such as the "variation" among and between groups), used to analyze the differences among group means. It was developed by statistician and evolutionary biologist Ronald Fisher.

ANCOVA Models

Regression models with a mixture of quantitative and qualitative regressors are called analysis of covariance (ANCOVA) models. (Pure ANOVA models are not common in economics.)


-> They are an extension of the ANOVA models in that they provide a method of statistically controlling the effects of quantitative regressors, called covariates or control variables, in a model that includes both quantitative and qualitative, or dummy, regressors.
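
A minimal ANCOVA-style sketch, assuming the specification Y = b1 + b2*D + b3*X with one dummy D and one quantitative covariate X. The data are generated without noise from made-up coefficients so that the fit recovers them exactly.

```python
import numpy as np

# Hypothetical data: one quantitative covariate x and one 0/1 dummy d.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
d = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])

# Noise-free outcome built from assumed coefficients (10, 5, 2).
y = 10.0 + 5.0 * d + 2.0 * x

# Design matrix: intercept, dummy, covariate.
X = np.column_stack([np.ones(len(y)), d, x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b1, b2, b3 = coef

print("intercept:", b1)
print("differential intercept, holding x constant:", b2)
print("slope on the covariate:", b3)
```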

Deseasonalizing time series

Dummy variables can be used to remove the seasonal component from a time series. (To avoid the dummy variable trap, a dummy is assigned to each quarter of the year and the intercept term is omitted.)
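
One way to sketch this, under stated assumptions: regress the series on four quarterly dummies with no intercept, then treat the residuals (shifted by the overall mean) as the deseasonalized series. The sales figures are invented, and adding back the overall mean is one common convention rather than the only option.

```python
import numpy as np

# Hypothetical quarterly series covering two years (Q1..Q4, Q1..Q4).
sales = np.array([10.0, 14.0, 12.0, 20.0, 12.0, 16.0, 14.0, 22.0])
quarter = np.tile(np.arange(4), 2)

# One dummy per quarter and no intercept, which avoids the dummy variable trap.
D = (quarter[:, None] == np.arange(4)).astype(float)

# Each coefficient is the mean of the series in that quarter.
seasonal_means, *_ = np.linalg.lstsq(D, sales, rcond=None)

# Residuals carry what is left after removing the seasonal pattern.
residuals = sales - D @ seasonal_means
deseasonalized = residuals + sales.mean()

print("quarterly means:", seasonal_means)
print("deseasonalized series:", deseasonalized)
```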

Piecewise linear regression

Consists of two linear pieces, or segments. Given the data on commission, sales, and the value of the threshold level X*, the technique of dummy variables can be used to estimate the (differing) slopes of the two segments of the piecewise linear regression.
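
A sketch of the dummy-variable technique for two segments, assuming the common specification Y = a1 + b1*X + b2*(X - X*)*D, where D = 1 when X exceeds the threshold X*. The data are noise-free and generated from made-up coefficients; the names are assumptions.

```python
import numpy as np

# Hypothetical sales volumes and a known threshold X*.
x = np.arange(0, 11, dtype=float)
x_star = 5.0

# D = 1 above the threshold, 0 otherwise.
d = (x > x_star).astype(float)

# Noise-free commission: slope 1 below the threshold, slope 1 + 2 = 3 above it.
y = 2.0 + 1.0 * x + 2.0 * (x - x_star) * d

# Regressors: intercept, X, and (X - X*) * D.
X = np.column_stack([np.ones(len(x)), x, (x - x_star) * d])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a1, b1, b2 = coef

slope_below = b1
slope_above = b1 + b2
print("slope below X*:", slope_below)
print("slope above X*:", slope_above)
```

Because the extra regressor is zero at X = X*, the two segments join at the threshold.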

Semilogarithmic Regressions

Semilog (log-lin) models: in such a model, the slope coefficients of the regressors give the semielasticity, that is, (when multiplied by 100) the percentage change in the regressand for a unit change in the regressor. This is only so if the regressor is quantitative.

What happens if a regressor is a dummy variable?

-> The intercept b1 gives the mean log hourly earnings of the benchmark group (male workers here), and the "slope" coefficient gives the difference in the mean log hourly earnings of males and females.


-> But if we take the antilog of b1, what we obtain is not the mean hourly wages of male workers, but their median wages.


-> Mean, median and mode are the three measures of central tendency of a random variable.
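
A short numerical sketch of the antilog interpretation. The coefficient values are hypothetical; the percentage formula (exp(b2) - 1) * 100 is the standard way to convert a dummy coefficient in a semilog model into a percentage difference between the two groups' medians.

```python
import math

# Hypothetical estimates from a regression of log hourly wages on a dummy.
b1 = 2.0    # intercept: mean log hourly earnings of the benchmark group
b2 = -0.25  # dummy coefficient: difference in mean log hourly earnings

# Antilogs give medians, not means, of hourly wages.
median_benchmark = math.exp(b1)
median_other = math.exp(b1 + b2)

# Percentage difference in median wages between the two groups.
pct_diff = (math.exp(b2) - 1) * 100

print(round(median_benchmark, 2), round(median_other, 2), round(pct_diff, 2))
```

Reading b2 * 100 directly as a percentage is only an approximation, and it gets worse as the coefficient grows in absolute value.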

What happens if the dependent variable is a dummy variable?

-> So far we have considered models in which the regressand is quantitative and the regressors are quantitative/qualitative.


-> But there are occasions where the regressand can also be qualitative or dummy.


-> Example: the decision of a worker to participate in the labor force is a yes-or-no outcome, but it depends on several factors, such as the starting wage rate, education, and conditions in the labor market (as measured by the unemployment rate).

Summary and conclusions (1)

Dummy variables, taking values of 1 and 0 (or their linear transforms), are a means of introducing qualitative regressors in regression models.



Summary and conclusions (2)

Dummy variables are a data-classifying device in that they divide a sample into various subgroups based on qualities or attributes (gender, marital status, race, religion, etc.) and implicitly allow one to run individual regressions for each subgroup. If there are differences in the response of the regressand to the variation in the qualitative variables in the various subgroups, they will be reflected in the differences in the intercepts or slope coefficients, or both, of the various subgroup regressions.

Summary and conclusions (3)

Although a versatile tool, the dummy variable technique needs to be handled carefully.


1. If the regression contains a constant term, the number of dummy variables must be one less than the number of classifications of each qualitative variable.


2. The coefficients attached to the dummy variables must always be interpreted in relation to the base, or reference, group, that is, the group that receives the value of zero. The base chosen will depend on the purpose of the research at hand.


3. If a model has several qualitative variables with several classes, introduction of dummy variables can consume a large number of degrees of freedom. Therefore, one should always weigh the number of dummy variables to be introduced against the total number of observations available for analysis.
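
The degrees-of-freedom point in item 3 is simple arithmetic, sketched below. The function name and the sample figures are hypothetical; the count assumes an intercept plus (m-1) dummies per qualitative variable.

```python
def residual_degrees_of_freedom(n_obs, category_counts, n_quantitative):
    """Observations minus estimated parameters, for a model with an intercept.

    Each qualitative variable with m categories contributes m - 1 dummies.
    """
    n_dummies = sum(m - 1 for m in category_counts)
    n_params = 1 + n_dummies + n_quantitative
    return n_obs - n_params

# Hypothetical model: 50 observations, three qualitative variables with
# 4, 3 and 5 categories, plus 2 quantitative regressors.
df = residual_degrees_of_freedom(50, [4, 3, 5], 2)
print(df)
```

With these made-up figures, 12 parameters are estimated from 50 observations, so the dummies alone use up 9 degrees of freedom.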

Dummy variables (applications)

1. Comparing two (or more) regressions


2. Deseasonalizing time series data


3. Interactive dummies


4. Interpretation of dummies in semilog models


5. Piecewise linear regression models
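
Applications 1 and 3 can be sketched together. Assuming the specification Y = a1 + a2*D + b1*X + b2*(D*X), the coefficient a2 is the differential intercept and b2, on the interactive dummy D*X, is the differential slope between two groups. The data are noise-free and generated from made-up coefficients.

```python
import numpy as np

# Hypothetical data for two groups observed at the same x values.
x = np.array([1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0])
d = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])

# Group 0 follows y = 1 + 2x; group 1 follows y = 4 + 3x (no noise).
y = np.where(d == 0, 1.0 + 2.0 * x, 4.0 + 3.0 * x)

# Regressors: intercept, dummy, x, and the interactive dummy d * x.
X = np.column_stack([np.ones(len(y)), d, x, d * x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a1, a2, b1, b2 = coef

print("group 0 intercept and slope:", a1, b1)
print("differential intercept:", a2)
print("differential slope:", b2)
```

If both a2 and b2 are zero, the two group regressions coincide; a nonzero a2 or b2 points to differing intercepts or slopes, respectively.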