
This means we have no information on the first price for the Ellery Street condominium, so we remove first price from our list of candidate independent variables. As stated in section 1.1, we also cannot use the number of days between the first and last date as an independent variable, since the condominium has not yet been sold and we do not know the date on which it was first put up for sale. Finally, we intuitively expect a positive correlation between interior space and the number of rooms, bathrooms and bedrooms. Including all of these correlated variables would introduce multicollinearity, so we use interior space as a proxy for the number of rooms, bathrooms and bedrooms in our regression model. We will also show this through the output generated in the model description section. Further, we expect last price and interior space to have positive coefficients, and condominium taxes, property taxes and RC to have negative coefficients. The effect of the other dummy variables for area/area codes needs to be explored by running the regression model.
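The multicollinearity argument above can be illustrated with a small sketch. It uses synthetic, made-up data rather than the condominium dataset, and the variable names, the scaling constants and the helper `vif` are assumptions introduced only for this illustration. It computes pairwise correlations and variance inflation factors (VIFs), one common diagnostic for multicollinearity, for interior space and the three count variables.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X (n_samples x n_features)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on an intercept plus all remaining columns.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

# Synthetic data (hypothetical, NOT the data used in this report):
# room counts are generated as noisy functions of interior space.
rng = np.random.default_rng(0)
n = 200
interior_space = rng.normal(1000, 250, n)
rooms = interior_space / 200 + rng.normal(0, 0.5, n)
bedrooms = interior_space / 450 + rng.normal(0, 0.4, n)
bathrooms = interior_space / 700 + rng.normal(0, 0.3, n)

X_all = np.column_stack([interior_space, rooms, bedrooms, bathrooms])
names = ["interior_space", "rooms", "bedrooms", "bathrooms"]

corr = np.corrcoef(X_all, rowvar=False)  # pairwise correlation matrix
vif_all = vif(X_all)                     # VIFs with all four predictors

print("Correlation with interior_space:")
for name, r in zip(names[1:], corr[0, 1:]):
    print(f"  {name}: {r:.2f}")
print("VIF with all four predictors:")
for name, v in zip(names, vif_all):
    print(f"  {name}: {v:.2f}")
```

In data with this structure, the correlations with interior space are strongly positive and the VIFs sit well above 1, which is the pattern that motivates keeping only interior space. Whether the actual condominium data behave the same way is what the regression output in the model description section is meant to show.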

We will start with a basic regression model and then check it for normality and linearity; if it does not pass these tests, we will transform the variables using