In my previous blog post, Checking Biasness of Estimator using Monte Carlo, I used the following three cases:

- Intercept in population relation and intercept in sample relation
- Intercept in population relation and no intercept in sample relation
- No intercept in population relation and intercept in sample relation

In summary, fitting a sample model without an intercept when the population model has one made the tails of the slope estimator's distribution far too heavy: skewness rose to 31.56 (≠ 0) and kurtosis to 997.53 (≠ 3), whereas both stay at their standard values when the other two forms are used. Hence it is not appropriate to remove the intercept, because doing so makes the model imprecise for inference.

Now we will revisit this intercept issue by checking it against another important assumption of OLS regression. As per Gujarati (2003), the mean of the residuals should be zero, which means that all the relevant variables are explicitly included in the model; by our population construction, the error term is normally distributed with mean zero. The variance of the residuals should also be constant and equal to the population variance, hence:

Population: $E(u_i) = 0$

Sample: $E(u_i \mid X_i) = 0$

Sample variance = population variance:

$$\operatorname{var}(u_i \mid X_i) = E\big[(u_i - E(u_i \mid X_i))^2\big] = \sigma^2$$

So for these three cases, the mean and variance of the residuals will be analysed to see whether they match the population values. The hypotheses are as follows:

| Mean of Error Term | Variance of Error Term |
|---|---|
| $H_0: E(u_i \mid X_i) = 0$ | $H_0: \operatorname{var}(u_i \mid X_i) = \sigma^2 = 4$ |
| $H_1: E(u_i \mid X_i) \neq 0$ | $H_1: \operatorname{var}(u_i \mid X_i) \neq 4$ |
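To make the setup concrete, here is a minimal Python sketch of this kind of Monte Carlo experiment. It is not the original Stata do file. The slope of 2, the intercept of 1 and the error variance of 4 follow the post, but the sample size `n`, the uniform distribution of `x`, the seed and the helper name `simulate` are assumptions made only for illustration, so the numbers it prints will not match the tables in this post.

```python
import numpy as np

def simulate(pop_intercept, fit_intercept, reps=100, n=50,
             slope=2.0, sigma=2.0, seed=0):
    """Return one row per replication: slope, residual mean, residual variance."""
    rng = np.random.default_rng(seed)
    results = np.empty((reps, 3))
    for r in range(reps):
        x = rng.uniform(0.0, 10.0, size=n)   # assumed regressor distribution
        u = rng.normal(0.0, sigma, size=n)   # population error: mean 0, variance 4
        y = pop_intercept + slope * x + u
        # Design matrix with or without a column of ones for the intercept
        X = np.column_stack([np.ones(n), x]) if fit_intercept else x[:, None]
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        dof = n - X.shape[1]                 # n minus number of fitted coefficients
        results[r] = beta[-1], resid.mean(), resid @ resid / dof
    return results

case1 = simulate(pop_intercept=1.0, fit_intercept=True)
case2 = simulate(pop_intercept=1.0, fit_intercept=False)
case3 = simulate(pop_intercept=0.0, fit_intercept=True)

for name, res in [("Case 1", case1), ("Case 2", case2), ("Case 3", case3)]:
    slope_mean, resid_mean, resid_var = res.mean(axis=0)
    print(f"{name}: slope={slope_mean:.3f}  "
          f"resid mean={resid_mean:.2e}  resid var={resid_var:.3f}")
```

Each row of the returned array is one replication, so the column means and standard deviations play the role of the "Mean" and "Std Error" columns reported for each case.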

**Case 1 – Intercept in population relation and intercept in sample relation**

Since the population mean and variance of the error term are 0 and 4 respectively, the following are the results of the OLS simulation with an intercept in both the population and the sample model.

Case 1

| Statistic / Variable | Obs | Mean | Std Error | Skewness | Kurtosis |
|---|---|---|---|---|---|
| Slope coefficient | 100 | 1.998 | 0.484 | -0.11 | 3.07 |
| Error term mean | 100 | -2.5E-11 | 3.80E-09 | -0.02 | 2.87 |
| Error term variance | 100 | 3.97 | 0.397 | 0.08 | 2.97 |

Test for mean of error term at population mean of 0 :

T = -2.5E-11 / 3.80E-09 = -6.6E-03

Test for variance of error term at population variance of 4:

T = (3.97 - 4) / 0.397 = -0.075

These results show that when both the population and the sample model have an intercept, the slope coefficient is not biased, and the mean and variance of the error term are statistically the same as the population mean and variance. Hence none of the assumptions is violated.
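The two test statistics for Case 1 are simple ratios, and the arithmetic can be checked with a short Python sketch. The helper name `t_stat` and the cutoff of 1.96 (a two-sided 5% test under a large-sample normal approximation) are assumptions for illustration; the post itself does not state a critical value. Because the inputs are the rounded table entries, the last digit may differ slightly from the values reported in this post.

```python
def t_stat(estimate, hypothesized, std_error):
    """t-ratio: distance of the estimate from the hypothesized value, in standard errors."""
    return (estimate - hypothesized) / std_error

CRITICAL = 1.96  # assumed two-sided 5% cutoff (large-sample normal approximation)

t_mean = t_stat(-2.5e-11, 0.0, 3.80e-09)  # error term mean against 0
t_var = t_stat(3.97, 4.0, 0.397)          # error term variance against 4

for label, t in [("mean", t_mean), ("variance", t_var)]:
    verdict = "reject H0" if abs(t) > CRITICAL else "fail to reject H0"
    print(f"{label}: t = {t:.4f} -> {verdict}")
```

Both ratios are far inside the cutoff, which is what "statistically the same as the population values" means here.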

**Case 2 – Intercept in population relation and no intercept in sample relation**

Since the population mean and variance of the error term are 0 and 4 respectively, the following are the results of the OLS simulation with an intercept in the population model and no intercept in the sample model.

Case 2

| Statistic / Variable | Obs | Mean | Std Error | Skewness | Kurtosis |
|---|---|---|---|---|---|
| Slope coefficient | 100 | 4.027 | 15.571 | 19.14 | 496.13 |
| Error term mean | 100 | -0.144 | 7.00E-03 | 0.19 | 3.08 |
| Error term variance | 100 | 4.17 | 0.397 | 0.2 | 3.04 |

Test for mean of error term at population mean of 0 :

T = -0.144 / 7.0E-03 = -20.6

Test for variance of error term at population variance of 4:

T = (4.17 - 4) / 0.397 = 0.42

Here we can see that the slope is already imprecise; in addition, the mean of the error term is statistically different from zero, which violates one of the OLS assumptions.
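The non-zero residual mean in this case is not a simulation accident; it follows from the OLS first-order conditions. The short derivation below is a sketch in standard textbook notation, where the symbols $\hat{\alpha}$, $\hat{\beta}$ and $\hat{u}_i$ (fitted intercept, fitted slope and residuals) are introduced here and are not taken from the do file.

```latex
% With an intercept: minimise \sum_i (y_i - \alpha - \beta x_i)^2 over (\alpha, \beta).
% The first-order condition for \alpha forces the residuals to sum to zero:
\frac{\partial}{\partial \alpha} \sum_i (y_i - \alpha - \beta x_i)^2 = 0
\;\Longrightarrow\;
\sum_i \hat{u}_i = 0 .

% Without an intercept: minimise \sum_i (y_i - \beta x_i)^2 over \beta only.
% The single first-order condition is
\sum_i x_i \hat{u}_i = 0 ,
% which does not restrict the sum of the residuals, so in general
\frac{1}{n} \sum_i \hat{u}_i = \bar{y} - \hat{\beta}\,\bar{x} \neq 0 .
```

So the regression through the origin has no mechanism that centres the residuals, and any intercept present in the population ends up in the residual mean and in the slope estimate.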

**Case 3 – No intercept in population relation and intercept in sample relation**

Since the population mean and variance of the error term are 0 and 4 respectively, the following are the results of the OLS simulation with no intercept in the population model and an intercept in the sample model.

Case 3

| Statistic / Variable | Obs | Mean | Std Error | Skewness | Kurtosis |
|---|---|---|---|---|---|
| Slope coefficient | 100 | 1.977 | 0.471 | -0.07 | 2.95 |
| Error term mean | 100 | 1.30E-11 | 2.03E-09 | -0.13 | 3.4 |
| Error term variance | 100 | 3.99 | 0.409 | 0.21 | 3.23 |

Test for mean of error term at population mean of 0 :

T = 1.30E-11 / 2.03E-09 = 0.006

Test for variance of error term at population variance of 4:

T = (3.99 - 4) / 0.409 = -0.024

Surprisingly, even though the population model has no intercept, the sample OLS with an intercept performs as well as in Case 1, with an unbiased slope coefficient and a zero-mean, constant-variance error term.
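One way to see why the extra intercept does no harm is to look at what it is estimated to be. The Python sketch below is only an illustration under assumed settings (uniform `x`, sample size 50, 200 replications, fixed seed), not a reproduction of the Stata run: when the population has no intercept, the fitted intercept should simply average out close to zero while the slope stays close to 2.

```python
import numpy as np

rng = np.random.default_rng(1)
reps, n = 200, 50                 # assumed replication count and sample size
intercepts = np.empty(reps)
slopes = np.empty(reps)

for r in range(reps):
    x = rng.uniform(0.0, 10.0, size=n)          # assumed regressor distribution
    y = 2.0 * x + rng.normal(0.0, 2.0, size=n)  # population without intercept
    # Degree-1 polynomial fit = OLS with intercept; coefficients come back
    # highest degree first, i.e. (slope, intercept)
    slopes[r], intercepts[r] = np.polyfit(x, y, deg=1)

print(f"mean fitted intercept: {intercepts.mean():.3f}")
print(f"mean fitted slope:     {slopes.mean():.3f}")
```

Including an intercept that is not needed costs a little precision, but it does not bias the slope, which is consistent with the Case 3 results above.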

**Conclusion:**

After revisiting the intercept issue from another angle, we found strong evidence that the sample model should include an intercept. Even though the population had an intercept of only 1 (a very small value), leaving it out of the sample model makes the results deviate too much.

Stata Do file : residual analysis of intercept

**Reference:**

Gujarati, D. N. (2003). *Basic Econometrics* (4th ed.)
