# Biasness of Estimator – Revisited – Residual Analysis

My previous blog post, Checking Biasness of Estimator using Monte Carlo, used the following three cases:

• Intercept in population relation and intercept in sample relation
• Intercept in population relation and no intercept in sample relation
• No intercept in population relation and intercept in sample relation

In summary, fitting a sample model without an intercept when the population model has one made the tails of the slope estimator's distribution far too heavy: skewness rose to 31.56 (≠ 0) and kurtosis to 997.53 (≠ 3), whereas both stay at their standard values in the other two cases. Hence it is not appropriate to remove the intercept, because doing so makes the model imprecise for inference.

Now we revisit the intercept issue by checking it against another important assumption of OLS regression. According to Gujarati (2003), the mean of the residuals should be zero, which implies that all relevant variables are explicitly included in the model; by construction, our population error term is normally distributed with mean zero. The variance of the residuals should also be constant and equal to the population variance. Hence:

Population → $E(u_i) = 0$

Sample → $E(u_i \mid X_i) = 0$

Sample variance = population variance

$\operatorname{var}(u_i \mid X_i) = E\big[(u_i - E(u_i \mid X_i))^2 \mid X_i\big] = \sigma^2$

For each of the three cases, the mean and variance of the residuals are tested against their population values. The hypotheses are as follows:

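| Hypothesis | Mean of Error Term | Variance of Error Term |
|---|---|---|
| H0 | $E(u_i \mid X_i) = 0$ | $\operatorname{var}(u_i \mid X_i) = \sigma^2 = 4$ |
| H1 | $E(u_i \mid X_i) \neq 0$ | $\operatorname{var}(u_i \mid X_i) \neq 4$ |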
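As a rough illustration of how the residual mean and variance can be collected for the three cases, here is a minimal Python sketch. It is not the original Stata do-file. The error variance of 4 and the population intercept of 1 come from this post, and the slope of 2 is inferred from the reported slope estimates. The regressor distribution, the number of replications, the seed, and the names `simulate_case`, `cases` and `results` are assumptions made only for this example.

```python
import numpy as np

def simulate_case(pop_intercept, fit_intercept, n_obs=100, n_reps=1000, seed=0):
    """Monte Carlo for one case; returns slopes, residual means, residual variances."""
    rng = np.random.default_rng(seed)
    slopes = np.empty(n_reps)
    resid_means = np.empty(n_reps)
    resid_vars = np.empty(n_reps)
    for r in range(n_reps):
        x = rng.normal(5.0, 2.0, n_obs)   # ASSUMED regressor distribution
        u = rng.normal(0.0, 2.0, n_obs)   # error term: mean 0, variance 4
        y = pop_intercept + 2.0 * x + u   # ASSUMED population slope of 2
        # Design matrix with or without a column of ones for the intercept
        if fit_intercept:
            X = np.column_stack([np.ones(n_obs), x])
        else:
            X = x[:, None]
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        slopes[r] = beta[-1]              # slope is the last coefficient
        resid_means[r] = resid.mean()
        resid_vars[r] = resid @ resid / (n_obs - X.shape[1])  # SSR / (n - k)
    return slopes, resid_means, resid_vars

# (population intercept, does the sample model include an intercept?)
cases = {
    "Case 1": (1.0, True),
    "Case 2": (1.0, False),
    "Case 3": (0.0, True),
}
results = {name: simulate_case(a, fit) for name, (a, fit) in cases.items()}
for name, (slopes, means, variances) in results.items():
    print(f"{name}: slope={slopes.mean():.3f}, "
          f"resid mean={means.mean():.2e}, resid var={variances.mean():.3f}")
```

Because the regressor distribution is assumed, the numbers printed by this sketch will not match the tables in this post; only the qualitative pattern across the three cases is comparable.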

Case 1 – Intercept in population relation and intercept in sample relation

The population mean and variance of the error term are 0 and 4 respectively. The results of the OLS simulation with an intercept in both the population and the sample are as follows.

| Statistic / Variable (Case 1) | Obs | Mean | Std Error | Skewness | Kurtosis |
|---|---|---|---|---|---|
| Slope coefficient | 100 | 1.998 | 0.484 | -0.11 | 3.07 |
| Error term mean | 100 | -2.5E-11 | 3.80E-09 | -0.02 | 2.87 |
| Error term variance | 100 | 3.97 | 0.397 | 0.08 | 2.97 |

Test for the mean of the error term at the population mean of 0:

T = -2.5E-11 / 3.80E-09 = -6.6E-03

Test for the variance of the error term at the population variance of 4:

T = (3.97 - 4) / 0.397 = -0.075

These results show that when both the population and the sample models include an intercept, the slope coefficient is unbiased, and the mean and variance of the error term are statistically equal to their population values. Hence none of the assumptions is violated.
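The two test statistics for Case 1 follow the same pattern: the estimate minus the hypothesized value, divided by the standard error. The short Python sketch below reproduces that arithmetic from the Case 1 table. The helper name `t_statistic` and the cutoff of 1.96 (a large-sample, two-sided 5% critical value) are assumptions made for illustration, not something stated in the original analysis.

```python
def t_statistic(estimate, hypothesized, std_error):
    """Simple t-ratio: distance from the hypothesized value in standard errors."""
    return (estimate - hypothesized) / std_error

CRITICAL = 1.96  # ASSUMED large-sample two-sided 5% cutoff

# Case 1 figures taken from the table above
t_mean = t_statistic(-2.5e-11, 0.0, 3.80e-09)
t_var = t_statistic(3.97, 4.0, 0.397)

for label, t in [("mean", t_mean), ("variance", t_var)]:
    verdict = "reject H0" if abs(t) > CRITICAL else "fail to reject H0"
    print(f"Error term {label}: T = {t:.4f} -> {verdict}")
```

The same helper can be applied to the other two cases by substituting the values from their tables.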

Case 2 – Intercept in population relation and no intercept in sample relation

The population mean and variance of the error term are 0 and 4 respectively. The results of the OLS simulation with an intercept in the population but no intercept in the sample are as follows.

| Statistic / Variable (Case 2) | Obs | Mean | Std Error | Skewness | Kurtosis |
|---|---|---|---|---|---|
| Slope coefficient | 100 | 4.027 | 15.571 | 19.14 | 496.13 |
| Error term mean | 100 | -0.144 | 7.00E-03 | 0.19 | 3.08 |
| Error term variance | 100 | 4.17 | 0.397 | 0.2 | 3.04 |

Test for the mean of the error term at the population mean of 0:

T = -0.144 / 7.0E-03 = -20.6

Test for the variance of the error term at the population variance of 4:

T = (4.17 - 4) / 0.397 = 0.42

Here the slope is already imprecise, and the mean of the error term is statistically different from zero (T = -20.6), which violates one of the OLS assumptions.

Case 3 – No intercept in population relation and intercept in sample relation

The population mean and variance of the error term are 0 and 4 respectively. The results of the OLS simulation with no intercept in the population but an intercept in the sample are as follows.

| Statistic / Variable (Case 3) | Obs | Mean | Std Error | Skewness | Kurtosis |
|---|---|---|---|---|---|
| Slope coefficient | 100 | 1.977 | 0.471 | -0.07 | 2.95 |
| Error term mean | 100 | 1.30E-11 | 2.03E-09 | -0.13 | 3.4 |
| Error term variance | 100 | 3.99 | 0.409 | 0.21 | 3.23 |

Test for the mean of the error term at the population mean of 0:

T = 1.30E-11 / 2.03E-09 = 0.006

Test for the variance of the error term at the population variance of 4:

T = (3.99 - 4) / 0.409 = -0.024

Surprisingly, even though the population has no intercept, the sample OLS with an intercept performs as well as in Case 1, with an unbiased slope coefficient and an error term that has zero mean and constant variance.
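One way to see why Cases 1 and 3 report residual means on the order of 1E-11 while Case 2 does not is through the OLS first-order conditions. The sketch below writes $\hat\beta_1$, $\hat\beta_2$, $\hat u_i$ for the estimates and residuals of the model with an intercept, and $\tilde\beta_2$, $\tilde u_i$ for the model without one. This notation is introduced here for illustration and is not taken from the original simulation.

```latex
% With an intercept: minimising over both coefficients gives two conditions.
\[
\min_{\hat\beta_1,\,\hat\beta_2} \sum_{i=1}^{n} \bigl(Y_i - \hat\beta_1 - \hat\beta_2 X_i\bigr)^2
\;\Longrightarrow\;
\sum_{i=1}^{n} \hat u_i = 0,
\qquad
\sum_{i=1}^{n} X_i \hat u_i = 0 .
\]
% Without an intercept: only the second condition survives.
\[
\min_{\tilde\beta_2} \sum_{i=1}^{n} \bigl(Y_i - \tilde\beta_2 X_i\bigr)^2
\;\Longrightarrow\;
\sum_{i=1}^{n} X_i \tilde u_i = 0,
\qquad
\frac{1}{n}\sum_{i=1}^{n} \tilde u_i = \bar Y - \tilde\beta_2 \bar X .
\]
```

So whenever the sample model contains an intercept, the residual mean is zero by construction (up to floating-point rounding), whatever the population model looks like. Without an intercept, the residual mean equals $\bar Y - \tilde\beta_2 \bar X$, which is generally nonzero.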

Conclusion:

After revisiting the intercept issue from another angle, we found strong evidence that the sample model should include an intercept. Even though the population intercept was only 1 (a very small value), omitting it from the sample model makes the results deviate too much.

Stata do-file: residual analysis of intercept

Reference:

Gujarati, D. N. (2003). Basic Econometrics (4th ed.)