# Multicollinearity; A Statistical vs Conceptual Concept Debate

Before starting the discussion, I will give one applied example of why multicollinearity is an issue.

Suppose the dependent variable is my learning of some concepts, and the independent variables are the help provided by two persons in the room (two variables). If the two gained their experience from different sources, then when they speak (a change in the independent variables) I learn a lot (a change in the estimated dependent variable). That is good. But suppose they influence each other, for example one is the teacher and the other is his student, so that the second variable is a function of the first. If both variables change, I will still learn, but I will not be able to tell who is contributing to my learning. See the following illustration.

My learning = α + β (concepts provided by person 1) + θ (concepts provided by person 2) + μ

If person 1 is the teacher of person 2, then

concepts of person 2 = f(concepts of person 1)

Substituting this into the regression above gives

My learning = α + β (concepts provided by person 1) + θ f(concepts of person 1) + μ

So if f is linear (person 2 simply passes on a scaled copy of what person 1 says), it is literally impossible to separate β from θ. Even when the dependence is only approximate, the two effects are hard to tell apart.
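To make the point concrete, here is a minimal sketch that assumes f is exactly linear, with person 2 contributing twice what person 1 does. The numbers and the factor 2 are made up purely for illustration. It shows that two very different pairs of β and θ produce exactly the same fitted learning, so the data cannot choose between them.

```python
# Hypothetical data: concepts provided by person 1, and person 2 as an
# exact linear function of person 1 (the factor 2.0 is an assumption).
person1 = [1.0, 2.0, 3.0, 4.0, 5.0]
person2 = [2.0 * x for x in person1]


def predict(alpha, beta, theta):
    """Fitted learning for the given coefficients (error term left out)."""
    return [alpha + beta * x1 + theta * x2 for x1, x2 in zip(person1, person2)]


# Two coefficient pairs that share the same value of beta + 2 * theta = 5.
fit_a = predict(alpha=1.0, beta=5.0, theta=0.0)  # all credit to person 1
fit_b = predict(alpha=1.0, beta=1.0, theta=2.0)  # credit split between both

# The fitted values coincide, so no estimator can tell the two stories apart.
same = all(abs(a - b) < 1e-12 for a, b in zip(fit_a, fit_b))
print(same)
```

Only the combination β + 2θ is pinned down here; any split of it between the two persons fits equally well.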

Now to the discussion. My colleague and I had a detailed discussion about the definition of multicollinearity. We both agree on its definition and on how it creates problems; the major difference is over the role it plays in regression analysis.

In my view, multicollinearity is only present if it is demonstrated statistically (i.e. VIF > 10), while my colleague argues that if two variables are interrelated theoretically, then there is conceptual multicollinearity in the model even if the VIF is below 10. Let us take an example.

Consider a model of the exchange rate known as the CHEERS model, which can be seen in this paper.

Ex Rate = α + β1 Foreign Prices + β2 Domestic Prices + θ1 Foreign Interest rate + θ2 Domestic Interest rate + μ

In estimating this model, the exchange rate is rupees per dollar, so the domestic country is Pakistan and the foreign country is the USA. The arguments are as follows.

Argument 1:

My argument is that if the VIF of every independent variable is smaller than 10, then we can safely run the estimation, because the multicollinearity is ignorable (Basic Econometrics, Gujarati). Even if there is conceptual multicollinearity, it is not borne out by the VIF test, so we can ignore it.
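For reference, the VIF of a regressor is 1 / (1 − R²), where R² comes from regressing that regressor on all the other regressors. The sketch below covers only the simplest case of two regressors, where that R² is just the squared correlation between them. The data are made-up numbers, not actual price or interest rate series, and the function names are my own. For the four-regressor exchange rate model above, each VIF would need a full auxiliary regression instead.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_regressors(x1, x2):
    """VIF = 1 / (1 - R^2); with two regressors R^2 is the squared correlation."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Made-up illustrative series (assumption: not real data).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

r = pearson_r(x1, x2)
vif = vif_two_regressors(x1, x2)
print(f"correlation = {r:.3f}, VIF = {vif:.2f}")
```

Here the two series have a correlation of roughly 0.83, yet the VIF is only about 3.2. In the two-regressor case a VIF of 10 corresponds to a correlation of about 0.95 in absolute value, so two variables can be quite strongly related and still pass the VIF < 10 rule.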

Argument 2:

My colleague's argument, however, is that domestic prices and the domestic interest rate are connected to each other via the Fisher equation, so this equation suffers from conceptual multicollinearity and is therefore not correct. It cannot be ignored even if the VIF does not show it, because this is a significant and globally accepted relationship.
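As a reminder, the Fisher equation in its common approximate form can be written as below. The symbols are my own notation, not taken from the paper: i is the nominal interest rate, r the real interest rate, and π with superscript e the expected inflation rate.

```latex
i_t \approx r_t + \pi^{e}_{t}
```

Note that the relation is stated in terms of inflation, that is, the growth rate of prices, rather than the price level itself.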

Hence I put this case forward for open discussion.

## One thought on “Multicollinearity; A Statistical vs Conceptual Concept Debate”

1. Kashif Ali says:

I will add to the second argument. According to Stock and Watson, two or more variables can be included in a single model if they fulfil two conditions.

Suppose we have the original model:
Yt = a + bXt + error
and we are going to add another variable, Zt.
To be included, Zt must fulfil both conditions.
The first is that Zt must be correlated with Yt, i.e.
Cor(Zt, Yt) ≠ 0

The second, which is more crucial, is that Zt must have some degree of association with Xt:
Cor(Zt, Xt) ≠ 0

If the second condition is not fulfilled, Zt is not relevant to this model and a separate regression should be run.

So some degree of association is good. Multicollinearity is therefore not a problem in itself; it becomes a problem only if it is severe. So, as far as I understand, argument 2 is not valid.