A manual for ARDL approach to cointegration

The ARDL bounds testing approach was introduced by Pesaran et al. (2001) in order to incorporate I(0) and I(1) variables in the same estimation. If all of your variables are stationary, I(0), then OLS is appropriate, and if all of them are non-stationary, I(1), then it is advisable to use a VECM (the Johansen approach), as it is a much simpler model.

We cannot run conventional OLS if any or all of the variables are I(1), because such variables do not have the constant mean and variance that OLS assumes. As most of them are changing over time, OLS will mistakenly show high t values and significant results, but in reality these are inflated by the common time component. In econometrics this is called a spurious regression, and a common warning sign is that the R-squared of the model is higher than the Durbin-Watson statistic. So we move to a set of models which can work with I(1) variables.
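The warning sign described above (R-squared higher than the Durbin-Watson statistic) can be sketched in a few lines of Python. This is only an illustration and not part of the STATA or Microfit workflow: the function names are made up for this example, and the residuals and the R-squared of 0.95 are hypothetical numbers.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic of a list of regression residuals."""
    # Sum of squared changes between successive residuals ...
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    # ... divided by the sum of squared residuals.
    den = sum(e ** 2 for e in residuals)
    return num / den


def looks_spurious(r_squared, dw):
    """Rule of thumb: suspect a spurious regression when R-squared > DW."""
    return r_squared > dw


# Hypothetical residuals that drift slowly instead of jumping around zero,
# which is what typically happens when trending series are regressed on each other.
residuals = [1.0, 1.2, 1.1, 0.9, 0.4, -0.3, -0.8, -1.1, -1.2, -1.0]
dw = durbin_watson(residuals)
print(round(dw, 2), looks_spurious(0.95, dw))
```

A Durbin-Watson value near 2 means the residuals are not serially correlated; slowly drifting residuals push it towards 0, which is why a high R-squared next to a very low DW is suspicious.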

How to Estimate ARDL model

Before running ARDL, some preconditions need to be checked:

  • The dependent variable should be non-stationary, I(1), for the model to behave better.
  • None of the variables should be I(2) under normal conditions (ADF test).
  • None of the variables should be I(2) when a structural break is allowed for (Zivot-Andrews test).

Step 1 Check Optimal Lag order

First we need to find the lag order to use in the ADF test for each variable in the model. In STATA this can be done quickly with the varsoc command, which reports lag-order selection statistics. In EVIEWS you have to estimate a VAR model first and then check the lag length criteria; you can learn that from this blog post by Dave Giles.

To learn how to import and handle data in STATA visit

http://econistics.wix.com/home#!Chapter-1-Importing-data-in-STATA/cgor/90096572-0B0D-4832-9AFD-51C18A0017F0

[Image: varsoc output]

This test reports several statistics (LL, LR, FPE, AIC, HQIC, SBIC) and puts a star on the lag that each criterion selects as optimal. We select the lag chosen by the majority, so here the lag order is 1 for this variable. Similarly, you have to find the optimal lag for every variable. The code for this in STATA is “varsoc variable-name”.
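The "select the majority" rule can be written down as a small Python sketch. The starred lags below are hypothetical (typed in by hand from a varsoc table, not produced by STATA), the function name is made up, and breaking ties in favour of the shorter lag is my own assumption.

```python
from collections import Counter


def majority_lag(starred):
    """Return the lag chosen by most criteria.

    `starred` maps each criterion name to the lag that carries its star.
    Ties are broken in favour of the smaller lag (an assumption, for parsimony).
    """
    counts = Counter(starred.values())
    top = max(counts.values())
    return min(lag for lag, n in counts.items() if n == top)


# Hypothetical stars read off a varsoc table for one variable
starred = {"LR": 1, "FPE": 1, "AIC": 2, "HQIC": 1, "SBIC": 1}
print(majority_lag(starred))  # 1
```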

Check the stationarity of each variable

Now we need to confirm that none of the variables is I(2). For this we run the ADF (Augmented Dickey-Fuller) test and look at the Z(t) test statistic at the top of the output. When testing at level, if the test statistic is smaller in magnitude than all of the critical values (with the same sign), we cannot reject the unit root, so the variable is non-stationary at level. Then we repeat the test on the first difference: if the variable is stationary there, it is I(1).

[Image: dfuller output at level]

In the first-difference table, the test statistic is now larger in magnitude than at least one of the critical values, so the variable is I(0) at first difference, that is, I(1) at level.

[Image: dfuller output at first difference]
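The reading of the two ADF tables can be summarised as a decision rule. The following Python sketch only encodes that rule; it does not run the test. The function names are made up, the Z(t) values and critical values are hypothetical, and for simplicity the same critical values are used for the level and the first-difference test, although in practice they can differ slightly.

```python
def rejects_unit_root(z_t, critical_values):
    """True if Z(t) is more negative than at least one critical value."""
    return any(z_t < cv for cv in critical_values)


def integration_order(z_level, z_diff, critical_values):
    """Classify a variable from its ADF statistics at level and first difference."""
    if rejects_unit_root(z_level, critical_values):
        return "I(0)"            # already stationary at level
    if rejects_unit_root(z_diff, critical_values):
        return "I(1)"            # stationary only after first differencing
    return "I(2) or higher"      # not usable in ARDL as it stands


# Hypothetical 1%, 5% and 10% critical values and test statistics
critical_values = [-3.75, -3.00, -2.63]
print(integration_order(-1.20, -4.10, critical_values))  # I(1)
```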

Next we check whether the variable is sensitive to a structural break. We run a unit root test that allows for a structural break (the Zivot-Andrews test); if the variable turns out to be I(2) in this test, there is a problem that needs to be addressed: we might have to add a structural break dummy to the model, or remove the break from the variable. The test is checked and interpreted in the same way as the ADF test.

[Image: zandrews output]

Once you have shown that none of the variables is I(2), you can proceed to the ARDL model. This document will help you to do the ARDL estimation in Microfit. First we have to import the data into Microfit. This can be done by first copying all the data, including the years and the variable names (without spaces). Then open Microfit, go to File and select the option to copy data from the clipboard. It will show you the data; press OK there.

It will ask whether you have the years in the leftmost column; if you copied the years, select that option. If you also copied the names of the variables, select that here too. As there was no description of the variables in the copied file, select no description and press OK.

You can get Microfit 5 from here

https://www.dropbox.com/s/r0viysvisecldho/Setup_Demo%20microfit%205.rar?dl=0

You will see your variable names in the center in red, like in the picture below

[Image: Microfit start screen]

Now you need to add the intercept and the trend term, which might be used in the estimation. For the intercept, press the constant term button at the bottom right; it will ask what name you want to give it. I usually change it to “c” so that it is as short as possible. For the trend, press the time trend button and it will ask you to name it; I usually keep it as “t” so that it is short too. After this your data is ready for ARDL estimation. Go to Univariate at the top and select ARDL approach to cointegration; it will show you the estimation page like the image below

[Image: Microfit ARDL estimation page]

Here you have to enter the names of your variables (with the same spelling as in the Excel file, without spaces; it is good to keep them as short as possible so that they are easy to remember and retype) in the appropriate order: the dependent variable first and all the others after it. At the end enter “&”, followed by c for the constant and t for the trend (if needed). The “&” sign separates the variables from the deterministic (exogenous) terms in Microfit. When you have entered the variable names, set the lag order and press Run at the top right. Here the lag order of the ARDL is 1, so try the model at lag order 1 first; if that does not work, we will try lag order 2.

When you press Run it will take some time; as it is a DEMO version you have to press OK a few times. If there is no error, it will show you a menu (let’s call it menu 1, for selecting the lag order criterion) like the one below

[Image: menu 1]

Here you have to choose the criterion you want to use; take the Schwarz Bayesian criterion as the standard and press OK. (You can specify your own ARDL order with option 6 if none of the criteria above gives appropriate diagnostics. You can build your own order from the unit root results: I(0) variables are not related to their past, so their lag order should be zero, and the others can be one. A second method is to look in the ARDL results for variables whose level is significant, as these do not need a lag. It is really a trial and error method; if you have studied time series you might have the intuition for finding an appropriate lag when none of the criteria provides suitable results.)

After a little while it will show you another menu (call it menu 2, for ARDL model selection) like this

[Image: menu 2]

Here you first have to check the ARDL regression for fitness, so select option 1 and press OK. It will show you results like this

[Image: ARDL model results]

In this table you have to note the highlighted things. First and foremost, check whether the F-statistic (5.85) is higher than the upper bound at 95% (4.82) or at 90% (3.87); either one will do. If it is higher, you can say that there is cointegration among the set of I(0) and I(1) variables, so we can assume that there is at least a long run or a short run relation among them. If the F-statistic is not higher than any of the upper bound critical values, first try to modify the lag order so that the F-statistic may rise above the upper bound; if that is not possible, see the guidance in the following passage.

If the F-statistic is higher than the lower bound but lower than the upper bound, the test is inconclusive, and you must check your I(1) variables to see whether any is redundant (remove it) or whether you have missed one (add it). If the F-statistic is lower than the lower bound, then your I(0) and I(1) variables are not cointegrated. Try to change the model: add more variables or modify their specification if possible. Otherwise you can report that these variables are not related to each other and that their relationship is spurious.
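The three outcomes of the bounds test can be sketched as a small Python function. The F-statistic (5.85) and the upper bounds (4.82 at 95%, 3.87 at 90%) are the values quoted above; the lower bounds are not given in the text, so the ones used here are hypothetical placeholders, and the function name is made up.

```python
def bounds_decision(f_stat, lower, upper):
    """Compare the ARDL F-statistic with the lower and upper critical bounds."""
    if f_stat > upper:
        return "cointegrated"
    if f_stat < lower:
        return "not cointegrated"
    return "inconclusive"


F_STAT = 5.85
UPPER_95, UPPER_90 = 4.82, 3.87   # from the Microfit table discussed above
LOWER_95, LOWER_90 = 3.50, 2.80   # hypothetical placeholders, not from the table

print(bounds_decision(F_STAT, LOWER_95, UPPER_95))  # cointegrated
print(bounds_decision(F_STAT, LOWER_90, UPPER_90))  # cointegrated
```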

If you found the F-statistic larger than the upper bound, as in the image above, then you have to look at the diagnostics. Four are provided at the end of the table (serial correlation, functional form, normality and heteroskedasticity). First is serial correlation, which is insignificant according to the F version but significant at 5% in the LM version, so we can conclude that there is no autocorrelation at the 1% level, or according to the F version. Similarly, functional form is insignificant (no issue), normality is insignificant (no issue) and heteroskedasticity is insignificant (no issue) too. Hence there is no apparent issue with this model, so we can proceed to the next step.

If you find any issue, try changing the lag order, increasing the sample or changing the variable specification to correct it. If your sample is larger than 30, you can ignore a normality issue, if there is one, by appeal to the central limit theorem. Another piece of information, at the top of the table, is the lag order of the estimation, which is to be reported in the regression analysis. Press Close to proceed; it will show you a new menu (call it menu 3, the post regression menu) like this

[Image: menu 3]

Here select the second option, to move to the hypothesis testing menu, as we need to check whether the recursive residuals are unstable because of a structural break, since ARDL is sensitive to this. When you press OK you will see another menu (call it menu 4, the hypothesis testing menu) like this

[Image: menu 4]

Here you need the CUSUM and CUSUMSQ charts, which are option 4; select it and press OK. It will show you two charts one after the other as you press the CLOSE button; copy them into your estimation document. They look like the following. The thing to note is that the blue line must not cross the red or the green line in either chart. If it does not cross, there is no instability in the recursive residuals in terms of the mean (first chart, CUSUM) or in terms of the variance (second chart, CUSUMSQ), so you can proceed. If you find an issue here, you may have added a variable which is sensitive to a structural break. It might be solved by adding the trend; otherwise you have to find the structural break in the dependent variable, make a dummy from it and introduce it as an independent variable in order to balance the residuals.

[Image: CUSUM and CUSUMSQ charts]
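Microfit draws the significance lines for you, but the "must not cross" check for the first chart can be sketched in Python. This sketch assumes the standard 5% CUSUM lines of Brown, Durbin and Evans (constant 0.948); whether Microfit uses exactly these lines is an assumption on my part. The function names, the sample size and the CUSUM path are hypothetical, and the CUSUMSQ chart is not covered.

```python
import math

A_5PCT = 0.948  # constant for the 5% significance lines (Brown, Durbin and Evans)


def cusum_bound(t, n_obs, n_params):
    """Upper 5% significance line at observation t (the lower line is its negative)."""
    root = math.sqrt(n_obs - n_params)
    return A_5PCT * (root + 2 * (t - n_params) / root)


def cusum_is_stable(cusum, n_obs, n_params):
    """True if the CUSUM path never leaves the band.

    cusum[i] is the CUSUM value at observation t = n_params + 1 + i.
    """
    for i, w in enumerate(cusum):
        t = n_params + 1 + i
        if abs(w) > cusum_bound(t, n_obs, n_params):
            return False
    return True


# Hypothetical example: 30 observations, 6 estimated parameters, a path that stays near zero
path = [0.5, 1.0, 0.8, 1.5, 2.0, 1.2] + [1.0] * 18
print(cusum_is_stable(path, n_obs=30, n_params=6))  # True
```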

When you close both charts it will show you menu 4 again. You have to go back, so select option 0; it will then show you menu 3, where you go back again by selecting option 0 and pressing OK. Now you want to see the long run results of the model, as it has passed the diagnostics (F-statistic, autocorrelation, heteroskedasticity, specification, normality, CUSUM and CUSUMSQ).

It will show you results like the following. Note the highlighted things, as they are the long run estimates; use each coefficient with its t value and probability to interpret the model in the long run. Here most of them are insignificant. This does not mean that the model is bad: the variables are cointegrated, as we saw from the F-statistic in the first table, so they might be affecting each other in the short run even if they are not in the long run.

[Image: long-run results]

These results can be written as follows; LFDI (shown in green in the original) is the only significant variable

LGDP = -0.70 + 0.64 LFDI - 0.003 TRA + 0.22 INF - 0.04 CRED + 0.002 MC

Variable   Constant   LFDI   TRA     INF    CRED    MC
t value    -0.23      2.83   -0.71   1.68   -1.32   0.86
prob       0.81       0.01   0.49    0.11   0.20    0.41
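Reading significance off the probability row can be sketched in Python. The p-values are the ones reported above; the function name and the label "Constant" for the intercept are my own.

```python
def significant(p_values, alpha=0.05):
    """Names of the terms whose p-value is below the chosen significance level."""
    return [name for name, p in p_values.items() if p < alpha]


# Long run p-values as reported above
long_run_p = {"Constant": 0.81, "LFDI": 0.01, "TRA": 0.49,
              "INF": 0.11, "CRED": 0.20, "MC": 0.41}

print(significant(long_run_p))        # ['LFDI']
print(significant(long_run_p, 0.10))  # ['LFDI']
```

INF, with a probability of 0.11, just misses the 10% level, so LFDI remains the only significant long run variable at either level.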

Press Close and you are on menu 3 again; go back to menu 2 by selecting option 0 and pressing OK. Now you want the error correction model, to see the short run results. It will show you results like the following

[Image: short-run results]

Here you have to note the following. The estimates at the top of the table are the short run components. ecm(-1) is the most important: it should at least be negative and significant, and ideally it lies between 0 and -1 (these conditions ensure that there is convergence in the model, which indirectly means that there is a significant long run relation).

Here it is significant at the 10% level and lies between 0 and -1 (it is -0.11). The nearer it is to -1, the stronger the equilibrium, but its significance is a must. So we have shown that there is an equilibrium, though it is weak or slow. The rest of the variables show the short run components; the significant ones have a significant effect on the dependent variable in the short run. If a variable has both its short run and long run components significant, we can say that it has a strong causal effect on the dependent variable; if it is significant in the short run only, the causal effect is weak. It is reported as

ΔLGDP = 0.07 ΔLFDI + 0.0002 ΔTRA + 0.03 ΔINF - 0.01 ΔCRED + 0.0003 ΔMC - 0.12 ECM(-1)

Variable   ΔLFDI   ΔTRA   ΔINF   ΔCRED   ΔMC    ECM(-1)
t value    2.66    0.25   2.82   -4.24   0.81   -1.78
prob       0.02    0.80   0.01   0.00    0.43   0.09
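The conditions on ecm(-1) can be sketched in Python, using the coefficient (-0.12) and probability (0.09) from the reported equation. The function names are made up. The half-life calculation is an addition of mine, not something Microfit reports: it is one common way of putting a number on "weak or slow", assuming a share of the gap equal to the size of the coefficient is closed every period.

```python
import math


def ecm_is_valid(coef, p_value, alpha=0.10):
    """ecm(-1) should lie between -1 and 0 and be significant."""
    return -1 < coef < 0 and p_value < alpha


def half_life(coef):
    """Periods needed to close half of a deviation from the long run equilibrium."""
    return math.log(0.5) / math.log(1 + coef)


print(ecm_is_valid(-0.12, 0.09))        # True: significant at the 10% level
print(ecm_is_valid(-0.12, 0.09, 0.05))  # False: not significant at the 5% level
print(round(half_life(-0.12), 1))       # 5.4
```

With annual data this would mean that it takes a little over five years to remove half of a disequilibrium, which matches the description of the adjustment as slow.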

Other things to note are the R-squared and the F test. The R-squared is interpreted as in OLS, and the F test shows the overall fitness of the model; if the model is too weak it becomes insignificant. Another thing here is the residual sum of squares, which can be used to compare the performance of this model with another ARDL model that has the same dependent variable. You cannot interpret the Durbin-Watson statistic here, as there are lags in the model; there is no need to worry about it, because the serial correlation test in the first table has already cleared the model of autocorrelation. In this short run model, FDI, INF and CRED are significant. So this is your ARDL model. There is one more step that is usually done, to see whether the ARDL model is consistent. In the example above the model is GDP = f(FDI, TRA, INF, CRED, MC). The F test in the first table established that this model is valid, but it does not tell us anything about the reverse models, such as

FDI = f(GDP, TRA, INF, CRED, MC)

TRA = f(FDI, GDP, INF, CRED, MC)

INF = f(FDI, TRA, GDP, CRED, MC)

CRED = f(FDI, TRA, INF, GDP, MC)

MC = f(FDI, TRA, INF, CRED, GDP)

If any of these models is also valid, then the ARDL results we have found become inconsistent. That is to say, this approach is a single equation model, but there would then be two or more equations to estimate, which requires a simultaneous equation model. That is why it was advised at the start that, if all variables are I(1), the Johansen approach (VECM) can be a useful method. So we check the consistency of the model by running these 5 models in Microfit and looking at their F test values in the first table, just as we did in the example above, hoping that for all 5 models the F test values are lower than the lower bound values.
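To avoid typing mistakes when running the reverse models, the variable lists can be generated with a short Python sketch. It builds, for each choice of dependent variable, the line to type into the Microfit ARDL box in the format described earlier (dependent variable first, then the others, then “&” and the deterministic terms). The function name is made up, the variable names are taken from the reported equations, and including only the constant “c” is an assumption; add “t” if your model uses the trend.

```python
def rotated_specs(variables, deterministics="c"):
    """One Microfit-style variable list per choice of dependent variable."""
    specs = []
    for dep in variables:
        # Keep the remaining variables in their original order
        others = [v for v in variables if v != dep]
        specs.append(" ".join([dep] + others) + " & " + deterministics)
    return specs


variables = ["LGDP", "LFDI", "TRA", "INF", "CRED", "MC"]
for spec in rotated_specs(variables):
    print(spec)
```

The first line printed is the original model; the remaining five are the reverse models whose F test values should fall below the lower bound.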

If you want to see its details see the following links

Detailed ARDL model by Dave Giles

http://davegiles.blogspot.in/2013/06/ardl-models-part-ii-bounds-tests.html

Learn it from video

http://econistics.com/2015/09/chapter-5-time-series-domain-ardl-co-integrating-bounds-using-microfit-and-eviews-2/

This attempt to explain ARDL is preliminary and subject to improvement based on the comments and suggestions below. If you think there is room for improvement, let me know, because we can change the description above at any time, unlike a book, which cannot be modified after being published. To polish your ARDL skills it is good to read articles that use it after reading this blog, and see how people have presented and explained it.

307 thoughts on “A manual for ARDL approach to cointegration”

    1. Nadya says:

      Sorry if this kind of question quite out of discussion topic above. Im the one that wanna to know about Structure Equation Model (SEM). If you know about this method, i beg for your explanation. Thanks ahead.
      Nadya

  1. Vishal says:

    Is it possible to estimate time varying regression coefficient using ARDL model?
    if yes then tell me procedure
    if not the tell me some other model to estimate time varying regression coefficient

      1. Noman Arshed says:

        Just use the trend variable and multiply it with each independent variable to make new variables, then use them as independent variables in the ARDL regression. While interpreting, make sure that you consider both the independent variable and the trend in the same variable, just as when we interpret a cross-product variable.

      2. Vishal says:

        Hey, I need some clarification. Actually i want to use lag value of my dependent variable as my independent variable but my dependent variable is I(0) and my other independent variable is also I(0) then is it good to use ARDL model?

      3. Noman Arshed says:

        If all your variables are I(0), there is no need to use ARDL, because it is expected that all short run coefficients will be insignificant. First, try OLS, add a lag of the dependent variable if theory suggests it, and do the diagnostic testing properly.

      4. Vishal says:

        I had used standard OLS model but there is problem of auto-correlation in model due to which my variables are inefficient.

  2. Miftahu Idris says:

    Dear Sir.
    Thank you for sharing your vast knowledge and experience, we appreciate.
    I run an ARDL model of six (6) variables but my diagnostic result indicates that CUSUM test is significant but CUSUMSQ is not. Please, what may likely be the problem and how should I solve it.
    Thanks, in anticipation of your kind assistance.

  3. Miftahu Idris says:

    Dear Sir.
    Thank you for your academic guidance. I have a problem with my model which I am using for my Master’s thesis, and I hope you can assist me. I run an ARDL model of 6 variables, my results of ECT(-1) shows a positive coefficient of 0.72 with an insignificant probability value of 0.64. I have encountered many challenges in my estimation. Once I solve one problem, another estimation problem will appear (crying). The more I read literature on how to solve this problem, the more I get confused. Please, Sir, what may likely be causing this, and how to solve it? I can send you my work file for further assistance.
    As you continue to help others, May Allahu (SWT) continue to rewards you with the best of his reward, Ameen.
    Looking forward to seeing your reply.
    Thanks, in anticipation for your kind response and assistance.

    Yours Sincerely,
    Miftahu Idris.

    1. Noman Arshed says:

      ECT value comes out wrong mostly because of the wrong combination of independent variables. Try to see literature and find 2-3 more variables and find best combination from the available independent variables.

  4. youssef says:

    I fully appreciate your response dear Dr. Noman…

    sorrry Dr….
    can I delete trend variable if I found it insignificant

  5. saif arian says:

    if ECT value is between 0 to -1 and statistically significant, what it means . and further in the presence of controlled variables the Value of ECT is between 0 to -1 but long run coefficients of primary explanatory variables are positive but insignificant then how we will interpret the model

    1. Noman Arshed says:

      ECT between 0 and -1 tells that theory is converging, and if long run variables are insignificant it will tell that they are individually not contributing change in the dependent variable

  6. saif arian says:

    if Value of ECT is between 0 to -1 but long run coefficient of primary explanatory variable in the presence of controlled variables is positive but insignificant then how we will interpret the model

  7. Ole Martin Eidjord says:

    Dear Arshed

    Thank you so much for sharing this very helpful information about ARDL models. I have a question

    I`m trying to make an ARDL model and have 6 varaibles were the dependent variable are I(0), stationary at level, with 6 lags and the 5 remaining independent variables are I(1), stationary at first differances, with 1 lag. Under “How to estimate ARDL model” your first requirements says that the dependent variable has to be I(1) in order for the model to work better. Does this mean that I can not use an ARDL model since my dependent variable are I(0), stationary at level?

    Thanks in advance

    1. Noman Arshed says:

      ARDL might work when the dependent is I(0), but even if it is I(1) we do not usually use more than 1 or 2 lags of the dependent. You have used 6; this means the independent variables are not closely related to the dependent.

  8. Chan says:

    Dear Dr.

    Thanks for your post and it’s really helpful.
    May I ask if unit root test found one of my dependent variable and independent variables is I(1) while other is I(0). During ARDL test, shall I use first difference on the independent variable and the I(1) variable?

    E.g Y= a+ b1x1+b2x2+b3x3+c
    Y and X1 is I(1), x2,x3 is I(0)
    During ARDL, shall we first differentiate on Y and X1 before put into software for estimation? That is transformed to below format:
    d(Y)= a+d(X1)+X2+X3+c for estimation?

    Thanks

  9. youssef says:

    Dear Arshed….
    Thank you so much for useful information about ARDL models
    sorry …
    when I try to estimate long-run relationship should add intercept and the trend term to the model or can I remove the trend term in order to get cointegration.

    1. youssef says:

      sorry, Dr …you have mentioned previously entered the trend term if you needed. with a note, I have used time series data

  10. youssef says:

    thank you so much for replying, really I appreciate your useful information.
    sorry Dr…my last question .. it is possible to use different indicators of variables when I try to estimate long-run relationship among the variables .. for example, some of them as a percentage and some of them as a value or should unifying all of them.

  11. masitah says:

    Hi nomad i’m using microfit 5 demo version but i cannot run data for ecm and cusum? the software said (too many variable). How to solve this problem or demo version have limitation?Thanks

  12. benhilda gwacha says:

    Hie…..i was considering a var model with 4 variables….however 3 of them are I(1) and one is I(0)….also the I(1) are integrated so which model should i now use……please help me

    Thank you

  13. Minhaz Azad says:

    Firstly, Do I need to test structural breaks for all variables separately or whole model( Consequently whether we need to use dummy variable for each variable or just one for the whole model? Do we need to include dummy variable for co-integration.
    Secondly, Could you please explain how to apply unrestricted VAR model for testing causality with in ARDL framework( many people answer that VAR different from ARDL, then my question many papers have done that but can not find the clue, for example . I dont understand how they figure table 3.
    answer with STATA help will be great help.

    1. Noman Arshed says:

      structural break testing in the dependent variable only and use the dummy for that break as the independent variable. VAR is different from ARDL. see this blog and if you want to learn var see following link

  14. woo says:

    hello dear Dr. Noman, Can I apply cointergration bound test not only usual ARDl modle also in ‘Panel’ ARDL model ?
    say, b1=b2=..=0 type test

  15. Julius Mark says:

    Hello dear Dr. Thanks for sharing your knowledge very helpful.
    I had a problem with my research, i’m using six variables,
    1. The dependent is stationary at first difference,
    2. The three independent variables are stationary at second difference,
    3. One independent is stationary at first difference,
    4. also one independent is not stationary at all level.

    Is it possible to run ARDL model or what could be correct model for this.

    Thanks

    1. Noman Arshed says:

      It is very rare for variables to be stationary at second difference, use at least two types of unit root tests for consensus. If it is same then remove the variables which are stationary at second difference, other will work in ARDL.

      1. Desiarjay says:

        If my dipendent is I(2), can I use the first difference? More in general, if i have 4 variables, 2 I(2), one I(1) and one I(0), does Ardl work?
        Thank you

      2. Noman Arshed says:

        Firstly, I(2) variables are very rare; use more than one type of unit root test to confirm that the variable is I(2). Convert all I(2) variables into first difference form; they will then be I(1), and usable.

  16. Abid Khan says:

    it the F-Statistic value in Wald test is less than the lower bound value then according to your discussion we have to report that the relation among these variables is spurious. but my question is, how we can interpret the coefficient of GDP mean that if the Coefficient of GDP increases it will decrease/ increase how much in non-performing loans?

    Regards;
    Abid Khan

  17. Hager says:

    Is there any link between (short run and long run causality) and ARDL results (short and long run estimation coefficients)

  18. Miftahu Idris says:

    Good Day Sir.
    I estimated an ARDL model of 6 variables. After several attempts (using different lags ) to find a better estimate, i got a selected ARDL model using AIC as (1,1,0,0,1,2) while using SIC is ARDL (1,0,0,0,1,2).
    Can I still use this model given these lags selection?
    ARDL (1,1,0,0,1,2) = AIC
    ARDL (1,0,0,0,1,2,) = SIC
    Thank you and best regards.

  19. Ahmad says:

    Respected Sir,
    I am applying ARDL in Eviews 9 while applying ARDL co-integrating long run form the probability of variables are not significant and the signs of some variables are also opposite to the existing theories but the co-integrating term is significant and negative. I also changed the lags for both dependent and regressors but the problem remain the same. My data is for 32 years. Kindly suggest solution to handle this problem.
    Best Regards.

  20. Béatrice says:

    Dear Norman,
    Thank you for this explanation!!!
    Is the dummy variable stationary?
    Thank you in advance for your answer

  21. Béatrice says:

    thank’s for your answer, it is very important to know.
    Concerning the causality (short term and long term), how to apply this test in an ARDL approach?
    Is it possible to apply the Granger non causality test for the short term in an ARDL approach?
    Or, the Wald test gives results for the short term and the long term? If yes, how to distinguish these results?
    Best regards.

      1. Julius Mark says:

        Thanks Beatrice for your question to Noman, I had that problem of Granger causality test after running ARDL. How do we go about, i am using Eviews

      2. Noman Arshed says:

        Actually, F bound test in PSS ARDL model itself describes causality when variables are mixed order. Granger causality is the approach usually used when variables are I(1). but still, if you want to do Granger causality, then apply Wald test on short run coefficients for short run causality and on long run coefficients for long run causality.

      3. Julius says:

        Thanks Sir. for your explanation, unfortunately i used Wald test but panelists they challenge me and say it is not granger causality

      4. Noman Arshed says:

        you should have a look what are the preconditions of Granger causality, I think it is used if all variables are I(1) but if you have used ARDL because of I(0) and I(1) mixed variables, you cannot use Granger causality

  22. Béatrice says:

    Julius: fr short term, i use Granger’s non-causality test in VAR approach, for the long term, i use the Wald test (not the block test) but rather the Wald test in the ARDL approach which tests the null hyopthesis according to which The regressors have null coefficients. If the student statistic is significant, this means that the explanatory variables cause the dependent variable (only one orientation).
    Excuse me for the mistakes of spelling, I do not master English.

    1. Julius Mark says:

      Thank you Beatrice and Noman for your contributions, from there and reading more i think i can do better

  23. Emilia says:

    Dear Sir,
    I am thankful for this ARDL learning post. I would like to ask about multicollinearity.

    When I built a model, I get the next message from Microfit: “Multicollinear regressors and their lagged value!”
    What can I do in this case to go on? How to handle it in Microfit?

    Thanks a lot.
    Regards,
    Emilia

      1. Emilia says:

        Dear Dr, Thank you for answer.
        I would have two questions, please help me clarify. First, according to the applying the varsoc code for my variables in Stata, the appropriate lag order is from 0 to 4. But when the Microfit calculates in the first step of ARDL test (listening to F stat.), it ranges only from 0 to 1. This difference is problem?
        Many thanks.

      2. Emilia says:

        So, current results with 0-1 lag in ARDL is inappropriate? What program should I believe? 🙂

  24. youssef says:

    Hello, dear Dr. Thanks for sharing your knowledge very helpful.
    I am using seven variables of time series data of 34 observations, one of these variables its observation for first four years began in zero numbers.
    so when I run the macrofit 4.1 the numbers of observations used for estimation decreased to 26.

    this is normal or if I must solve this problem what can I do.

    really I need your assistant

    Thanks, in anticipation for your kind response and assistance.

    Yours Sincerely,

  25. LISSOM says:

    Hello! On entering variables name in microfit dialog box before running ARDL model and on clicking RUN button, it is showing error msg, 1 IS NOT A VALID DATE. what does this error msg mean and how to rectify it? thank you
