Estimating ARDL with Cointegrating Bounds in STATA

Recently I have received several comments on my previous posts on ARDL in Microfit and ARDL in EViews 9 asking how to apply the ARDL bounds test for cointegration of Pesaran in STATA. This is to be expected, as STATA is widely used in the research community. Today I will show how to do ARDL in STATA.

First of all we need the ARDL module for STATA. To get it, type “findit ARDL” in the STATA command window; it will show the link for the ARDL module. Click it and install the module in your STATA.

The basic command is “ardl depvar indepvar1 indepvar2 … , aic”. Here aic selects the lag orders automatically using the Akaike Information Criterion. Following are the results.

[Image: ardl1, STATA output of the ardl command]

I have compared the results with the ARDL output of EViews and they are about 90% similar; the slight difference arises because the two software packages use different methods to calculate standard errors. Next, the command “ardl, noctable btest” shows the ARDL bounds test and its critical values. As expected, the critical values are the same as those shown in EViews, but the bounds test statistic is slightly larger here: it is 5.43 in EViews and 5.62 in STATA. So in this example STATA is slightly more likely to indicate cointegration.
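The comparison of the test statistic with the critical bounds can be sketched as follows. This is a minimal illustration in Python (not STATA) of the usual bounds-test decision rule; the function name and the two bound values are hypothetical placeholders, and the real bounds must be read from your own STATA output. Several commenters below report that “ardl, noctable btest” printed no test statistics on their installation and that “estat btest” after the ec model did.

```python
def bounds_test_decision(f_stat, lower_i0, upper_i1):
    """Classify a bounds-test F statistic against the I(0) and I(1) bounds."""
    if f_stat > upper_i1:
        return "cointegration"      # reject H0 of no level relationship
    if f_stat < lower_i0:
        return "no cointegration"   # cannot reject H0
    return "inconclusive"           # statistic lies between the two bounds


# 5.62 and 5.43 are the statistics quoted in the post.
# LOWER and UPPER are hypothetical placeholders, NOT values from any table.
LOWER, UPPER = 3.0, 4.0
for label, f_stat in [("STATA", 5.62), ("EViews", 5.43)]:
    print(label, f_stat, bounds_test_decision(f_stat, LOWER, UPPER))
```

With any pair of bounds, a larger statistic can only move the decision towards cointegration, which is the point made above about 5.62 versus 5.43.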

Now you need the long run and short run coefficients. They can be estimated through “ardl depvar indepvar1 indepvar2 … , aic ec regstore(ecreg)”

Here ec produces the error correction version of the model, with aic as the criterion for the lag order. The important thing is the regstore(name) option; its use is explained below.

[Image: ardl3, STATA output of the ardl error correction model]

Here LR holds the long run estimates, SR the short run estimates, and ADJ the adjustment coefficient, i.e. the error correction coefficient. To generate the post estimation diagnostics, you need to convert the ardl results to the regress format so that the post estimation commands can be applied.

For this, write the command “estimates restore ecreg”; it makes the stored ardl ECM results the active estimation results. Then, when you type “regress”, the ECM results are displayed in the regress format, like below

[Image: ardl4, the ECM results displayed by the regress command]

Here you can use the following commands

“estat dwatson” for the Durbin-Watson d statistic for 1st order autocorrelation.

“estat archlm” for the ARCH LM test for autoregressive conditional heteroscedasticity

“estat bgodfrey” for the Breusch-Godfrey LM test for higher order autocorrelation

“estat hettest” for the Breusch-Pagan heteroscedasticity test.

“estat ovtest” for the Ramsey RESET test

“estat vif” for the VIF test of multicollinearity

For all these tests the decision criterion is stated in the output in the form of the null or alternative hypothesis. I am still looking for a way to check the stability of the coefficients (CUSUM test) in STATA. If anyone knows how to do it, please share. Hope this helps.

Update: cusum6 command can be used to generate CUSUM and CUSUMsq charts for ARDL in Stata

122 thoughts on “Estimating ARDL with Cointegrating Bounds in STATA”

  1. valerie says:

    Thank you Noman for this new interesting page on ARDL. I’m just learning this model and comparing it with some results in Microfit, and there are also some differences in the results. I have a question to be sure I understand the output of Stata well. The “ADJ” part really corresponds to the adjustment term, so it is the parameter of the term (y_{t-1} - theta’ x_{t-1}), right? I ask this because of the label in the output (it is written as y_{t-1}, e.g. “LP L1.”). Thank you very much.

      1. Solomon Bizuayehu says:

        Sorry, I don’t know the right place to write this comment:
        First, thanks for the brief note on ARDL. If it helps, regarding your last question in the note, we can do the CUSUM test using the following procedure in STATA.
        1. If you are using it for the first time, install the utility by running the command “ssc install cusum6”
        2. Write the following command: “cusum6 Y X1 X2 Year, cs(cusum) lw(lower) uw(upper)”

        Stay safe

  2. valerie says:

    I was talking about the ECM form of ARDL, so
    dyt = a0 + s * (L.yt – theta L.xt) + … which is just a transformation of dyt = a0 + ADJ L.yt + d L.xt + …
    s and ADJ will be equal.
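The transformation mentioned in this comment can be written out as a short derivation. It follows the commenter's notation, not the module's documentation: \(s\) is the adjustment parameter, \(\theta\) the long run coefficient and \(d\) the coefficient on the lagged level of \(x\).

```latex
\begin{align*}
\Delta y_t &= a_0 + s\,(y_{t-1} - \theta x_{t-1}) + \dots \\
           &= a_0 + s\,y_{t-1} - s\theta\,x_{t-1} + \dots
\end{align*}
% Matching terms with  \Delta y_t = a_0 + \mathrm{ADJ}\,y_{t-1} + d\,x_{t-1} + \dots
\[
\mathrm{ADJ} = s, \qquad d = -s\theta
\quad\Longrightarrow\quad
\theta = -\frac{d}{\mathrm{ADJ}} .
\]
```

So the coefficient on L.y is the same in both forms, while the coefficient on the level of x differs by the factor \(-s\).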

    1. Solomon Bizuayehu says:

      Sorry, I am also not clear on this point. So in your model, which parameter refers to error correction in the short run? I was running the ardl model and unfortunately I did not get the result for LD.Y [the lagged difference of the dependent variable in the short run; it only contains independent variables], though I used the same command as you. Can you help please?

      Thanks

      1. Noman Arshed says:

        The ECM(-1), sometimes written as ECT(-1), is actually the L.Y variable; this is the error correction variable. Secondly, LD.Y should be there: in the third picture, the first variable in the SR section is LD.Y.

  3. DJ. Juste says:

    Thank you dear Noman,
    Please, is it possible to use a dummy variable in the ECM equation? For example:
    dy = a*y(-1)+b*x(-1) + a_1*dy(-1)…+a_p*dy(-p) + b_0*dx…+b_q*dx(-q) + c*dum

  4. Park Sangyoup says:

    Hi, Noman. Thank you for this useful information. I’m Park from Korea.
    After doing “ardl, noctable btest” I only got the output from ‘ARDL regression’ to ‘Root MSE’. I couldn’t get the bounds test results with the F statistic and t statistic. How can I get them? Or is there a problem with my STATA?

    . ardl, noctable btest

    ARDL regression
    Model: level

    Sample: 1991m5 – 2015m10
    Number of obs = 294
    Log likelihood = -319.91337
    R-squared = .97642429
    Adj R-squared = .97576251
    Root MSE = .7296046

    This is all I’ve got.

    1. Noman Arshed says:

      Hi
      I have no idea about it. As it is a plugin, it may be that your version of STATA does not support this version of the module. Kindly
      type “help ardl” in your command window and contact the plugin maker regarding this issue;
      maybe he can tell you whether there is a compatibility issue.
      Regards

      1. Park Sangyoup says:

        I’m using STATA13. I’ve sent an email to the maker. Hope to get an answer soon. Thank you anyway.

      2. Park Sangyoup says:

        Hi Noman. I got the answer from the maker. He told me to run ‘ardl, depavar indvar, lags ec’ and it went well. I can now finish my work thanks to you. Thank you very much.

  5. tram says:

    When I run ARDL in Stata I see this error: “# of lag permutations (768) exceeds setting of ‘maxcombs’ (500)”, r(9). Is that because my sample is small? I have only 35 obs.

    1. Noman Arshed says:

      This means that you might be taking too many lags and have too many independent variables. It can be solved by increasing the matsize in STATA. Type “help matsize” for details.
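The number in this error message can be reproduced with simple arithmetic. The sketch below is in Python (not STATA) and rests on an assumption about how the module counts: the dependent variable is tried with lags 1 to p and each regressor with lags 0 to q. The function and argument names are hypothetical. Note that the message itself names the module's own ‘maxcombs’ setting rather than matsize, so the options listed under “help ardl” for capping the lag orders or raising that limit are the first place to look; which options are available depends on the installed version of the module.

```python
def lag_permutations(n_regressors, max_lag_dep, max_lag_indep):
    """Count candidate lag combinations, assuming the dependent variable is
    tried with lags 1..max_lag_dep and each regressor with lags 0..max_lag_indep."""
    return max_lag_dep * (max_lag_indep + 1) ** n_regressors


# One dependent variable plus five regressors, up to 4 lags each:
print(lag_permutations(5, 4, 4))   # 12500, the count quoted in a later comment
# One configuration that would give the 768 reported here (four regressors, up to 3 lags):
print(lag_permutations(4, 3, 3))   # 768
```

Under this assumption the count grows geometrically with the number of regressors, which is why dropping one variable or lowering the maximum lag reduces it sharply.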

  6. tram says:

    Hi Noman! I don’t know why my CUSUM is out of the critical limits of 5% but my CUSUMSQ remains within the critical limits?

  7. tram says:

    Hi Noman! I feel confused when explaining the short-run effect. For example, the export variable has 3 lags, so how can I explain that a 1% increase/decrease in exports makes GDP increase/decrease by beta %? Since this variable has 3 lags, it has 3 betas. How can I explain this? Should I add up the 3 betas or not?

  8. Micheal says:

    I am using Stata 11 but I am getting the message that last estimates are not found. Only the model summary with R-squared, sample size etc. is showing.

  9. tram says:

    I ran a model with 3 variables in Stata, but in the short run I see only 2 variables, while the long run still has 3 variables. I don’t know why Stata doesn’t show one variable in the short run. Maybe this variable has a problem?

  10. Lars says:

    Great article !
    I have a few questions that I would like to ask you about cointegration in general.

    1) If I have a mix of nonstationary and stationary variables, say I have 2 I(0) and 2 I(1), and no variables are I(2), can I actually apply the VAR approach then? I’ve seen several threads about this topic and they all seem to give different answers. (I don’t want to use the first difference of the variables.)

    2) Do you know any reliable source for your answer to 1)? I can’t find any main source saying that you can’t use VAR/VECM when you have a mix of variables with different orders of integration.

    3) Is there any limit on the number of variables to be used in an ARDL model? Like a benchmark.

    Thank you for an excellent blog by the way, I will def share it.

    1. Noman Arshed says:

      1) Actually the inventor of VECM, “Katarina”, has mentioned in her book that VECM, which is a special form of VAR, can be used for I(0) and I(1) variables, and the book also illustrates the method. The reason people do not use mixed variables in VAR is that theoretically an I(0) variable cannot be caused by an I(1) variable, and in VAR you will be testing exactly that.
      2) You should look for the book “The Cointegrated VAR Model” by “Katarina Juselius”; it is a good reference.
      3) There is no limit in the ARDL model, but the more variables you use, the less likely there will be cointegration among them, as you would require a larger sample to confirm the cointegration.

      1. Lars says:

        Thanks for the answers. I have one more question related to ARDL:

        What happens if the ECM coefficient ADJ in Stata turns out to be negative but not significant?

        1) Can i still have short-run relationship if say one variable has a significant short run coefficient

        2) same as 1) but for long-run.

  11. OBAKA Abel Inabo says:

    Please, the variables in my research are crude oil price (independent variable) and 6 dependent variables as total export crude oil revenue, inflation rate, unemployment rate, exchange rate, money supply and GDP . what is the model and command in Stata to run ARDL and then NARDL?

    1. Noman Arshed says:

      I have already provided the procedure for ARDL in the blog above. For NARDL you need to study research papers to learn how to make oil prices nonlinear, then use those modified variables in ARDL the same way as I mentioned above.

  12. micheal says:

    Thanks for the good job. I have a question on the ECM coefficient. How do you interpret ECM coefficient higher than one, in absolute terms? For example, if the ECM coefficient is -1.67.

  13. Adeel Ahmad Dar says:

    dear Arshed!
    I have applied this command “ardl, noctable btest”
    the result shown by stata are

    . ardl, noctable btest

    ARDL regression
    Model: level

    Sample: 1992 – 2015
    Number of obs = 24
    Log likelihood = -65.560497
    R-squared = .98217335
    Adj R-squared = .97588159
    Root MSE = 4.4157121

    But the bound critical values are not shown by STATA. can you guide me out

    1. Noman Arshed says:

      It is applied straight after the first command. If it is still not working, maybe the module you downloaded is faulty. Contact its authors; their email IDs are provided in the “help ardl” command.

  14. ben John says:

    Hi Norman,
    I get this result at the first step:

    # of lag permutations (12500) exceeds setting of ‘maxcombs’ (500)

    This is my model at the moment:

    ardl GDPpercapitacurrentUS Grosscapitalformationconstan HumanCapital MAcondvol Domesticcredittoprivatesecto Consumerpriceindex2000100, aic

    What do you suggest?

    regards,

    Ben

  15. Ben John says:

    Hi, when I do
    ardl, notable btest
    Stata doesn’t display the test statistics just the sample through to root mse,
    Any way to get those statistics?

    1. Noman Arshed says:

      It should be applied straight after the first command. If it is still not working, maybe the module you downloaded is faulty. Contact its authors; their email IDs are provided in the “help ardl” command.

  16. Ben John says:

    Hi Noman,

    Sorry for all the questions, not sure if previous questions have posted but i was struggling to get the bounds test however I have been able to get it by typing
    estat btest after the ec model
    (It says it can only work if model is ECM)
    Anyway my main question is:
    How do I interpret the btest output, I’m not 100% sure which critical value I should be using to compare with?

    Thanks

  17. Ben John says:

    Hi Noman, thanks for all the help with this.
    Some more general questions, should I be concerned with heteroscedastic and non-normal residuals?
    Also is there an automatic way to use cusum6 or do I need to type it in manually?
    Also in some of the literature in my research topic, when using ARDL they use first lagged levels and not current levels, but this also includes the first lag level of the dependent variable? Should I be doing something like this too?
    (My topic is exchange rate volatility on economic growth)

    Thanks

    Ben

    1. Ben John says:

      (Or is the adjustment the lagged level of the dependent variable? writing regress it states the variable as L1.logGDP, but isn’t ECT the error from the previous period?)

    2. Noman Arshed says:

      Yes, you should try to remove the heteroskedasticity, but non-normal residuals can be ignored if the sample is bigger than 30.
      Since cusum6 is a new command there is no automatic way; you have to do it manually.
      The specification of your model depends on the literature review. Both specifications have merits and demerits, so judge on the basis of past studies.

  18. Maria says:

    Dear Sir,
    I’m using Stata to run an ARDL between variables that are all I(1).
    Through the application of a Gregory-Hansen Test for Cointegration with Regime Shifts, I know that there is a break in the time series I have. How do I modify the ARDL specification to include it in Stata?
    Some peers told me that on Eviews there is an option for non-linearities, but I’m not familiar with the program and don’t know where to find it.
    Thank you for your help.

    1. Noman Arshed says:

      Learn how to make the structural break dummy when the regime shift is known using any criterion, and then incorporate that dummy as an exogenous independent variable in the model. But this procedure is only necessary if the CUSUM graphs or the RESET test show a problem in the post regression diagnostics.

  19. Maria says:

    Dear Noman,
    Thank you for your kind answer.
    My dilemma is the following: CUSUM and CUSUMSQ behave well, and we cannot reject that the model is well specified according to the RESET test.
    However, the ghansen test for cointegration with regime shifts tells me that there is a break around either 2000q1 or 2009q2 in the system.
    Thus, I wonder if I should modify the ardl from the beginning to have time dummies at these points and redo all tests, or disregard this potential issue completely.

    1. Noman Arshed says:

      Even though ghansen shows a break, that break is not creating a problem in the regression as per the CUSUMs and RESET, so it is up to you whether you want to add it or not. This is not a potential issue; it is just extra information that might improve the results, and it is not harmful to leave it out.

  20. Ben John says:

    Hi Noman,

    The coefficients on the long run variables are different when I change the view,
    ie the coefficients are different when I use ardl compared to
    estimates restore ecreg
    regress,

    Is that supposed to happen? The SR coefficients are the same, and the R-sq is the same as well.

    Thanks

    Ben

    1. Noman Arshed says:

      Yes, they will change. See my example: in the first case the long run results and the adjustment coefficient are reported separately, but in the regress form the long run coefficients change because they are not separated from the adjustment coefficient. Both are correct; for the second result some manipulation is required, whereas in the first case the manipulation is done by STATA.
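The manipulation referred to in this reply can be sketched in a few lines. This is a minimal illustration in Python (not STATA), assuming the ECM form written in an earlier comment, dy = a0 + ADJ*(L.y - theta*x) + …, so that the coefficient on the level of x in the regress form equals -ADJ*theta. The function name and all numbers are hypothetical.

```python
def long_run_from_regress(adj, level_coefs):
    """Recover long run coefficients theta_j = -b_j / adj from the
    regress-form coefficients b_j on the level regressors."""
    if adj == 0:
        raise ValueError("adjustment coefficient is zero; no long run solution")
    return {name: -b / adj for name, b in level_coefs.items()}


# Hypothetical regress-form estimates (not taken from the post's output).
adj = -0.5                              # coefficient on L.y
level_coefs = {"x1": 0.2, "x2": -0.1}   # coefficients on the levels of x1, x2
print(long_run_from_regress(adj, level_coefs))
```

Because only the level terms are rescaled by the adjustment coefficient, the short run coefficients and the R-squared are the same in both views, as noted in the question above.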

      1. Ben John says:

        Hi Noman,
        Okay I see, but surely all coefficients would slightly change when they’re combined?
        I’ve been reporting the coefficients from the first regression, if correct will just keep that.
        But to clarify, the first output is the equivalent of two models? Ie a long run model and a short run ec model? Whereas the second output is one larger model?
        Thanks

  21. M. says:

    Hi,
    I was hoping you would help me make sense of some of the results I am getting.
    I am estimating a classic relationship between export volume, real exchange rates and demand. In total, I have a dependant (the exports) and 5 explanatory variables.
    All my variables are I(1), and on the short run I do get significant estimates for all of them. However, for the long run, with the ARDL I only get estimates for: LD.ln(y), L2D.ln(y), L3D.ln(y), D.ln(x1), LD.ln(x1),L2D.ln(x1),L3D.ln(x1), D.ln(x5) and LD.ln(x5).
    How would I express this in the ARDL(p,q) form, should I say ARDL (3, 0, 0, 0, 1)?
    And can the inclusion of a new variable absorb all the explanatory power in the long run of traditional indicators such as the REER?
    Thank you very much for your help.

    1. Noman Arshed says:

      First of all, I think the results depend on how long your sample time period is; ideally it should be more than 50. Secondly, I think your variables are too few: the studies related to your model which I have seen have also used export promotion policy as an independent variable, which is very relevant for your model. Lastly, there is one variable missing which would represent the cost of production of the exported products, like an export price index.

  22. imran says:

    Respected sir, i want to know how exactly the equation is according to your results in the table where short run long run and adjusted coefficient are… please reply me as soon as possible….?? to

  23. M. says:

    Hello,
    I would like to get a clarification.
    I’m unsure about how to interpret the results I’m getting. Basically, I’m doing a standard export regression, of exports on import demand, domestic demand and real exchange rate.

    In one country I get that all these elements are significant in the long run, I have the adjustment component (L. exports), and for the short run I only get: D. ln(imports), D. ln(domestic demand) and D. ln (reer). I was expecting to have more lagged values, and lagged values of the exports as well, in the short run.
    In another country, I get something closer to what I was expecting: in the long run, imports and REER are significant, although not domestic demand. And in the short run, I have D, LD, L2D and L3D of ln(imports) and LD, L2D, L3D of ln(exports).

    I’m puzzled by the lack of lags in the first regression. Do you know why that may be?
    Also, I was wondering how I would write what kind of ARDL(p,q,r) process I’m getting in the second case I just mentioned.

    Thank you for your help.
    Best
    Maria.
    P.S. I hope it gets posted this time, since I have been trying to post this question 2/3 unsuccessfully through the web

    1. Noman Arshed says:

      The lag size depends on how much inertia the particular variable has, or how sluggish it is to change, so two different countries can have different lag orders. Secondly, you can learn how to write the equation by studying past papers. Here p means the lag order of the dependent variable, and q and r are the lag orders of the first and second independent variables, so if you have more than 2 independent variables, you will add more letters after r.
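The notation described in this reply can be written out for two regressors. The symbols below (\(c\), \(\phi_i\), \(\beta_j\), \(\gamma_k\), \(\varepsilon_t\)) are generic textbook notation chosen for illustration, not the labels used in the STATA output.

```latex
% ARDL(p, q, r): p lags of the dependent variable y,
% lags 0..q of the first regressor x_1, lags 0..r of the second regressor x_2
\[
y_t = c
    + \sum_{i=1}^{p} \phi_i \, y_{t-i}
    + \sum_{j=0}^{q} \beta_j \, x_{1,t-j}
    + \sum_{k=0}^{r} \gamma_k \, x_{2,t-k}
    + \varepsilon_t
\]
```

When this model is rewritten in error correction form, the p lagged levels of y become one lagged level plus p - 1 lagged differences, so the number of lagged differences of the dependent variable shown in the short run section is one less than p.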

  24. Lasse says:

    How do I save the speed of adjustment coefficient (ECM), if I wanna rerun the regression, for instance as a robust regression (rreg command)?

  25. Alex says:

    After estimating the error correction model using ARDL, I want to compare the result from this estimation with the actual value to see how good this estimation is. But if I predict the fitted values from the regression in Stata I receive the predicted values from the first difference. Is there a way to get predicted values that are not first difference so I can compare them with the actual value to see how good this estimation is?

  26. Lars says:

    So if I get this right, the cusum6 command should be run with the dependent variable (in first differenced form) plus all the variables (adjustment, long-run and short-run) that are present in the ardl model that comes out of the regress command?

  27. Kate says:

    Thanks for your post. My ARDL model passes the CUSUM test but fails the CUSUMSQ. I tried adding dummy variables but the problem gets worse. Could you suggest how I should deal with that? I am measuring the long term relation between the stock market and macro variables.

    1. Noman Arshed says:

      Find the structural break in each of the independent variables, and remove the break effect from these variables by running an intercept-less regression of the independent variable on the structural break dummy, then use the residuals, which will be the portion of the independent variable free of the break. This way your model will be free of all possible breaks. This might solve the problem. Keep in mind that this is the last possible approach; before using it, try changing the independent variables or using a log transformation.

  28. Kate says:

    Thanks for your reply. I tried the log transformation but the CUSUMSQ test still failed. I’ve read your post “Construction of structural break variables”. Could you write down the equation to remove the break point? Thanks.

    1. Noman Arshed says:

      Suppose you have X as independent variable and D is its structural break dummy. Now run regression X = bD + E. After estimating the regression, extract E, it will be the portion of X which is free of D.
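The regression described in this reply can be sketched without any statistics package. This is a minimal illustration in Python (not STATA) of a regression of X on a single 0/1 dummy D with no intercept, for which the OLS slope is b = sum(X*D) / sum(D*D). The function name and the data are hypothetical.

```python
def remove_break(x, d):
    """Regress x on the break dummy d without an intercept and return
    the slope b and the residuals e = x - b*d."""
    b = sum(xi * di for xi, di in zip(x, d)) / sum(di * di for di in d)
    e = [xi - b * di for xi, di in zip(x, d)]
    return b, e


# Hypothetical series with a level shift after the third observation.
x = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
d = [0, 0, 0, 1, 1, 1]
b, e = remove_break(x, d)
print(b)   # mean of x over the break period
print(e)   # pre-break values unchanged, post-break values demeaned
```

With a 0/1 dummy and no intercept, b is simply the mean of X over the period where D = 1, so the residual series keeps the pre-break observations as they are and centres the post-break observations on zero.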

      1. Kate says:

        Thank you very much for your time and prompt reply. I’ve learned a lot from your blog. It is not something I can find in the text book or at school. Many thanks again.
        I have one question following up. I found the following explanation on the Eviews blog. So, should I ignore the failure of Cusumsq test due to the explanation below or should I need to eliminate the break in the independent variables using the regression X = bD + E :
        “Stability in the context of the Pesaran Shin (1998) ARDL model is indeed an important subject. They make the assumption that the ARDL model being studied is in fact stable. In this regard, if you are simply looking to estimate an ARDL model to see if the estimates are valid, you should be concerned about stability. Luckily, this is easily verified by testing whether the root of the characteristic equation are outside the unit circle. In other words, does the ARDL lag polynomial produce stationary results. Nevertheless, the Pesaran, Shin, and Smith (2001) paper is a TEST for cointegration. In other words, it must allow for the possibility that the underlying cointegrating relationship may in fact NOT be stable. In this regard, the PSS(2001) paper does not a priori impose stability of the ARDL lag polynomial. However, if cointegration does indeed exist, the ARDL model will in fact be stable!”
        Thanks again,
        Kate

  29. Kate says:

    Hi, could you please suggest some literature that eliminates the structural break in independent variables in the way described above? Many thanks.

  30. Maruf ahmed says:

    If either the CUSUM or the CUSUMSQ is found to be unstable (implying there is a structural break in the data), are the long-run estimates invalid? If invalid, what is the appropriate method to make the model stable (in other words, to make the estimates valid) if there is no longer any chance of adding a new variable (as when testing an empirical model where the variables are fixed)? Kindly let me know, with a reference.

    1. Noman Arshed says:

      The graphs show that the model is sensitive to breaks. It means that if there are breaks in the data then the results will not be reliable. There are two ways out: either add the structural break of the dependent variable to the model, or increase the lag order to absorb this change.

  31. Gabriel Palazzo says:

    Dear Noman,
    I have a silly question. If I am estimating a model with two dependent variables plus 1 lag each, plus one lag of the independent variable, do I have to look at the F critical values corresponding to k=5 or k=2?

    Thanks a lot.

  32. Pravesh Raghoo says:

    Thank you very much Noman for your help.

    I have a question. Can we use cusum6 command for regress (OLS) or it is a command strictly used for ARDL?

    Regards,
    Pravesh
