Recently I have received several comments on my previous posts on ARDL in Microfit and ARDL in EViews 9, asking how to apply the ARDL bounds-testing approach to cointegration of Pesaran in Stata. This is to be expected, as Stata is widely used in the research community. Today I will show how to run ARDL in Stata.

First of all, we need the ARDL module for Stata. Type the command “findit ARDL” in the Stata command window; it will show the link for the ARDL module. Click it and install the module in your Stata.

The estimation command is “ardl depvar indepvar1 indepvar2 … , aic”. Here aic selects the lag orders automatically using the Akaike Information Criterion. Following are the results.

I have compared the results with the ARDL output of EViews; they are about 90% similar, and the slight difference arises because the two software packages use different methods to calculate standard errors. Next, the command “ardl, noctable btest” shows the ARDL bounds test and its critical values. As expected, the critical values are the same as those shown in EViews, but the bounds test statistic is slightly larger: in EViews it is 5.43, here it is 5.62. Hence, in this example, you are slightly more likely to find cointegration in Stata.
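The comparison behind the bounds test can be sketched in a few lines of Python. This is only an illustration of the decision rule: the function name and the two critical bounds below are hypothetical placeholders and are not taken from the output above. Only the F statistics 5.43 and 5.62 come from the post.

```python
def bounds_decision(f_stat, lower_i0, upper_i1):
    """Compare a bounds-test F statistic with the I(0) and I(1) critical bounds."""
    if f_stat > upper_i1:
        return "cointegration"        # above the upper bound: reject no level relationship
    if f_stat < lower_i0:
        return "no cointegration"     # below the lower bound: cannot reject
    return "inconclusive"             # between the two bounds

# Hypothetical 5% critical bounds, for illustration only.
LOWER_I0, UPPER_I1 = 3.8, 4.9

for label, f_stat in [("EViews", 5.43), ("Stata", 5.62)]:
    print(label, f_stat, bounds_decision(f_stat, LOWER_I0, UPPER_I1))
```

With bounds like these, both statistics lead to the same conclusion. The small gap between the two packages would only matter if one of the values fell on the other side of a bound.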

Now you need the long-run and short-run coefficients. They can be estimated with “ardl depvar indepvar1 indepvar2 … , aic ec regstore(ecreg)”

Here ec generates the error-correction version of the model, with aic as the criterion for the lag order. The important thing is the regstore(name) option, which stores the underlying regression results under the given name; its use is explained below.

Here LR contains the long-run estimates, SR the short-run estimates, and ADJ the adjustment coefficient, i.e. the error correction coefficient. To run the post-estimation diagnostics, you need to convert the ardl results to the regress format so that the post-estimation commands can be applied.

For this, type the command “estimates restore ecreg”; it makes the stored results of the ardl ECM model the active estimation results. When you then type the “regress” command, it displays the ECM results in the regress format, like below

Here you can use the following commands:

“estat dwatson” for the Durbin-Watson d statistic for first-order autocorrelation.

“estat archlm” for the ARCH LM test for autoregressive conditional heteroskedasticity (ARCH effects).

“estat bgodfrey” for the Breusch-Godfrey LM test for higher-order autocorrelation.

“estat hettest” for the Breusch-Pagan heteroskedasticity test.

“estat ovtest” for the Ramsey RESET test.

“estat vif” for the VIF test of multicollinearity.

For all these tests, the decision criterion is given in the form of the null or alternative hypothesis. I am still looking for a way to check the stability of the coefficients (CUSUM test) in Stata. If anyone knows how to do it, please share. Hope this helps.

**Update:** the cusum6 command can be used to generate CUSUM and CUSUMSQ charts for ARDL in Stata.

No Tests for stationarity?

Following are all the possible types of unit root tests; use whichever you like.

Thank you Noman.

Hi Norman, is there a way of solving the omitted variable bias without dropping variables in ARDL?

Thank you Norman for this new interesting page on ARDL. I’m just learning this model and comparing it with some results in Microfit, and there are also some differences in the results. I have a question to be sure I understand the Stata output well. The “ADJ” part really corresponds to the adjustment term, i.e. the parameter on the term (y_{t-1} - theta’ x_{t-1}), right? I ask because of the name shown in the output (it is written as y_{t-1}, e.g. “LP L1.”). Thank you very much.

No, ADJ is the coefficient on the actual lag of the dependent variable in the short-run equation, so it is shown as L.LP.

Sorry, don’t know the right place to write this comment:

First, thanks for the brief note on ARDL. If it helps, regarding your last question in the note, we can do the CUSUM test in Stata using the following procedure.

1. If it is the first time, install the utility by running the command “ssc install cusum6”

2. Write the following command “cusum6 Y X1 X2 Year, cs(cusum) lw(lower) uw(upper)”

Stay safe

These commands can be typed in the command window of Stata. This can be done before or after estimating the ARDL model.

I was talking about the ECM form of ARDL, so

dy_t = a0 + s * (L.y_t - theta * L.x_t) + … which is just a transformation of dy_t = a0 + ADJ * L.y_t + d * L.x_t + …

s and ADJ will be equal.

yes
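The equivalence agreed on in this exchange can be written out explicitly. The following is a sketch in the commenter's notation, where s is the adjustment coefficient, theta the long-run coefficient, and d the coefficient on L.x in the unrestricted form. The short-run difference terms are omitted.

```latex
\begin{aligned}
\Delta y_t &= a_0 + s\,\bigl(y_{t-1} - \theta\, x_{t-1}\bigr) + \dots \\
           &= a_0 + s\, y_{t-1} - s\theta\, x_{t-1} + \dots
\end{aligned}
\qquad\Longrightarrow\qquad
\mathrm{ADJ} = s, \qquad d = -s\,\theta, \qquad \theta = -\frac{d}{s}.
```

So the coefficient on L.y is the same in both forms, while the coefficient on L.x in the unrestricted form is the long-run coefficient scaled by the adjustment coefficient.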

Sorry, I am also not clear on this point. So in your model, which parameter refers to error correction in the short run? I was running an ardl model and unfortunately I did not get a result for LD.Y [the lagged difference of the dependent variable in the short run; the output only contains independent variables], though I used the same command as you. Can you help please?

Thanks

The ECM(-1), sometimes written as ECT(-1), is actually the L.Y variable; this is the error correction variable. Secondly, LD.Y should be there: in the third picture, the first variable in the SR section is LD.Y.

Thank you dear Norman,

Please, is it possible to use a dummy variable in the ECM equation? For example:

dy = a*y(-1) + b*x(-1) + a_1*dy(-1) + … + a_p*dy(-p) + b_0*dx + … + b_q*dx(-q) + c*dum

Yes, you can. Most software packages have an option for adding exogenous variables; you can put the dummy variable there so that it enters the short-run part of the model.

Hi Norman, I’ve just discovered the “cusum6” command in Stata to plot the CUSUM and CUSUMSQ tests!

Thanks for the information, I have added it to the post for other readers.

Thanks a lot Mr. Norman, it is really helpful.

Hi, Nomad. Thank you for this useful information. I’m Park from Korea.

After running “ardl, noctable btest” I only get the output from ‘ARDL regression’ to ‘Root MSE’. I couldn’t get the bounds test results with the F statistic and t statistic. How can I get that? Or is there a problem with my Stata?

. ardl, noctable btest

ARDL regression

Model: level

Sample: 1991m5 – 2015m10

Number of obs = 294

Log likelihood = -319.91337

R-squared = .97642429

Adj R-squared = .97576251

Root MSE = .7296046

This is all I’ve got.

Hi

I have no idea about it. As it is a plugin, it may be that your version of Stata does not support this version. Kindly type help ardl in your command window and contact the plugin authors regarding this issue; maybe they can tell you if there is any compatibility issue.

Regards

I’m using Stata 13. I’ve sent an email to the maker. Hope to get an answer soon. Thank you anyway.

Hi Noman. I got the answer from the maker. He told me to use ‘ardl, depavar indvar, lags ec’ and it went well. I can now finish my work thanks to you. Thank you very much.

Glad it worked

When I run ARDL in Stata I see this error: (# of lag permutations (768) exceeds setting of ‘maxcombs’ (500) r(9)). Is it because my sample is small? I have only 35 obs.

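This means that the number of lag combinations to be searched is too large: you might be allowing too many lags and you have too many independent variables. It can be solved by reducing the maximum lag length or the number of regressors, or by raising the ‘maxcombs’ limit named in the message. Type help ardl for details.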

For people who don’t know, there is also a thread on the official Stata forum about the ardl command. It allows you to ask questions directly to Sebastian Kripfganz, one of the authors of the command, who kindly answers questions there. http://www.statalist.org/forums/forum/general-stata-discussion/general/95329-ardl-in-stata

hope this helps.

Hi Noman! I don’t know why my CUSUM is out of the critical limits of 5% but my CUSUMSQ remains within the critical limits?

It shows that the mean of your model is unstable. You should check whether there is a structural break that is making the model unstable.

Can you show me the cusum6 command after running the ARDL model?

No, but I have shown how to transfer the ARDL estimation into the OLS framework, where you can use the cusum6 command.

Hi Noman! I feel confused when explaining the short-run effect. For example, the export variable has 3 lags, so how can I explain that a 1% increase/decrease in exports makes GDP increase/decrease by beta %? Since this variable has 3 lags, it has 3 betas. How can I explain this? Should I add up the 3 betas or not?

usually we interpret the first lag only.

I am using Stata 11 but I am getting the message that last estimates are not found. Only the model summary with R-squared, sample size etc. is showing.

You can contact the author of the code, he will be better able to help here.

Thank you Sir. I have sent an email to the author. Hope to receive a reply soon.

My model is stable but has autocorrelation. How can I fix it in Stata?

Sir Noman Arshad, What to do if any of the variables is I(2) ?

Take its first difference; it will become I(1), and then you can use it.
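Why one round of differencing is enough here can be illustrated with a toy series. The sketch below uses made-up shocks (an assumption, not data from the post) and plain Python. It builds an I(2)-type series by cumulating the shocks twice, then shows that differencing it once gives back the series that was cumulated only once.

```python
from itertools import accumulate

def first_difference(series):
    """Return x_t - x_{t-1} for t = 1..n-1."""
    return [b - a for a, b in zip(series, series[1:])]

shocks = [1, -2, 3, 0, -1, 2]      # hypothetical stationary shocks
i1 = list(accumulate(shocks))      # cumulated once: behaves like an I(1) series
i2 = list(accumulate(i1))          # cumulated twice: behaves like an I(2) series

d_i2 = first_difference(i2)        # one difference of the I(2)-type series
print(d_i2 == i1[1:])              # matches the I(1)-type series (minus the first point)
```

Note that one observation is lost with each round of differencing, which matters in small samples.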

I ran a model with 3 variables in Stata, but in the short run I see only 2 variables, while the long run still has 3 variables. I don’t know why Stata doesn’t show one variable in the short run. Maybe this variable has a problem?

This might be because one of them was assigned 0 lags by the AIC method.

Great article !

I have a few questions that I would like you ask you about Cointegration in general.

1) If I have a mix of nonstationary and stationary variables, say 2 I(0) and 2 I(1), and no variables are I(2), can I actually apply the VAR approach? I’ve seen several threads about this topic and they all seem to give different answers. (I don’t want to use first differences of the variables.)

2) Do you know any reliable source for your answer to 1)? I can’t find any main source saying that you can’t use VAR/VECM when you have a mix of variables with different orders of integration.

3) Is there any limit on the number of variables to be used in an ARDL model? Like a benchmark.

Thank you for an excellent blog by the way, I will def share it.

1) Actually, Katarina Juselius, one of the developers of the cointegrated VAR approach, has mentioned in her book that VECM, which is a special form of VAR, can be used for I(0) and I(1) variables, and she also illustrates the method. The reason people do not use mixed variables in a VAR is that, theoretically, an I(0) variable cannot be caused by an I(1) variable, and in a VAR you would be testing exactly that.

2) You should look for the book “The Cointegrated VAR Model” by Katarina Juselius; it is a good reference.

3) There is no limit in the ARDL model, but the more variables you use, the less likely there will be cointegration among them, as you would require a larger sample to confirm the cointegration.

Thanks for the answers. I have one more question related to ARDL:

What happens if the ECM coefficient ADJ in Stata turns out to be negative but not significant?

1) Can i still have short-run relationship if say one variable has a significant short run coefficient

2) same as 1) but for long-run.

It is a brilliant blog, this entry especially helped me a lot with my master thesis, thank you very much!

Is there anyone who can give us a detailed interpretation of the ARDL results in Stata?

Yes, all the published research papers that use ARDL contain a detailed interpretation of the results.

Please, the variables in my research are crude oil price (independent variable) and 6 dependent variables: total crude oil export revenue, inflation rate, unemployment rate, exchange rate, money supply and GDP. What is the model and command in Stata to run ARDL and then NARDL?

I have already provided the procedure for ARDL in the post above. For NARDL you need to study research papers to learn how to make the oil price variable non-linear, and then use those modified variables in ARDL in the same way as I described above.

Thanks for the good job. I have a question on the ECM coefficient. How do you interpret ECM coefficient higher than one, in absolute terms? For example, if the ECM coefficient is -1.67.

A 1% deviation from equilibrium caused by a random shock will lead to a 1.67% correction towards the equilibrium.
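What a coefficient of -1.67 implies over several periods can be traced with a small recursion. This is a simplified sketch: it assumes the gap from equilibrium changes each period by the adjustment coefficient times last period's gap, and it ignores all short-run terms. The function name and the starting gap of 1 are illustrative choices.

```python
def deviation_path(adj, initial_gap=1.0, periods=6):
    """Gap from equilibrium when each period's change is adj * last period's gap."""
    gap = initial_gap
    path = [gap]
    for _ in range(periods):
        gap = gap + adj * gap      # equivalent to gap_t = (1 + adj) * gap_{t-1}
        path.append(gap)
    return path

path = deviation_path(-1.67)
print([round(g, 3) for g in path])
```

Under this simplification, a coefficient between -1 and -2 overshoots the equilibrium each period, so the gap changes sign and shrinks in absolute value: the adjustment is oscillating but still convergent. A coefficient below -2 would make the gap grow instead.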

dear Arshed!

I have applied this command “ardl, noctable btest”

the result shown by stata are

. ardl, noctable btest

ARDL regression

Model: level

Sample: 1992 – 2015

Number of obs = 24

Log likelihood = -65.560497

R-squared = .98217335

Adj R-squared = .97588159

Root MSE = 4.4157121

But the bound critical values are not shown by STATA. can you guide me out

It should be applied straight after the first command. If it is still not working, maybe the module you downloaded is faulty. Contact its authors; their email addresses are provided in “help ardl”.

Hi Norman,

I get this result at the first step:

# of lag permutations (12500) exceeds setting of ‘maxcombs’ (500)

This is my model at the moment:

ardl GDPpercapitacurrentUS Grosscapitalformationconstan HumanCapital MAcondvol Domesticcredittoprivatesecto Consumerpriceindex2000100, aic

What do you suggest?

regards,

Ben

You have to raise the ‘maxcombs’ limit named in the message, I guess; type help ardl to learn how to increase it, or reduce the maximum lag length. This happens because you have many variables in the model.
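The number in the error message can be reproduced by hand. The sketch below rests on an assumption that is consistent with the counts reported in these comments but is not taken from the ardl documentation: the dependent variable is tried with 1 up to the maximum number of lags, and each regressor with 0 up to the maximum number of lags. The function name is illustrative.

```python
def lag_permutations(n_regressors, max_lag):
    """Count lag combinations: depvar lags 1..max_lag, each regressor lags 0..max_lag."""
    return max_lag * (max_lag + 1) ** n_regressors

# Five regressors, as in the model above, with a maximum of 4 lags (assumed).
print(lag_permutations(5, 4))
# The same model with a maximum of 2 lags.
print(lag_permutations(5, 2))
```

Under this assumption, five regressors with up to 4 lags give the 12500 combinations in the message, and lowering the maximum lag to 2 brings the count below 500. The 768 reported in an earlier comment matches four regressors with up to 3 lags.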

Sorry, I was being lazy. I have now put in a maximum lag length (though it would only work with a maximum of 2).

No problem. If you want to increase the lag length, follow the suggestion provided earlier.

Hi, when I do

ardl, notable btest

Stata doesn’t display the test statistics just the sample through to root mse,

Any way to get those statistics?

It should be applied straight after the first command. If it is still not working, maybe the module you downloaded is faulty. Contact its authors; their email addresses are provided in “help ardl”.

Hi Noman,

Sorry for all the questions; I’m not sure if my previous questions were posted. I was struggling to get the bounds test, but I have been able to get it by typing

estat btest after the ec model

(It says it can only work if model is ECM)

Anyway my main question is:

How do I interpret the btest output, I’m not 100% sure which critical value I should be using to compare with?

Thanks

Usually the F-test value is used. If you are using the 5% criterion, then you should use the I_05 values stated under r[1].

Hi Noman, thanks for all the help with this.

Some more general questions, should I be concerned with heteroscedastic and non-normal residuals?

Also is there an automatic way to use cusum6 or do I need to type it in manually?

Also in some of the literature in my research topic, when using ARDL they use first lagged levels and not current levels, but this also includes the first lag level of the dependent variable? Should I be doing something like this too?

(My topic is exchange rate volatility on economic growth)

Thanks

Ben

(Or is the adjustment the lagged level of the dependent variable? writing regress it states the variable as L1.logGDP, but isn’t ECT the error from the previous period?)

Yes, L1.logGDP is the ECT if logGDP is the dependent variable of your model.

Yes, you should try to remove the heteroskedasticity, but non-normal residuals can be ignored if the sample is bigger than 30.

Since cusum6 is a new command, there is no automatic way; you have to do it manually.

The specification of your model depends on the literature review. Both specifications have merits and demerits, so judge on the basis of past studies.

Dear Sir,

I’m using Stata to run an ARDL between variables that are all I(1).

Through the application of a Gregory-Hansen Test for Cointegration with Regime Shifts, I know that there is a break in the time series I have. How do I modify the ARDL specification to include it in Stata?

Some peers told me that on Eviews there is an option for non-linearities, but I’m not familiar with the program and don’t know where to find it.

Thank you for your help.

Learn how to construct the structural break dummy when the date of the regime shift is known, using any criterion, and then include that dummy as an exogenous independent variable in the model. However, this procedure is only necessary if the CUSUM graphs or the RESET test show a problem in the post-regression diagnostics.

Dear Noman,

Thank you for you kind answer.

My dilemma is the following, CUSUM and CUSUMSQ behave well, and we cannot reject that the model is well specified according to the RESET test.

However, the ghansen test for cointegration with regime shifts, tells me that there is either a break around the 2000q1 or the 2009q2 in the system.

Thus, I wonder if I should modify the ardl from the beginning to have time dummies at these points and redo all the tests, or completely disregard this potential issue.

Even though ghansen shows a break, that break is not creating a problem in the regression according to the CUSUMs and RESET, so it is up to you whether you add it or not. This is not a potential issue; it is just extra information that might improve the results, and it is not harmful to leave it out.

Hi Noman,

The coefficients on the long run variables are different when I change the view,

ie the coefficients are different when I use ardl compared to

estimates restore ecreg

regress,

Is that supposed to happen? The SR coefficients are the same, and the R-sq is the same as well.

Thanks

Ben

Yes, they will change. See my example: in the first case the long-run results and the adjustment coefficient are reported separately, but in the regress form the long-run coefficients change because they are not separated from the adjustment coefficient. Both are correct; for the second result some manual manipulation is required, whereas in the first case the manipulation is done by Stata.

Hi Noman,

Okay I see, but surely all coefficients would slightly change when they’re combined?

I’ve been reporting the coefficients from the first regression, if correct will just keep that.

But to clarify, the first output is the equivalent of two models? Ie a long run model and a short run ec model? Whereas the second output is one larger model?

Thanks

No, they can change by more than a little, as the convergence coefficient is multiplied with the long-run coefficients in the second model.
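The "manipulation" mentioned in this exchange can be sketched as a one-line calculation. The sketch assumes the regress-form coefficient on each level regressor equals minus the adjustment coefficient times the long-run coefficient, which is what expanding ADJ * (L.y - theta * x) gives. The function name and all numbers are hypothetical and are not taken from the output in the post.

```python
def long_run_coefficients(adj, level_coefs):
    """Recover theta_i = -b_i / adj from the regress-form level coefficients b_i."""
    if adj == 0:
        raise ValueError("adjustment coefficient must be non-zero")
    return {name: -b / adj for name, b in level_coefs.items()}

# Hypothetical regress-form estimates, for illustration only.
adj = -0.5                             # coefficient on the lagged dependent variable
level_coefs = {"x1": 0.30, "x2": -0.10}

print(long_run_coefficients(adj, level_coefs))
```

This is why the level coefficients in the regress output look different from the LR block of the ardl output even though the two describe the same fitted model.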

Hi,

I was hoping you would help me make sense of some of the results I am getting.

I am estimating a classic relationship between export volume, real exchange rates and demand. In total, I have a dependant (the exports) and 5 explanatory variables.

All my variables are I(1), and on the short run I do get significant estimates for all of them. However, for the long run, with the ARDL I only get estimates for: LD.ln(y), L2D.ln(y), L3D.ln(y), D.ln(x1), LD.ln(x1),L2D.ln(x1),L3D.ln(x1), D.ln(x5) and LD.ln(x5).

How would I express this in the ARDL(p,q) form, should I say ARDL (3, 0, 0, 0, 1)?

And can the inclusion of a new variable absorb all the explanatory power in the long run of traditional indicators such as the REER?

Thank you very much for your help.

First of all, I think the results depend on how long your sample period is; ideally it should have more than 50 observations. Secondly, I think your variables are too few: the studies related to your model that I have seen also use an export promotion policy variable as an independent variable, which is very relevant for your model. Lastly, there is one variable missing that would represent the cost of production of the exported products, such as an export price index.

Respected sir, I want to know exactly what the equation is according to your results in the table where the short-run, long-run and adjustment coefficients are shown. Please reply as soon as possible.

Hello,

I would like to get a clarification.

I’m unsure about how to interpret the results I’m getting. Basically, I’m doing a standard export regression, of exports on import demand, domestic demand and real exchange rate.

For one country I find that all these elements are significant in the long run, I have the adjustment component (L. exports), and for the short run I only get: D. ln(imports), D. ln(domestic demand) and D. ln (reer). I was expecting to have more lagged values, and lagged values of the exports as well, in the short run.

For another country, I get something closer to what I was expecting: in the long run, imports and REER are significant, although domestic demand is not. And in the short run, I have D. LD. LD2 and L3D of ln(imports) and LD, L2D L3D of ln (exports).

I’m puzzled by the absence of lagged terms in the first regression. Do you know why that may be?

Also, I was wondering how I would write what kind of ARDL(p,q,r) process I’m getting in the second case I just mentioned.

Thank you for your help.

Best

Maria.

P.S. I hope it gets posted this time, since I have tried to post this question 2 or 3 times unsuccessfully through the web.

The lag order depends on how much inertia the particular variable has, i.e. how sluggish it is to change, so two different countries can have different lag orders. Secondly, you can learn how to write the equation by studying past papers. Here p is the lag order of the dependent variable, and q and r are the lag orders of the first and second independent variables, so if you have more than 2 independent variables you add more letters after r.

How do I save the speed of adjustment coefficient (ECM), if I wanna rerun the regression, for instance as a robust regression (rreg command)?

You should consult the authors of the ARDL module about this.

After estimating the error correction model using ARDL, I want to compare the result from this estimation with the actual value to see how good this estimation is. But if I predict the fitted values from the regression in Stata I receive the predicted values from the first difference. Is there a way to get predicted values that are not first difference so I can compare them with the actual value to see how good this estimation is?

No, not directly. To get the predicted values of the level-form variable, you have to calculate Y-hat manually in Excel using the long-run coefficient estimates.

So if I get this right, the cusum6 command should be run with the dependent variable (in first-differenced form) plus all the variables (adjustment, long-run and short-run) that are present in the ardl model shown by the regress command?

Yes, it will use the residuals of the ARDL equation.

How can I solve the collinearity problem in ARDL?

Select only those variables which are not highly correlated with each other.

Thanks for your post. My ARDL model passes the CUSUM test but fails the CUSUMSQ test. I tried adding dummy variables but the problem gets worse. Could you suggest how I should deal with that? I am measuring the long-term relation between the stock market and macro variables.

Find the structural break in each of the independent variables, and remove the break effect from each variable by running an intercept-less regression of the independent variable on its structural break dummy; the residuals are the portion of the independent variable that is free of the break. This way your model will be free of all possible breaks, which might solve the problem. Keep in mind that this is a last resort; before using it, try changing the independent variables or using a log transformation.

Thanks for your reply. I tried the log transformation but the CUSUMSQ test still failed. I’ve read your post “Construction of structural break variables”. Could you write down the equation to remove the break point? Thanks.

Suppose you have X as the independent variable and D is its structural break dummy. Now run the regression X = bD + E. After estimating the regression, extract E; it will be the portion of X which is free of D.
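The regression X = bD + E without an intercept has a closed form when D is a single dummy, so the step can be sketched without any statistics package. The series and the dummy below are made up for illustration, and the function name is hypothetical.

```python
def remove_break(x, d):
    """Regress x on d without an intercept; return the slope b and residuals E = x - b*d."""
    denom = sum(di * di for di in d)
    if denom == 0:
        raise ValueError("the dummy must be non-zero for at least one observation")
    b = sum(xi * di for xi, di in zip(x, d)) / denom
    residuals = [xi - b * di for xi, di in zip(x, d)]
    return b, residuals

# Hypothetical series with a level shift after the third observation.
x = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
d = [0, 0, 0, 1, 1, 1]                 # structural break dummy

b, e = remove_break(x, d)
print(b, e)
```

With a 0/1 dummy and no intercept, b is simply the mean of X over the break period. The residuals therefore leave the pre-break observations unchanged and demean the post-break observations.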

Thank you very much for your time and prompt reply. I’ve learned a lot from your blog. It is not something I can find in the text book or at school. Many thanks again.

I have one follow-up question. I found the following explanation on the EViews blog. So, should I ignore the failure of the CUSUMSQ test because of the explanation below, or do I need to eliminate the break in the independent variables using the regression X = bD + E?

“Stability in the context of the Pesaran Shin (1998) ARDL model is indeed an important subject. They make the assumption that the ARDL model being studied is in fact stable. In this regard, if you are simply looking to estimate an ARDL model to see if the estimates are valid, you should be concerned about stability. Luckily, this is easily verified by testing whether the root of the characteristic equation are outside the unit circle. In other words, does the ARDL lag polynomial produce stationary results. Nevertheless, the Pesaran, Shin, and Smith (2001) paper is a TEST for cointegration. In other words, it must allow for the possibility that the underlying cointegrating relationship may in fact NOT be stable. In this regard, the PSS(2001) paper does not a priori impose stability of the ARDL lag polynomial. However, if cointegration does indeed exist, the ARDL model will in fact be stable!”

Thanks again,

Kate

Hi, could you please suggest some literature that eliminates the structural break in independent variables in the way described above? Many thanks.

This is called data smoothing using regression. There are also other techniques available for smoothing the data that you can use.

If either the CUSUM or the CUSUMSQ is found to be unstable (which implies that there is a structural break in the data), are the long-run estimates invalid? If they are invalid, what is the appropriate method to make the model stable (in other words, to make the estimates valid) when there is no longer any chance of adding a new variable (as when testing an empirical model where the variables are fixed)? Kindly let me know, with a reference.

The graphs show that the model is sensitive to breaks. It means that if there are breaks in the data, the results will not be reliable. There are two ways to deal with this: either add a structural break of the dependent variable to the model, or increase the lag order to absorb the change.

Dear Noman,

I have a silly question. If I am estimating a model with two dependent variables plus 1 lag each, plus one lag of the independent variable, do I have to look at the F critical values corresponding to k=5 or k=2?

Thanks a lot.

It is equal to the number of long-run independent variables, so k = 2.

Thanks a lot for your help.

Kindly help me with how to write the model, given that I have generated the coefficients.

See past papers, the method is standard