Estimating Non-Linear ARDL in STATA

In previous posts I covered ARDL cointegrating bounds using Microfit, Eviews, and Stata. The comments and suggestions I received on those posts were very helpful. Based on my experience so far, I would recommend Microfit or Eviews for ARDL, but be cautious of calculation glitches if you are using a cracked version of Eviews.

This post illustrates the non-linear ARDL cointegrating bounds approach, also called asymmetric ARDL (NARDL), proposed by Shin, Yu & Greenwood-Nimmo (2014). The idea behind this model is to question the standard assumption of symmetric estimates, under which the effect of an increase in a variable is equal and opposite to the effect of a decrease in the same variable. The study mentions a few cases where this may not hold, such as the creation and destruction of jobs in booms and recessions.
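To make the asymmetry idea concrete, the partial sum decomposition used by Shin, Yu & Greenwood-Nimmo (2014) can be sketched as follows. The symbols are assumptions introduced here for illustration: $x_t$ is the independent variable and $x_0$ is its initial value.

```latex
x_t = x_0 + x_t^{+} + x_t^{-}, \qquad
x_t^{+} = \sum_{j=1}^{t} \max(\Delta x_j, 0), \qquad
x_t^{-} = \sum_{j=1}^{t} \min(\Delta x_j, 0)
```

Here $x_t^{+}$ accumulates only the increases and $x_t^{-}$ only the decreases, so each part can receive its own coefficient in the model.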

In the example below, nardl_data contains unemployment (the dependent variable) and the industrial production index (the independent variable). You can import this data into Stata by simply copying and pasting it into the Data Editor.

Once the data is imported, you have to tell Stata that it is time series data. The following command is used for this:

tsset time

This makes all the time series commands functional. To estimate the NARDL, the following files must be downloaded, uncompressed, and pasted into the Stata/ado/base/n folder, wherever Stata is installed; the command will then work in Stata.

In the command below, p() and q() are the number of lags of the dependent and independent variables, respectively. You can identify the optimal lag using the ‘varsoc’ command in Stata.
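A quick way to check that the files ended up where Stata can find them is sketched below. It uses only built-in Stata commands. Placing the files in the PERSONAL directory instead of the base folder is a suggestion made here, not part of the original instructions; the base folder is meant for official Stata files and may be overwritten by updates.

```stata
* List the system directories, including PERSONAL and PLUS
sysdir

* Show the order in which Stata searches these directories for ado-files
adopath

* Clear any cached programs, then confirm that Stata can locate nardl.ado
discard
which nardl
```

If `which nardl` reports that the command was not found, the file is not on the ado-path and the nardl command will be unrecognized.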

nardl un ip, p(2) q(2)
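The lag orders passed to p() and q() can be checked with a lag-selection table along the following lines. This is only a sketch: maxlag(8) is an arbitrary choice made here, and varsoc fits a VAR on both variables jointly, so it suggests a single lag order rather than separate values for p() and q().

```stata
* Assumes the data are already tsset and contain un and ip
* Reports FPE, AIC, HQIC and SBIC for lag orders 0 through 8
varsoc un ip, maxlag(8)
```

The lag marked with an asterisk under a given criterion is the order that criterion prefers; it can serve as a starting point or upper bound for p() and q().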

The table above is the standard one-step ECM. The first coefficient is the convergence coefficient, and x1 is the first independent variable, where x1p is the increasing portion of x1 and x1n is the decreasing portion of x1.
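The increasing and decreasing portions can be rebuilt by hand to see what they contain. The sketch below assumes the data are tsset on time without gaps and that ip has no missing values. The variable names d_ip, d_ip_pos, d_ip_neg, ip_pos and ip_neg are made up for this illustration and are not the names nardl uses internally.

```stata
* Running sums depend on the sort order, so sort by the time variable first
sort time

* First difference of the independent variable (missing in the first period)
generate double d_ip = D.ip

* Keep only the positive or only the negative changes
generate double d_ip_pos = max(d_ip, 0) if !missing(d_ip)
generate double d_ip_neg = min(d_ip, 0) if !missing(d_ip)

* Cumulative sums; sum() treats missing as zero, so both series start at 0
generate double ip_pos = sum(d_ip_pos)
generate double ip_neg = sum(d_ip_neg)

* Check: the initial value plus both partial sums reproduces ip
assert reldif(ip, ip[1] + ip_pos + ip_neg) < 1e-8
```

The series ip_pos never falls and ip_neg never rises, which is what lets the model estimate separate effects for increases and decreases.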

Below is the F bounds test statistic, which is 2.22 here. Its critical values are the same as those of the simple ARDL cointegrating bounds test. Here the statistic is smaller than the critical values, so the null hypothesis of no cointegration cannot be rejected.

The table below shows the long-run increasing and decreasing effects of the independent variable on the dependent variable. When the independent variable increases, unemployment decreases by 14.71%, but when the independent variable decreases, unemployment increases by 48.69%.
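The link between the one-step ECM and these long-run effects can be sketched with the error-correction form in Shin, Yu & Greenwood-Nimmo (2014). The symbols are assumptions introduced here: $\rho$ is the convergence coefficient, $\theta^{+}$ and $\theta^{-}$ are the coefficients on the lagged levels of the increasing and decreasing portions, and $p$, $q$ are the lag orders.

```latex
\Delta y_t = \rho\, y_{t-1} + \theta^{+} x^{+}_{t-1} + \theta^{-} x^{-}_{t-1}
  + \sum_{j=1}^{p-1} \varphi_j \,\Delta y_{t-j}
  + \sum_{j=0}^{q-1} \left( \pi_j^{+} \Delta x^{+}_{t-j} + \pi_j^{-} \Delta x^{-}_{t-j} \right)
  + e_t

\beta^{+} = -\frac{\theta^{+}}{\rho}, \qquad \beta^{-} = -\frac{\theta^{-}}{\rho}
```

The long-run coefficients are therefore the level coefficients divided by the convergence coefficient, which is why they differ from the values in the ECM table. The sign reported for the decreasing effect depends on whether the software states it per unit decrease, so it is worth checking the note under the output table.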

Long-run and short-run asymmetry are each tested using an F test. Since only the long-run F test is significant, there is only long-run asymmetry.

After estimating the model, four types of diagnostics are reported. Since all of them are insignificant, there is no evidence of autocorrelation, heteroscedasticity, misspecification, or non-normality, respectively.
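For readers who want to see what such tests look like, the following is a rough by-hand sketch with regress and test, not the internals of the nardl module. It assumes the data are tsset on time without gaps, that p(2) q(2) corresponds to one lagged difference of un and to current plus one lagged difference of each partial sum, and it uses the made-up names ipp and ipn.

```stata
* Partial sums of increases and decreases in ip
* (max() and min() ignore the missing first difference, so both start at 0)
sort time
generate double ipp = sum(max(D.ip, 0))
generate double ipn = sum(min(D.ip, 0))

* One-step error-correction regression
regress D.un L.un L.ipp L.ipn LD.un D.ipp LD.ipp D.ipn LD.ipn

* Bounds-type F statistic: joint significance of the lagged levels
* (compare with bounds critical values, not the standard F table)
test L.un L.ipp L.ipn

* Long-run symmetry: equal level coefficients on the two partial sums
test L.ipp = L.ipn

* Short-run symmetry (additive form): equal sums of the difference terms
test D.ipp + LD.ipp = D.ipn + LD.ipn
```

Testing equality of the level coefficients is equivalent to testing equality of the long-run coefficients, because both are divided by the same convergence coefficient.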

We can also generate a graph by adding the ‘plot’ option to the command, and add a confidence interval using the bootstrap and level options. The horizon option sets the number of periods over which the graph is constructed.

nardl un ip, p(2) q(4) plot horizon(40) bootstrap(100) level(95)

In the above figure, we can see that a decrease in IP (industrial production) has a positive effect on UN (unemployment), shown by the red line, while an increase in IP has a temporary negative effect on UN, shown by the green line. The blue line shows the asymmetry, which increases over time.

Your comments and suggestions are welcome.






34 thoughts on “Estimating Non-Linear ARDL in STATA”

  1. Maruf ahmed says:

    Dear Author,
    I would like to know what should be done when the CUSUM and CUSUMQ are unstable (more precisely, if there is a structural break in the data). How can the ARDL model be run in that case? Would you please describe the ARDL procedure for that case?
    Thanks. All the best.

  2. ayu citra muthia says:

    Before I run nardl_test, I should run nardl.ado, right? I would like to know why nardl could not be run. After I click do on nardl.ado, I click do on nardl_test and the program says unable to change to /adodev/nardl. Why? I tried to run it without the nardl.ado file and just used the nardl_test file; at nardl un ip if tin(1983m3,2003m11), p(12) q(5), Stata said unrecognized command: nardl. I hope you can answer my question because I am so confused about how to run nardl and get the cumulative dynamic effect. Thank you so much

    1. ayu citra muthia says:

      When I ran syntax varlist in nardl.ado, it said varlist required. Why? And what should I do with the red text such as “you must provide at least 2 variables”? Should I change or delete those words? And when I ran the lines that generate the dependent variable and regressors, local thisvariable: word 1 of `varlist' and qui gen _y = `thisvariable', it said invalid syntax. What should I do? Thanks

      1. ayu citra muthia says:

        I would like to know how to choose the L numbers in the constraints, because the L numbers I saw looked random. For example: constraint 1 L2._dy L3._dy L4._dy L5._dy L6._dy L7._dy L8._dy L9._dy L10._dy, constraint 2 L1._dx1p L3._dx1p L4._dx1p, constraint 3 L1._dx1n L2._dx1n L3._dx1n. Thank you so much

  3. ayu citra muthia says:

    In this blog you first used nardl un ip, p(2) q(2), and after that you used p(2) q(4) in nardl un ip, p(2) q(4) plot horizon(40) bootstrap(100) level(95). Why was the number of lags changed? Thank you

  4. khan says:

    Dear Sir Noman!
    Thank you very much for the presentation of NARDL. It is great and highly appreciated.
    Secondly, I would like to ask: you mention in this tutorial that Y L1 is actually the ECM convergence coefficient. But as Y is the independent variable, isn’t it the coefficient on lag 1 of the dependent variable? Also, you estimated the default model using the variables UN and IP, but the positive and negative coefficients below are different from those in the NARDL model estimated above. If I am wrong please correct me. I need some guidance in this regard.
    It is highly appreciated.

    1. Noman Arshed says:

      In the example, Y is the dependent variable. The difference in the values comes from the calculation done by the module, as proposed in the paper. They are different because when we calculate the long-run coefficients, we have to divide them by the convergence coefficient; that is why the value of IP is different in the two tables.

  5. Sarah Maskri says:

    First of all, I want to thank you, sir, for this interesting blog. I would like to ask how to write the command when we have 2 independent variables. For example, inflation is the dependent variable while oil price and real GDP are the independent variables, so how do we specify the command and especially the lags?
    Note: I didn’t find such a case in the nardl help, so I hope you will help with this.

      1. Sarah Maskri says:

        It’s working but I still have a problem: when the results are shown there are just the regression results and the asymmetry statistics, while the cointegration test statistic and the model diagnostics are absent. So what is the problem and what should I do to get them?

    1. ayu citra muthia says:

      1. If I have 3 exogenous variables and I use varsoc to find the maximum lag, and the result shows a different lag for every variable, for example X1=4, X2=1, X3=2, and dependent variable=2, how should I write p() and q() in the do-file? Is it p(2) q(4)?
      2. Do the different varsoc lags for each variable have any effect on the constraints?
      Thank you so much for your help

      1. Noman Arshed says:

        Use the maximum P and maximum Q; the estimation will then find the optimal lags within the selected range. Yes, a different lag order can have a different effect if the model is sensitive to lag order.

      1. ayu citra muthia says:

        But how many lags should the maximum P and maximum Q be? And I want to know, is it true that the function of the constraints is to select the lag order of the first-difference variables, so that the variables can have a bigger p-value than before? Thank you, sir.

  6. belkhir says:

    Hi, thank you very much for answering me. I have another question.
    When we use the NARDL model, does it mean our model is non-linear, i.e. f(x) = x1*x2 rather than f(x) = x1+x2+ut?

  7. Ayu cm says:

    Before we estimate nardl, should we check serial correlation, heteroscedasticity, and the Ramsey test? And how do we solve it if our model has serial correlation or the Ramsey test is significant at 5%? Thank you

      1. Ayu cm says:

        But what if the p-value of the heteroscedasticity, serial correlation, or Ramsey test is significant? Does it mean the model has heteroscedasticity or serial correlation, or is not stable? Thank you

      2. Noman Arshed says:

        Yes, if they are significant it means there is a problem. For that you need to study the null and alternative hypotheses used by the maker of this manual.
