Pseudo Code

Pseudo-coding is a useful way to organize a set of calculations before you start to write actual instructions. In many cases, if you try to jump straight into "coding", you will get things out of order, or skip a key step. In pseudo-coding, you start by writing what you need to do in plain English (or French or whatever). For instance,

Loop over possible breaks

Do unit root test given the breaks

If this is least favorable to the unit root hypothesis

Save it

End

which is a fairly general description of a unit root test allowing for breaks. You then start to replace the plain language with actual code to do the required calculations and tests. It's usually a good idea to convert the plain text to a comment so you can remember what a section is doing.

It's often possible for the first attempt to include at least some actual code, particularly for the control instructions. For instance, in

do time=1871:1,1970:1

...update particles at time given time-1...

end do time

the first and last lines are legal RATS codes, and we just need to flesh out the "update..." part. What's important here is that we've made clear that we've set the loop so that it will compute the particles at <<time>> given <<time-1>>. One common problem that many people have is to write a loop without having a clear idea of what is supposed to be done on a trip through it.

It's usually a good idea to replace the sections of pseudo-code in stages. For instance, the following is the setup generated by the simplest case of the Monte Carlo/Gibbs Sampling generator wizard (for simple Monte Carlo):

COMPUTE NDRAWS=100

* Do any calculations that aren't specific to a draw.

* Initialize anything needed for holding results of draws

DO DRAW=1,NDRAWS

* Do new draw from distribution

* Do bookkeeping here. Save results for entry DRAW

END DO

At the moment, this does nothing other than run an empty loop 100 times. In general, you shouldn't start with the "bookkeeping" since until you actually have the draws and related calculations working, there's really nothing to save. Note that the wizard intentionally (by default) generates code for a rather small number of draws. A common error is to start with a "production" number of draws (like 10000), which typically requires shutting off the output from any calculations done inside the loop. As a result, you might miss the fact that (for instance), a nonlinear estimation fails to converge or a series being generated produces NA's where it shouldn't. With a small number of draws you can use PRINT options or PRINT or DISPLAY instructions to check that you're getting the types of results that you need. Once you're satisfied that you are generating proper results, you can turn off the "debugging" output and increase the number of draws. (One of the reasons for the wizard writing the program with a COMPUTE instruction for NDRAWS is to make it easier to change that).

This generates the empirical distribution for the AR coefficient when running a random walk on its lag. (This should be skewed left, away from the true value of 1). So "new draw from distribution" requires generating a random walk. One thing that we have to decide first is how many observations to use for that. If we decide upon 250, we can put

compute nobs=250

outside the loop and

set(first=0.0) y 1 nobs = y{1}+%ran(1.0)

inside it. This is one of those situations where we could easily make a mistake—a random walk is generated recursively, and if we don't include the FIRST option (or do something similar), Y{1} won't exist when entry 1 is being computed so Y(1) will be NA, which will mean that Y(2) will be NA, etc. If we just put in a PRINT or GRAPH instruction for Y right here, we'll see whether we're actually generating data.

For each draw for the random walk, we need to run a LINREG of Y on its lag. We might now have a program which looks like

COMPUTE NDRAWS=100

compute nobs=250

* Initialize anything needed for holding results of draws

DO DRAW=1,NDRAWS

set(first=0.0) y 1 nobs = y{1}+%ran(1.0)

linreg(print) y

# y{1}

* Do bookkeeping here. Save results for entry DRAW

END DO

We've put a PRINT option on the LINREG so we can do a run through this at a low draw count and see if the results look reasonable. (We expect the coefficients to be near one, with more below than above).

Now that we're generating data and the regressions seem to work, it's time to figure out what we need to save. To compute an empirical distribution, we need to save all the draws. The easiest way to do this is to save them in a SERIES with NDRAWS elements. (You can more easily analyze the contents of a SERIES than, say, a VECTOR). The "initialize anything needed..." is now something like

set betas 1 ndraws = 0.0

which creates BETAS as a SERIES and makes sure it has NDRAWS elements, and the "Do bookkeeping here..." is

compute betas(draw)=%betas(1)

This last step often creates confusion. Note that the comment is to "save results for entry DRAW". A common error is to do something which saves information for more than just entry DRAW. For instance,

set betas = %beta(1) (don't do this!!!)

overwrites the entire BETAS series, so all you would get is a series filled with just the final value. Note that

set betas draw draw = %beta(1)

would be OK as it only changes the one element. We now would have

COMPUTE NDRAWS=100

compute nobs=250

set betas 1 ndraws = 0.0

DO DRAW=1,NDRAWS

set(first=0.0) y 1 nobs = y{1}+%ran(1.0)

linreg(print) y

# y{1}

compute betas(draw)=%beta(1)

END DO

As a quick check that we're getting the numbers out correctly, we can do

stats(fract) betas

after the loop. We would hope to see something like (actual numbers will vary, since this uses random draws):

Statistics on Series BETAS

Observations 100

Sample Mean 0.991939 Variance 0.000170

Standard Error 0.013022 SE of Sample Mean 0.001302

t-Statistic (Mean=0) 761.766092 Signif Level (Mean=0) 0.000000

Skewness -2.087126 Signif Level (Sk=0) 0.000000

Kurtosis (excess) 6.865768 Signif Level (Ku=0) 0.000000

Jarque-Bera 269.013096 Signif Level (JB=0) 0.000000

Minimum 0.925222 Maximum 1.008803

01-%ile 0.951259 99-%ile 1.007674

05-%ile 0.967839 95-%ile 1.005666

10-%ile 0.976766 90-%ile 1.004053

25-%ile 0.986265 75-%ile 1.000846

Median 0.994733

We should see 100 observations, with variation among them (the error cited above of overwriting the results would show up here with a zero variance). Now that it looks like we have everything working, we can increase the value of NDRAWS and replace PRINT with NOPRINT on the LINREG inside the loop. We then need to decide what we want to do with the generated values of BETAS. We add code to compute (with DENSITY) and graph the empirical distribution. The final program looks like

COMPUTE NDRAWS=10000

compute nobs=250

set betas 1 ndraws = 0.0

DO DRAW=1,NDRAWS

set(first=0.0) y 1 nobs = y{1}+%ran(1.0)

linreg(noprint) y

# y{1}

compute betas(draw)=%beta(1)

END DO

density(smoothing=1.5) betas / xb fb

scatter(style=lines)

# xb fb