Newbie seeking help

**Reese2101** » Mon Oct 24, 2011 3:16 am

Hi, I'm totally new to this programme, having started using it yesterday. I am faced with the simple problem of simulating a linear model:

Yi = Beta1 + Beta2*Xi + Ui

where Xi and Ui are independent and i.i.d. Beta1 must be set equal to 0 and Beta2 equal to 1.
I am supposed to draw the Xi's and the Ui's from normal distributions, with Xi ~ iid N(0,1) and Ui ~ iid N(0,2), and then investigate by simulation whether the OLS estimator of Beta2, Beta2hat, is consistent and whether n^(1/2)(Beta2hat-Beta2) converges in distribution.
My plan is to write a do loop of 10000 simulations for each of several sample sizes: first n=100, then n=1000, n=10000 and so on. I would then like to draw histograms of Beta2hat and n^(1/2)(Beta2hat-Beta2) to get an indication of the consistency and of the asymptotic distribution of Beta2hat.
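
For reference, a sketch of what such a simulation would be expected to show. This is the standard textbook result for independent regressors and errors with finite variances. The symbols \sigma_x^2 and \sigma_u^2 (the variances of Xi and Ui) are introduced here for illustration; the post does not say whether the 2 in N(0,2) is a variance or a standard deviation, so it is left symbolic.

```latex
\hat\beta_2 - \beta_2
  = \frac{\sum_{i=1}^{n} (X_i - \bar X)\,U_i}{\sum_{i=1}^{n} (X_i - \bar X)^2}
  \xrightarrow{\;p\;} 0,
\qquad
\sqrt{n}\,\bigl(\hat\beta_2 - \beta_2\bigr)
  \xrightarrow{\;d\;} N\!\left(0,\ \frac{\sigma_u^2}{\sigma_x^2}\right).
```

Under these assumptions the histogram of Beta2hat should tighten around 1 as n grows, while the histogram of n^(1/2)(Beta2hat-Beta2) should keep roughly the same spread.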

As I said I'm totally new at this so bear with me. My code so far is:

allocate 100

compute beta1=0, beta2=1, sigmax=1,sigmau=2
compute ndraws = 10000

do draw=1,ndraws

set x = %ran(sigmax)
set u = %ran(sigmau)

set y = beta1+beta2*x+u

linreg(noprint) y 
# constant x

end do draw

As you can see, I have only got as far as the loop. It reports error message SX22.
What am I doing wrong here?
I am a bit confused about the difference between the compute and set instructions. I'm not yet sure when to use which, so that may be the problem. But the error message indicates that the error is on line 9 of the loop, that is, at the linreg instruction.
I have also attached the code so that you can execute it, change it and upload it back with your answer, in case that is easier.

I'm really stuck, so I'm hoping to get some help.

**TomDoan** » Mon Oct 24, 2011 11:47 am

What you posted here is correct. What's on the attachment isn't. You want to use SET to define X and U as you're doing here, and not COMPUTE as you're doing in the attachment.

Also, you should use 0.0, 1.0 and not 0 and 1 in defining the coefficients and standard deviations. As you have those written, you're defining integers, and you won't be able to redefine them as 0.5 or something like it (if you wanted to).

The Johnston and DiNardo textbook example johnp351.rpf is the closest example we have to what you're trying to do. It's doing 2SLS rather than OLS, but it's the saving and processing of the coefficients that you need to add to what you have, and that's the same in both cases.

**Reese2101** » Tue Oct 25, 2011 11:52 am

Hi, Tom. Thank you very much for helping me here. I changed the code a little bit so that it looks like this:

compute ndraws = 10000
compute iobs=100

allocate iobs

compute beta1=0 
compute beta2=1

do draws=1,ndraws

set x = %ran(1.0)
set u = %ran(2.0)

set y = beta1+beta2*x+u

linreg(noprint) y
# constant x

compute betahat=%beta(2)
compute betahatcon=(iobs**(1/2))*(betahat-beta2)

end do draw

It seems to work, and it generates 10000 betas and "scaled" betas.
However, firstly, I tried to change the values to reals in the compute instructions for the betas, as I believe you meant for me to do. But then it reports a syntax error claiming it expected an integer and not a real. Should I change this, and if so, how?
Secondly, I did what you said and looked up the example johnp351.rpf. As in that example, I used the instruction compute betahat=%beta(2). But in the example he (assuming it's a guy who did the coding) is able to use the statistics instruction on betahat. That instruction requires a series and not a real, so of course it reports an error for me. My question is whether I'm supposed to create some kind of series for betahat, and how? I'm not sure whether what I'm doing wrong has to do with the fact that I'm not doing what he does in his example, where he writes:

set b 1 ndraws = 0.0

and

compute b(draws)=%beta(1)

In particular, I wonder whether the parenthesis (draws) changes something in his code such that he can use the statistics instruction.

Thirdly, he uses the statistics instruction after the do loop. If I use the display instruction after my do loop, for example, I can see that it prints a single value. But I need statistics on the collection of all the generated betas. Am I missing something here?

Hope to hear from you soon. As I said, I only started using the programme a couple of days ago, so I'm still learning. Also, I'm Danish, in case you're wondering about all the possible misspellings and grammar mistakes.

**TomDoan** » Tue Oct 25, 2011 1:35 pm

The integer vs real is because you have:

compute beta1=0
compute beta2=1

That defines beta1 and beta2 as integers since the RHS values are integers. Change this to

compute beta1=0.0
compute beta2=1.0

(or put DECLARE REAL BETA1 BETA2 on a line before the COMPUTE instructions) and they'll be real-valued and you can change them to other non-integers.

The comments in the Johnston example explain why "B" (and "TSTAT") are set up the way they are:

*
* The 2SLS coefficients and t-statistics are put into "ndraws" entries
* of the series b and tstat. It's often easier to analyze the output
* when it's in a series.
*
set b 1 ndraws = 0.0
set tstat 1 ndraws = 0.0

You want to do exactly the same type of thing, just saving different statistics. For each statistic that you want to save, do something like

set mystat 1 ndraws = 0.0

before the loop and

compute mystat(draws)=calculation of mystat

inside the loop.

**Reese2101** » Thu Oct 27, 2011 5:12 am

Hi, Tom. Thank you for the reply.
I have changed the code as you said, so that I use the set instruction before the loop and compute inside the loop. It seems to work for the beta coefficients. However, I am also supposed to create the expression:

sqrt(n)*(betahat-beta)

and see whether it converges in distribution by drawing a histogram for it alongside the histogram for beta2.
In the code you can see that I declare a vector nbeta2 with 10000 entries so that the expression above has the correct dimensions. My code is now as follows (as you can see, I have also begun to graph the histogram of beta2, but I'm not finished with that, since I have been trying to graph a normal distribution in the same picture using the example on page 229 of the Reference Manual; I haven't got that to work yet):

compute ndraws=10000
compute iobs=100

allocate iobs

set beta1 1 ndraws = 0.0
set beta2 1 ndraws = 1.0
declare vector nbeta2(10000)

do draws=1, ndraws

set x = %ran(2.0)
set u = %ran(1.0)
set y = beta1+beta2*x+u

linreg(noprint) y
# constant x

compute beta2(draws)=%beta(2)
compute nbeta2=(sqrt(iobs))*(%beta(2)-beta2)

end do draws

statistics beta2

density(type=histogram) beta2 1 ndraws xbeta2 fbeta2
spgraph
scatter(style=bargraph,overlay=line,ovsamescale, header="Case 1. Distribution of the beta 2 estimate with i=100")
# xbeta2 fbeta2

display(store=s) "Skewness" %skewness $
"\\Excess Kurtosis" %kurtosis
grtext(position=upright) s

spgraph(done)

However, it comes up with error message MAT2, stating a problem with the dimensions involved in the subtraction operation inside the loop. As far as I can see, the problem is that %beta(2) is a 10000x1 vector but the beta2 which I set before the loop is a scalar. Is that right, and what am I doing wrong? I would like to have a series which I can graph just like the beta estimates.

Aside from this simulation, I am supposed to do the same for two other cases. One in which the x's are still normally distributed but the u's are drawn from a t-distribution. I have attached that code as Case 2. The problem is that I get a really strange picture of the estimates of beta. It's more or less just a degenerate distribution. The only thing I can draw upon is the book by Hamilton (1994), page 209-209, where he says that if the x's are stochastic and the errors are non-Gaussian, the unconditional distribution of beta will be non-Gaussian but the estimator will be consistent. And I can certainly see that what I have graphed is non-Gaussian, but it also looks inconsistent, since it seems to center around 0 instead of 1.

Case 3 involves drawing the u's from a normal distribution but specifying the x series as a random walk:
x=x{1}+epsilon
I have replicated a little of the code from Johnp351.prg, which you referred me to, since I solve and express the x series as a sum of the epsilons. I do this instead of setting x=x{1}+epsilon because the programme gets confused about where to find the x series.
I attached that code as well. The problem is that I get the same consistency problem here as in case 2, since it centers around 0 instead of 1.

I know this is a lot to state in one reply and it's not like you have to do the assignment for me. If you can just check if I'm doing something terribly wrong. Let me sum up my questions.

1) In all cases, how can I generate the expression sqrt(n)*(betahat-beta) and plot it like the beta estimates?
2) In case 2, how come I get such a strange-looking figure with all the mass around 0 instead of 1? Hamilton writes that it will be unbiased and non-Gaussian. Non-Gaussianity seems to be fulfilled, but my estimates look really biased.
3) In case 3, am I coding the random walk x series correctly, and how come I get a strange-looking figure as in case 2? I know that Hamilton writes that I should expect bias with dependent series. But could I be doing something wrong?

**TomDoan** » Thu Oct 27, 2011 10:35 am

Ask yourself the following.

1. What are the values which are fixed as part of the setup of the problem?
2. What are the values which are generated for each draw that you need to save and analyze later?

The answer to #1 is beta1 and beta2. Those are single values, not series. You put in their values using COMPUTE, that is

compute beta1=0.0
compute beta2=1.0

as I said in a previous post.

The answer to 2 is not beta2. beta2 is the fixed number in the data generating process. You can call the estimate beta2hat or something like that, but you cannot call it beta2. It looks like you want to save the beta2hats, and the sqrt(n)*(beta2hat-beta2) values for each draw. These will go into SERIES (don't make them VECTORS; it's easier to use them as SERIES), so you can analyze them after you've done all your simulations. So you want to do

set beta2hat 1 ndraws = 0.0
set nbeta2 1 ndraws = 0.0

before the loop. This makes them into series with the proper length. Then inside the loop, you need

compute beta2hat(draws)=expression that you want to save
compute nbeta2(draws)=expression that you want to save

In your last attempt, you were overwriting the values of beta2 because you had them as series rather than scalars, and you weren't saving the nbeta2 in separate entries.
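
For readers who want to cross-check the design outside RATS, here is a minimal Python sketch of the same pattern: fixed scalars for the data generating process, storage allocated before the loop, and one entry filled per draw. The function names `ols_slope` and `simulate` are hypothetical. The default `sd_u=2.0` mirrors the `%ran(2.0)` used earlier in the thread and is an assumption about what N(0,2) means.

```python
import math
import random
import statistics


def ols_slope(x, y):
    """Slope from regressing y on a constant and x."""
    xbar = statistics.fmean(x)
    ybar = statistics.fmean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


def simulate(ndraws, nobs, beta1=0.0, beta2=1.0, sd_x=1.0, sd_u=2.0, seed=0):
    """Return (beta2hat, nbeta2), each holding one saved value per draw."""
    rng = random.Random(seed)
    # Same role as "set beta2hat 1 ndraws = 0.0": allocate storage first.
    beta2hat = [0.0] * ndraws
    nbeta2 = [0.0] * ndraws
    for draw in range(ndraws):
        x = [rng.gauss(0.0, sd_x) for _ in range(nobs)]
        u = [rng.gauss(0.0, sd_u) for _ in range(nobs)]
        y = [beta1 + beta2 * xi + ui for xi, ui in zip(x, u)]
        b = ols_slope(x, y)
        # Same role as "compute beta2hat(draws)=...": fill one entry per draw.
        beta2hat[draw] = b
        nbeta2[draw] = math.sqrt(nobs) * (b - beta2)
    return beta2hat, nbeta2


if __name__ == "__main__":
    for nobs in (100, 400):
        bhat, scaled = simulate(ndraws=500, nobs=nobs)
        # The spread of bhat should shrink with nobs; that of scaled should not.
        print(nobs, statistics.fmean(bhat), statistics.stdev(bhat),
              statistics.stdev(scaled))
```

The true `beta2` stays a single number throughout, and only the per-draw estimates go into the stored lists. That is the distinction between scalars and series described above.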

**Reese2101** » Thu Oct 27, 2011 2:06 pm

Thanks again for the help. I'm feeling like a moron here, but I think I may see what you're getting at. I've rewritten the code and attached it. As you can see, I have also tried to graph the normalized betahats.

I've done the same to the code for the two other cases. But even if the code is correct I get weird results for the two other cases.
With case 2 where I only change the fact that the u's are now drawn from a t-distribution I still get a "degenerate" distribution around 0. Also when it's normalized. So I must be doing something wrong...or not?
With case 3 where I change the x's to be a random walk the histogram of beta2hat is centered around 0 and the normalized version around -7. I made a third variable where I multiply by n instead of sqrt(n) but this gets even worse.

I'm sure that I'm still messing up but I feel like I'm getting closer.

**TomDoan** » Thu Oct 27, 2011 3:38 pm

The weird results on the 2nd one are actually what you should be seeing. Since this is a homework assignment, you'll have to figure out why. However, you're misinterpreting the graph. The "spike" that you see is just the y-axis. The actual histogram is so flat over such a wide range that it can't even be seen as separate from the x axis.

Your third is almost correct, except you need a SET X = EPS, not COMPUTE X = EPS as the third line of this:

set(first=0.0) eps = %ran(1.0)
accumulate eps
set x = eps

However, you can create X with the desired properties in a single instruction with

set(first=0.0) x = x{1}+%ran(1.0)
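
As a language-neutral check that the two constructions give the same series, here is a small Python sketch. It assumes, as in the instructions above, that the first entry is fixed at 0.0 and every later entry adds a fresh N(0,1) shock. The function names are hypothetical.

```python
import itertools
import random


def random_walk_by_cumsum(eps):
    """x_t = eps_1 + ... + eps_t: accumulate the shocks, then copy to x."""
    return list(itertools.accumulate(eps))


def random_walk_by_recursion(eps):
    """x_1 = eps_1, then x_t = x_{t-1} + eps_t for the remaining entries."""
    x = [eps[0]]
    for e in eps[1:]:
        x.append(x[-1] + e)
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    # First shock fixed at 0.0, mirroring the FIRST=0.0 option.
    eps = [0.0] + [rng.gauss(0.0, 1.0) for _ in range(9)]
    print(random_walk_by_cumsum(eps) == random_walk_by_recursion(eps))
```

Both functions perform the same additions in the same order, so the results match entry by entry.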

**Reese2101** » Sun Oct 30, 2011 3:38 am

Once again, thank you very much for the help Tom. I changed the instruction to set instead of compute in case 3.
And of course the strange looking figure in case 2 must come from the fact that I draw from a Cauchy distribution which as far as I remember does not have finite moments. Especially the variance is non-finite and that would influence the variance of my estimator.
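
A rough way to see this effect outside RATS is sketched below in Python. It assumes the errors are standard Cauchy (a t-distribution with 1 degree of freedom) and uses the interquartile range of the slope estimates as the measure of spread, since the variance does not exist in that case. The names `cauchy` and `slope_iqr` are hypothetical.

```python
import math
import random
import statistics


def ols_slope(x, y):
    """Slope from regressing y on a constant and x."""
    xbar = statistics.fmean(x)
    ybar = statistics.fmean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


def cauchy(rng):
    """Standard Cauchy draw by inverting the CDF of a uniform draw."""
    return math.tan(math.pi * (rng.random() - 0.5))


def slope_iqr(nobs, ndraws, draw_error, seed=0, beta2=1.0):
    """Interquartile range of the OLS slope over ndraws simulated samples."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(ndraws):
        x = [rng.gauss(0.0, 1.0) for _ in range(nobs)]
        y = [beta2 * xi + draw_error(rng) for xi in x]
        slopes.append(ols_slope(x, y))
    q1, _, q3 = statistics.quantiles(slopes, n=4)
    return q3 - q1


if __name__ == "__main__":
    for nobs in (50, 400):
        print(nobs,
              slope_iqr(nobs, 300, cauchy),
              slope_iqr(nobs, 300, lambda rng: rng.gauss(0.0, 2.0)))
```

With normal errors the interquartile range should shrink as the sample size grows. With Cauchy errors it should stay at about the same size, which matches a histogram that is flat over a very wide range.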
I am just wrapping up what I have. But I have one final question. I seem to recall that in case 3 the expression sqrt(T)*(betahat-beta) should approach a degenerate distribution when I'm faced with a unit root, whereas T*(betahat-beta) should approach a non-degenerate distribution: the Dickey-Fuller distribution. This should be non-normal and skewed to the left.
In the code I also generate the variable T*(betahat-beta) and draw its histogram. But it does not seem to become negatively skewed as the sample size grows.
So I'm almost certain that I must still be doing something wrong?

**TomDoan** » Mon Oct 31, 2011 11:14 am

You would get the Dickey-Fuller distribution if the X is the lag of Y. That's not the case here. Instead, this is an "Engle-Granger" regression; by construction Y and X are cointegrated and the regression estimates are known to be "superconsistent".
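
Superconsistency can be illustrated with a short Python sketch that compares the two scalings at two sample sizes. It assumes x is a random walk with N(0,1) increments, u is N(0,1) and independent of x, and beta is 1. The error standard deviation is an assumption, since the Case 3 code is only in the attachment. The name `scaled_iqrs` is hypothetical.

```python
import itertools
import math
import random
import statistics


def ols_slope(x, y):
    """Slope from regressing y on a constant and x."""
    xbar = statistics.fmean(x)
    ybar = statistics.fmean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


def scaled_iqrs(nobs, ndraws, seed=0, beta2=1.0):
    """Interquartile ranges of sqrt(T)*(bhat-beta) and T*(bhat-beta)."""
    rng = random.Random(seed)
    errors = []
    for _ in range(ndraws):
        # Random walk regressor: cumulative sum of N(0,1) shocks.
        shocks = [rng.gauss(0.0, 1.0) for _ in range(nobs)]
        x = list(itertools.accumulate(shocks))
        y = [beta2 * xi + rng.gauss(0.0, 1.0) for xi in x]
        errors.append(ols_slope(x, y) - beta2)
    q1, _, q3 = statistics.quantiles(errors, n=4)
    iqr = q3 - q1
    return math.sqrt(nobs) * iqr, nobs * iqr


if __name__ == "__main__":
    for nobs in (50, 400):
        print(nobs, *scaled_iqrs(nobs, 300))
```

If the estimator converges at rate T, the spread of sqrt(T)*(betahat-beta) should shrink toward zero as T grows while the spread of T*(betahat-beta) stays roughly stable.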

The RATS Software Forum