PREGRESS
-
smukherjee01
- Posts: 21
- Joined: Mon Jul 14, 2014 9:58 am
Re: PREGRESS
Apologies for that.
I am trying to be explicit now. What I have is this:
Take January as an example. January has 31 days, and I have a series of forecasts for 31st January. The first one is made on 1st January for 31st January. Then I have a forecast made on 2nd January for 31st January. Going on, I have a forecast made on 30th January for 31st January. Then I have a realised value on 31st January.
Similarly in February, I have a forecast made on 1st February for 28th February and so on. Then I have a realised value on 28th February.
In this way I have 27 months of observations, and every month follows the same pattern.
I create a series by subtracting successive forecasts. Since the forecast for 31st January was made on both 1st January and 2nd January, I subtract one from the other to see what update people make each day (a daily revision to the end-of-month forecast). Then I do the same with the forecasts made on 2nd and 3rd January, and so on. So I have a daily series of this “revision”. I do the same for February, and then for all the other months. Call this whole column X.
Then I create another series by subtracting each forecast from the end-of-month realised value, which gives another daily series. That means subtracting the forecast made on 1st January from the realised value on 31st January, then subtracting the forecast made on 2nd January from the same realised value, and so on. Similarly, for February I subtract the forecast made on 1st February for 28th February, then the forecast made on 2nd February for 28th February, from the end-of-month realised value, and so on. I do this for all 27 months. Call this whole column Y.
I also have daily data for the same 27 months on another variable. Call this whole column Z.
I want to regress Y on X and Z.
Now as the data is not in a smooth format, I can’t perform a pure OLS.
Each month here is separate.
So I did a PFORM to create a uniform series, and then regressed Y on X and Z using METHOD=POOLED (the column Z was left out of the program I posted for you).
Where I got stuck is that I can't make RATS understand that the months are separate and have different numbers of observations, something like a pooled cross-section time series. The time structure needs to be preserved.
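In symbols, the construction described above could be sketched as follows. The notation is introduced here only for illustration: $F_{m,d}$ is the forecast made on day $d$ of month $m$ for that month's end, $R_m$ is the realised month-end value, and $Z_{m,d}$ is the other daily variable. The sign and dating of the revision are assumptions, since the description does not say which forecast is subtracted from which.

```latex
\begin{aligned}
X_{m,d} &= F_{m,d+1} - F_{m,d}
  && \text{(daily revision; sign and dating assumed)} \\
Y_{m,d} &= R_m - F_{m,d}
  && \text{(realised value minus forecast)} \\
Y_{m,d} &= \alpha + \beta\, X_{m,d} + \gamma\, Z_{m,d} + u_{m,d}
  && \text{(regression of } Y \text{ on } X \text{ and } Z\text{)}
\end{aligned}
```

Here $m = 1,\dots,27$ indexes the months and $d$ runs over the forecast days within each month, so the number of rows differs from month to month.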
Re: PREGRESS
Your data set has quite a few missing values, sometimes several in a row and the missing values aren't consistently placed from one series to another. How are you planning to handle that?
-
smukherjee01
- Posts: 21
- Joined: Mon Jul 14, 2014 9:58 am
Re: PREGRESS
Thanks.
When you suggested the indexing with respect to time
SET YM = %year(t)*12+%month(t)
in your previous comments, that indexed the observations so that each month is kept separate from the others.
I am planning to use the BLOCK or GROUP option to tell RATS which observations belong to each month. Something like the following:
PREG(EFFECTS=YM,BLOCK=YM,METHOD=POOLED,INSTRUMENTS,LWINDOW=NEWEY,LAGS=1)
or
PREG(EFFECTS=YM,GROUP=YM,METHOD=POOLED,INSTRUMENTS,LWINDOW=NEWEY,LAGS=1)
I am confused about whether I should use BLOCK or GROUP.
Using either of them would let me avoid writing the following commands at the beginning of my program:
CALENDAR(PANELOBS=21)
ALLOCATE 27//21
Re: PREGRESS
It sounds like what you really want is to keep the data in the original daily form, then do
SET YM = %year(t)*12+%month(t)
linreg(cluster=ym) ...
# ...
That will do an OLS regression (METHOD=POOLED on PREG is computationally identical to that), but with clustered standard errors based upon the month. Newey-West really isn't designed to handle the data with all the gaps that you have.
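For reference, the textbook form of the clustered covariance matrix is sketched below, with the months as clusters. The notation is introduced here for illustration: $X_g$ and $\hat u_g$ are the regressor rows and OLS residuals belonging to month $g$, and $G$ is the number of months. Any finite-sample adjustment that RATS may apply is not shown.

```latex
\hat\beta = (X'X)^{-1} X'y,
\qquad
\widehat{V}_{\text{cluster}}
  = (X'X)^{-1}
    \left( \sum_{g=1}^{G} X_g'\,\hat u_g\,\hat u_g'\,X_g \right)
    (X'X)^{-1}
```

The coefficient estimates are the plain OLS ones. Only the covariance matrix changes: residuals may be correlated in any way within a month, but are treated as uncorrelated across months.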
-
smukherjee01
- Posts: 21
- Joined: Mon Jul 14, 2014 9:58 am
Re: PREGRESS
Thanks.
Exactly, I want to keep the data in the original daily form.
If I do
SET YM = %year(t)*12+%month(t)
linreg(cluster=YM) ...
# ...
then do I still need to do a separate PFORM?
If I don’t need to do PFORM then the final program is something like this:
CALENDAR(D) 2011:10:01
ALL 27//2013:12:30
OPEN DATA "D:\XYZ.XLSX"
DATA(FORMAT=XLSX,ORG=COLUMNS) 2011:10:01 2013:12:30 X Y Z
SET YM = %year(t)*12+%month(t)
INSTRUMENTS CONSTANT X{1} Z{1}
LINREG(CLUSTER=YM,INSTRUMENTS,LWINDOW=NEWEY,LAGS=1) Y
# CONSTANT X Z
Or am I not supposed to use LWINDOW=NEWEY in the above regression?
From what I can see, nothing changes if I run the regression without LWINDOW=NEWEY.
I am not keen on doing METHOD=POOLED on PREG, but suppose I manually take out the missing rows. Would PREG then be valid for what I am trying to achieve? I guess PFORM is compulsory with PREG, though.
Re: PREGRESS
No need for the PFORM. And no. With the clustered standard errors, you don't use Newey-West---they're two distinct ways to deal with the serial correlation in the errors. (If you use both, CLUSTER takes precedence, so the LWINDOW=NEWEY gets ignored).
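Put together, the estimation step from the program above would then look something like this. It is only a sketch: it assumes the daily series X, Y and Z have already been read in under the daily CALENDAR, and it does nothing more than drop the LWINDOW and LAGS options.

```rats
* Month identifier: one distinct value per calendar month
SET YM = %year(t)*12+%month(t)
INSTRUMENTS CONSTANT X{1} Z{1}
* Clustered standard errors by month; LWINDOW and LAGS are left out
LINREG(CLUSTER=YM,INSTRUMENTS) Y
# CONSTANT X Z
```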
-
smukherjee01
- Posts: 21
- Joined: Mon Jul 14, 2014 9:58 am
Re: PREGRESS
Thank you.
I am using the above program. My data start on 01/10/2011, but RATS starts reading data from 10/10/2011. It also does not read the last 3 or 4 observations. Is there any reason for that?
Sometimes I find that it reads more observations than I actually have, if I use many lags as instruments.
The rest appears to be fine right now, apart from the problems mentioned above.
-
smukherjee01
- Posts: 21
- Joined: Mon Jul 14, 2014 9:58 am
Re: PREGRESS
Hello!
I am trying to do a Hausman test, but I always get the redundant restrictions message (the line beginning "## X13" in the output below). My RATS program is included as well. I have not been able to get rid of the message.
OPEN DATA "CXYZ.xlsx"
DATA(FORMAT=XLSX,ORG=COLUMNS) 1 503 INDEX X Y Z
LINREG(CLUSTER=INDEX) Y
#CONSTANT X Z
COMPUTE xxols=%xx,betaols=%beta,seesqols=%sigmasq
INSTRUMENTS CONSTANT X{1 TO 2} Z{1}
LINREG(CLUSTER=INDEX,INSTRUMENTS) Y
#CONSTANT X Z
COMPUTE CVDIFF=seesqols*(%xx-xxols)
TEST(TITLE="Hausman Test",ALL,form=chisquared,vector=betaols,covmat=cvdiff)
Linear Regression - Estimation by Least Squares
With Clustered Standard Error Calculations
Dependent Variable Y
Usable Observations xxx
Degrees of Freedom xxx
Centered R^2 xxx
R-Bar^2 xxx
Uncentered R^2 xxx
Mean of Dependent Variable xxx
Std Error of Dependent Variable xxx
Standard Error of Estimate xxx
Sum of Squared Residuals xxx
Log Likelihood xxx
Durbin-Watson Statistic xxx
Variable Coeff Std Error T-Stat Signif
************************************************************************************
1. Constant xxx xxx xxx xxx
2. X xxx
3. Z xxx
Linear Regression - Estimation by Instrumental Variables
With Clustered Standard Error Calculations
Dependent Variable Y
Usable Observations xxx
Degrees of Freedom xxx
Mean of Dependent Variable xxx
Std Error of Dependent Variable xxx
Standard Error of Estimate xxx
Sum of Squared Residuals xxx
J-Specification(1) 0. xxx
Significance Level of J 0. xxx
Durbin-Watson Statistic xxx
Variable Coeff Std Error T-Stat Signif
************************************************************************************
1. Constant xxx
2. X xxx
3. Z xxx
Hausman Test
## X13. Redundant Restrictions. Using 2 Degrees, not 3
Chi-Squared(2)= xxx or F(2,*)= xxx with Significance Level xxx
Last edited by smukherjee01 on Fri Aug 01, 2014 7:27 pm, edited 1 time in total.
Re: PREGRESS
smukherjee01 wrote: Hello!
I am trying to do HAUSMAN TEST, but I am always getting the error message of redundant restrictions. It's been highlighted in RED below. I am attaching my RATS Program as well. I am not being able to get rid of the error message.
Good. If you didn't, you would have been doing something wrong. We'll have to make it more emphatic, but see point #3 on page UG-95 of the v8 User's Guide. In this case, because the CONSTANT is included in both the regression and the instrument set, the difference between the two estimators is rank two, not three. (Both force the residuals to sum to zero).
BTW, don't scale the difference in covariance matrices by SEESQOLS. The scales are already incorporated when you use the CLUSTER option.
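With that change, the program posted above would look something like the sketch below. It reuses the same series names and instructions as the original program; the only differences are that SEESQOLS is no longer saved or used, and that CVDIFF is the plain difference of the two covariance matrices.

```rats
LINREG(CLUSTER=INDEX) Y
# CONSTANT X Z
* Save the OLS coefficients and their clustered covariance matrix
COMPUTE xxols=%xx,betaols=%beta
INSTRUMENTS CONSTANT X{1 TO 2} Z{1}
LINREG(CLUSTER=INDEX,INSTRUMENTS) Y
# CONSTANT X Z
* With CLUSTER the scale is already included, so take the plain difference
COMPUTE cvdiff=%xx-xxols
TEST(TITLE="Hausman Test",ALL,FORM=CHISQUARED,VECTOR=betaols,COVMAT=cvdiff)
```

The X13 message should still appear with this version, and the test should still use 2 degrees of freedom, for the reason given above.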
-
smukherjee01
- Posts: 21
- Joined: Mon Jul 14, 2014 9:58 am
Re: PREGRESS
Thank You very much. While writing the program for you, I missed one instrument. Apologies for that. I edited it. But your reply matches the corrected program above.
Can I ask you something, please? I saw in one of the topics (maybe in 2011) you suggested the Driscoll Kraay Standard errors program to someone (you also attached the file). I guess he/she was looking to use it for panel data.
I am not doing panel data estimation, but is there any file which shows how I can use Driscoll Kraay for my type of regression (in linreg)? Some studies do tend to use Driscoll Kraay standard errors. I wanted to use it for my case.
-
smukherjee01
- Posts: 21
- Joined: Mon Jul 14, 2014 9:58 am
Re: PREGRESS
TomDoan wrote:No need for the PFORM. And no. With the clustered standard errors, you don't use Newey-West---they're two distinct ways to deal with the serial correlation in the errors. (If you use both, CLUSTER takes precedence, so the LWINDOW=NEWEY gets ignored).
Using LWINDOW=NEWEY let me experiment with different LAGS, which gave me bigger or smaller standard errors. Using the CLUSTER option, I am getting standard errors which are bigger but not big enough. Is there any way I can make my standard errors bigger, for example by adding another option after CLUSTER=YM? As you said, I can't use LWINDOW=NEWEY together with CLUSTER.
Re: PREGRESS
You want to use the @REGPCSE procedure with the option METHOD=PHAC. Note, however, that none of these corrections is designed to produce "larger" standard errors. Often they do, but not always.
-
smukherjee01
- Posts: 21
- Joined: Mon Jul 14, 2014 9:58 am
Re: PREGRESS
Thank you.
I don't have a balanced panel, so I got an error message.
OPEN DATA "XXX.xlsx"
CALENDAR(panel=21,d) 2010:10:03
DATA(FORMAT=XLSX,ORG=COLUMNS) 1 500 INDEX AB CD EF
LINREG(CLUSTER=INDEX) AB
#CONSTANT CD EF
@regpcse(method=phac,lags=1)
-
smukherjee01
- Posts: 21
- Joined: Mon Jul 14, 2014 9:58 am
SUR
Dear,
I want to perform SUR. The individual equations work well, but I get an error message when I run SUR.
OPEN DATA "C:\Users\Example1.xlsx"
DATA(FORMAT=XLSX,ORG=COLUMNS) 1 503 DIS FREV FErr OFQ
SET PFE = FErr
SET PFR = FREV
SET FQTOTAL = (OFQ*50)/100
LINREG(CLUSTER=DIS) PFE
#CONSTANT PFR FQTOTAL
OPEN DATA "C:\Example2.xlsx"
DATA(FORMAT=XLSX,ORG=COLUMNS) 1 554 Total BRLMID
SET LBRLM = log(BRLMID)
DIFF LBRLM / DLBRLM
SET OFV = (Total*50)/100
SET TDM = DLBRLM*100
LINREG(ROBUSTERRORS, LWINDOW=NEWEYWEST, LAGS=0) TDM
# Constant OFV
I want to test whether the coefficient on FQTOTAL in the first equation is equal to the coefficient on OFV in the second equation, so I am using the following commands:
EQUATION FIRSTEQUATION PFE
#CONSTANT PFR FQTOTAL
EQUATION SECONDEQUATION TDM
# Constant OFV
GROUP COMBINING FIRSTEQUATION SECONDEQUATION
SUR(MODEL= COMBINING, EQUATE = ||2,5||)
But I get this error:
## REG13. Singular Regressions - Check for Collinearity among Rows 1 to 2
Can you please tell me where I am going wrong?
Re: PREGRESS
EQUATE operates by position across equations. Use EQUATE=||2|| and arrange the regressions so the second coefficients are the ones that you want to be the same. (If you're really just interested in a test, then you would do a regular SUR and do a RESTRICT instruction---on RESTRICT, you would reference them as coefficients 2 and 5.)
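As a sketch of both routes, using the equation and series names from the post above: the first block moves FQTOTAL into second position so that it lines up with OFV and then uses EQUATE=||2||. The second block is the testing route. It assumes the usual RESTRICT card layout (coefficient numbers on the first card, then the weights followed by the right-hand-side value), which is worth checking against the Reference Manual.

```rats
* FQTOTAL moved into second position so it lines up with OFV
EQUATION FIRSTEQUATION PFE
# CONSTANT FQTOTAL PFR
EQUATION SECONDEQUATION TDM
# CONSTANT OFV
GROUP COMBINING FIRSTEQUATION SECONDEQUATION
* Impose equality of the second coefficient across the two equations
SUR(MODEL=COMBINING,EQUATE=||2||)

* Alternative: estimate without the constraint, then test the equality
SUR(MODEL=COMBINING)
RESTRICT 1
# 2 5
# 1.0 -1.0 0.0
```

With this ordering the stacked coefficients are numbered 1 to 3 for the first equation and 4 to 5 for the second, so FQTOTAL is coefficient 2 and OFV is coefficient 5.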