reading data when list of variables is huge
Hi,
I am trying to do a principal component analysis (PCA). There are around 103 commodities, but the Series Window shows only up to the first 70. How do I ensure that all variables are included before I perform the PCA? RATS is not reading the entire list of variables.
cal(m) 1981:4
open data pca.xls
data(format=xls,org=cols) 1981:4 2014:9 AL PR FA FG CR RI WH JW BJ M BL RG PU G AR MN MA UR FV V PO SP ON TA GN PEA TA CA FR BA MNG AP OR CSHW CNT PAP GRP MLK EMF EG FIS MUT CHK PRK SP BP CH TU CM DRGN BNT CUM GRL OFA TEA COF NFART FP MNPR FDP DP BU GH PM CANN CF GRNM MD SO AT WB BK CK BRD SKG SU KH GU SL SC EO V GNO MST GR CTN RBO OILC MOC GROC CTO TCP TLF CPWD OFP BT WN ML SCW TXT
Also, I am using RATS for PCA for the first time. This is how I am doing it; I hope it is correct.
vcv(center,matrix=r)
#AL PR FA FG CR RI WH JW BJ M BL RG PU G AR MN MA UR FV V PO SP ON TA GN PEA TA CA FR BA MNG AP OR CSHW CNT PAP GRP MLK EMF EG FIS MUT CHK PRK SP BP CH TU CM DRGN BNT CUM GRL OFA TEA COF NFART FP MNPR FDP DP BU GH PM CANN CF GRNM MD SO AT WB BK CK BRD SKG SU KH GU SL SC EO V GNO MST GR CTN RBO OILC MOC GROC CTO TCP TLF CPWD OFP BT WN ML SCW TXT
@prinfactors(print) r
@prinfactors(print,values=evalues) %cvtocorr(r)
set eigen 1 2 = evalues(t)
graph(style=symbols,vlabel="Eigenvalue",hlabel="Component",nodates)
#eigen
Re: reading data when list of variables is huge
Put $ at the end of a line to continue an instruction that is too long to fit on one line. See Section 1.5.6 of the Introduction.
Depending upon what you want to do, you may want the @PRINCOMP procedure rather than @PRINFACTORS.
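As a rough sketch of the $ continuation, the DATA instruction and the VCV supplementary card from the original post could be split across lines as below. Only the first few series names are shown to keep the example short, and where the lines are broken is an arbitrary choice; the full list would be continued the same way.

```
cal(m) 1981:4
open data pca.xls
* A "$" at the end of a line continues the instruction on the next line.
* Only the first 26 names from the original list are shown here.
data(format=xls,org=cols) 1981:4 2014:9 AL PR FA FG CR RI WH JW BJ M $
   BL RG PU G AR MN MA UR FV V $
   PO SP ON TA GN PEA
* The supplementary card (the "#" line) is continued the same way.
vcv(center,matrix=r)
# AL PR FA FG CR RI WH JW BJ M $
  BL RG PU G AR MN MA UR FV V $
  PO SP ON TA GN PEA
```

The last line of each instruction has no $, which is what tells RATS that the list ends there.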
Re: reading data when list of variables is huge
Thanks a ton. This is a great help. If I put the $ sign, RATS reads the entire data set.
Also, I was trying the @PRINCOMP procedure. This is what I did to get the first principal component:
cal(m) 1981:4
open data pca2.xls
data(format=xls,org=cols) 1981:4 2014:9 list of variables
* dec vect[series] pcfoodprices
vcv(center,matrix=r)
#list of variables
@princomp(corr, ncomp=1) 1981:4 2014:9 pcfoodprices
#list of variables
@princomp(print)
If I run this, I get PCFOODPRICES(1) in the Series Window along with the list of series, and when I click on it, I get a value of 0.596221. Is that the value of the first principal component?
I am new to RATS, which is why I have basic questions.
Thank you so much. This is a great help indeed.
Re: reading data when list of variables is huge
print / pcfoodprices(1)
should give you the first principal component across the full range of data.
Re: reading data when list of variables is huge
Thank you so much. It worked. This is a great help.
Many, many thanks.
Re: reading data when list of variables is huge
Dear Tom,
I am trying to write a paper on the effectiveness of monetary policy. My Y is the short-term interest rate (the operational target of the central bank, denoted R). X is a large matrix of data, and X1 is a large matrix of slow-moving variables. Once I have applied @PRINCOMP to the slow-moving and the overall variables and obtained the number of components I am looking for, say 3 in this case, how can I do the following?
1. See how much of the total variation in the data each component explains?
2. Suppose we have extracted principal components from the slow-moving and the overall variables, call them C-slow and C-overall, with 3 components each, and Y has 3 series of interest, say Y = (IPI, P, R). To remove the dependence, do I have to run the following regression?
C-overall-1st = C-slow-1st + C-slow-2nd + C-slow-3rd + B*Y + et
That is, does this regression have to be run for each component, and is the adjusted component then C-overall-1st - B*Y?
Regards,
Ateeb
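For reference on question 1, the usual textbook measure is the share of each eigenvalue in the sum of all eigenvalues of the covariance (or correlation) matrix being decomposed. The symbols below are illustrative assumptions rather than RATS names: \lambda_k is the k-th largest eigenvalue and N is the number of series.

```latex
\[
  \text{share}_k = \frac{\lambda_k}{\sum_{j=1}^{N} \lambda_j},
  \qquad
  \text{cumulative share}_K = \frac{\sum_{k=1}^{K} \lambda_k}{\sum_{j=1}^{N} \lambda_j}
\]
```

When the components are taken from a correlation matrix, the eigenvalues sum to N, so each share is simply \lambda_k / N.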