Changing Data Frequencies

The DATA instruction can automatically convert data from one frequency to another. For this to work:

•the source file must contain dates that RATS can process, or you must be able to provide the source date information yourself.

•you must set the CALENDAR instruction to the desired (new) frequency. You can do this by typing in CALENDAR directly, by using the Data/Graphics>Calendar operation, or by setting the “Target Dates” field in the Data Wizard.

Given a mismatch between the frequency of the data on the file and the CALENDAR frequency, RATS will automatically compact or expand the data to match the CALENDAR setting. This works for any of the file formats for which RATS can process dates (any of the Time Series Database formats and most of the Labeled Table formats).

To compact from a higher frequency to a lower frequency, just follow these steps:

1.As noted above, make sure the source data file contains valid date information at the original (higher) frequency. See "No Dates on Source File" for tips if you don’t already have date information on a file.

2.If you’re using the Data Wizard, just make sure that the “File Dates” field accurately reflects the higher frequency of the source data. Then, set the “Target Dates” field to the desired lower frequency. Click on OK to read the data.

If you are typing in commands directly, set the CALENDAR to the target (lower) frequency. For example, if you are compacting monthly data to a quarterly frequency, specify a quarterly CALENDAR. Then use OPEN DATA and DATA to read the file.

RATS will automatically compact the data to match the CALENDAR frequency using the method specified by the COMPACT or SELECT option, or the “Compact by” field on the Data Wizard. The default is COMPACT=AVERAGE.

COMPACT and SELECT

The COMPACT and SELECT options allow you to select from several compaction methods. Note that the two options are mutually exclusive. If you try to use both, RATS will honor the SELECT choice. The choices are:

compact=[average]/sum/geometric/first/last/maximum/minimum

AVERAGE	simple average of the subperiods
SUM	sum of the subperiods
GEOMETRIC	geometric average of the subperiods
FIRST/LAST	first or last subperiod entry within each period, respectively
MAXIMUM/MINIMUM	maximum value or minimum value among the subperiods of each period

select=subperiod to select

compacts data by selecting a single subperiod from within each period.

Suppose you have a quarterly CALENDAR and you want to read a monthly data series. DATA with SELECT=3 will read in the third month from each quarter (that is, the March, June, September, and December observations from each year). With the default option of COMPACT=AVERAGE, each quarterly value will be the average of the three months which make up the quarter.
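The arithmetic behind these choices can be sketched outside RATS. The following Python fragment is only an illustration for evenly nested frequencies (three months per quarter). The function name compact and the sample values are hypothetical, and it is not a description of how RATS is implemented.

```python
import math

def compact(values, per_period, method="average", select=None):
    """Collapse consecutive groups of `per_period` subperiods into one value each.

    Illustrates the COMPACT/SELECT choices for evenly nested frequencies
    (for example, 3 months per quarter). Assumes complete groups.
    """
    reducers = {
        "average": lambda g: sum(g) / len(g),
        # Geometric average requires strictly positive values.
        "geometric": lambda g: math.exp(sum(math.log(x) for x in g) / len(g)),
        "sum": sum,
        "first": lambda g: g[0],
        "last": lambda g: g[-1],
        "maximum": max,
        "minimum": min,
    }
    groups = [values[i:i + per_period] for i in range(0, len(values), per_period)]
    if select is not None:
        # SELECT wins over COMPACT; subperiods are numbered from 1.
        return [g[select - 1] for g in groups]
    return [reducers[method](g) for g in groups]

# Hypothetical monthly values for January through June (two quarters).
monthly = [10.0, 12.0, 14.0, 20.0, 20.0, 26.0]

quarterly_avg = compact(monthly, 3)                 # [12.0, 22.0]
quarterly_sum = compact(monthly, 3, method="sum")   # [36.0, 66.0]
quarterly_sel = compact(monthly, 3, select=3)       # [14.0, 26.0] (March, June)
```

Note that SELECT=3 and COMPACT=LAST give the same result here only because the third month is also the last month of each quarter.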

It is more complicated when you move between dissimilar frequencies: weekly to monthly, for instance. SELECT=2 will select the second full week within the month. RATS classifies weekly data by the end of the week, so in the data set below the week ending March 13, not the week ending March 6, is the first full week of March. With the other methods, a week that straddles two months is split between them by days: RATS will give 6/7 of the March 6 value to March, and 1/7 to February. If the input data are:

Week ending:	March 6	March 13	March 20	March 27	April 3
	15.0	11.0	13.0	18.0	20.0

the value for March (with COMPACT=AVERAGE) is:

\((6/7 \times 15.0 + 11.0 + 13.0 + 18.0 + 4/7 \times 20.0)/(6/7 + 3 + 4/7) = 14.97\)
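As a check on this calculation, the weights can be derived by counting how many days of each week fall in March. The Python sketch below is illustrative only: the helper name days_in_month is hypothetical, and the year is an assumption, since the example does not state one (the 6/7 and 4/7 shares are the same in any year).

```python
from datetime import date, timedelta

def days_in_month(week_end, year, month):
    """Count how many of the 7 days in the week ending `week_end` fall in (year, month)."""
    days = (week_end - timedelta(days=k) for k in range(7))
    return sum(1 for d in days if (d.year, d.month) == (year, month))

YEAR = 2010  # assumed; any year gives the same weights for these dates

# (week-ending date, value) pairs from the table above.
weeks = [
    (date(YEAR, 3, 6), 15.0),
    (date(YEAR, 3, 13), 11.0),
    (date(YEAR, 3, 20), 13.0),
    (date(YEAR, 3, 27), 18.0),
    (date(YEAR, 4, 3), 20.0),
]

# Day counts 6, 7, 7, 7, 4 are the weights 6/7, 1, 1, 1, 4/7 scaled by 7.
weights = [days_in_month(end, YEAR, 3) for end, _ in weeks]
march_average = sum(w * v for w, (_, v) in zip(weights, weeks)) / sum(weights)
print(round(march_average, 2))  # 14.97
```

The day counts sum to 31, the number of days in March, so the result is the average daily value when each week's value is applied to every day of that week.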

Expanding Data to Higher Frequencies

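The steps are the same as those described above for compacting, except that the CALENDAR (or the “Target Dates” field) is set to a frequency higher than that of the source data.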

If RATS detects that the data on the file being read are at a lower frequency than the current CALENDAR setting or Data Wizard target frequency, it will automatically expand the data to the higher frequency by setting each subperiod equal to the value for the full period. For example, if you set a quarterly CALENDAR and read in annual data, RATS will set the value of each quarter in a given year to the annual value for that year.
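The result of this kind of expansion is a step function. A minimal Python sketch of the idea, using a hypothetical helper expand and made-up annual values (not RATS code):

```python
def expand(values, factor):
    """Repeat each low-frequency value `factor` times, producing a step function."""
    return [v for v in values for _ in range(factor)]

annual = [100.0, 104.0]        # hypothetical annual values for two years
quarterly = expand(annual, 4)
# [100.0, 100.0, 100.0, 100.0, 104.0, 104.0, 104.0, 104.0]
```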

For dissimilar frequencies, the value of a period that spans two periods of the lower-frequency data is a weighted average of the two values. For instance, in moving from monthly to weekly, the value for the week ending March 6 will be:

\(1/7 \times \text{February} + 6/7 \times \text{March}\)
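The same weighting can be written as a day-by-day average. In the Python sketch below, the function name weekly_from_monthly, the monthly values, and the year are all assumptions made for illustration. It is not RATS code.

```python
from datetime import date, timedelta

def weekly_from_monthly(week_end, monthly):
    """Average the monthly value over the 7 days of the week ending `week_end`.

    `monthly` maps (year, month) to that month's value.
    """
    days = [week_end - timedelta(days=k) for k in range(7)]
    return sum(monthly[(d.year, d.month)] for d in days) / 7

# Hypothetical monthly values; the year is arbitrary.
monthly = {(2010, 2): 70.0, (2010, 3): 140.0}

value = weekly_from_monthly(date(2010, 3, 6), monthly)
# 1/7 * 70.0 + 6/7 * 140.0 = 130.0
```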

For a more complex interpolation, you would first read the data as described above to expand to the higher frequency, and then apply the desired interpolation routine to the expanded data. RATS includes several interpolation procedures you can use, but we primarily recommend the @DISAGGREGATE procedure, as it provides options for choosing from several models and techniques.

The following example reads quarterly GDP data into a monthly CALENDAR. It uses @DISAGGREGATE to produce a monthly series from the “step” function that DATA generates. Since GDP is ordinarily quoted in annual rates, we want to maintain the average at the higher frequency; if we had a series that was quoted in totals (sales figures, for instance), we would use MAINTAIN=SUM.

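* Monthly workspace; the quarterly data on the file are expanded on input
calendar(m) 1947
open data haversample.rat
* Dates are monthly here, so 2006:12 is December 2006
data(format=rats) 1947:1 2006:12 gdp
set trend = t
* Smooth the step function into a monthly series, keeping quarterly averages
@disaggregate(factor=3,model=loglin,maintain=average) gdp / gdpm
# trend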
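The MAINTAIN choice can be read as a constraint on the output series. The Python sketch below is not the @DISAGGREGATE algorithm and not RATS code. The checker maintains and all of the numbers are hypothetical. It only tests whether a higher-frequency series reproduces the lower-frequency values under an "average" or a "sum" rule, on the assumption that this is what each MAINTAIN setting is meant to preserve.

```python
def maintains(low, high, factor, rule="average", tol=1e-9):
    """Return True if each block of `factor` high-frequency values reproduces
    the matching low-frequency value under the rule ("average" or "sum")."""
    if len(high) != factor * len(low):
        return False
    for i, target in enumerate(low):
        block = high[i * factor:(i + 1) * factor]
        total = sum(block)
        got = total / factor if rule == "average" else total
        if abs(got - target) > tol:
            return False
    return True

quarterly = [300.0, 330.0]                               # hypothetical values
step = [300.0] * 3 + [330.0] * 3                         # step-function expansion
smooth = [295.0, 300.0, 305.0, 320.0, 330.0, 340.0]      # a smoother candidate
totals = [100.0, 100.0, 100.0, 110.0, 110.0, 110.0]      # quarterly total split evenly

step_keeps_average = maintains(quarterly, step, 3)               # True
smooth_keeps_average = maintains(quarterly, smooth, 3)           # True
step_keeps_sum = maintains(quarterly, step, 3, rule="sum")       # False
totals_keep_sum = maintains(quarterly, totals, 3, rule="sum")    # True
```

The step function already preserves quarterly averages, but it does not preserve quarterly totals, which is why a series quoted in totals needs different treatment.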

No Dates on Source File?

If you have a data set that you want to compact or expand, but your data file doesn’t include date information, you can either

•add dates to the original file directly (see "Adding Dates to a Spreadsheet" for suggestions if using a spreadsheet), or

•use the CALENDAR option on DATA to describe the date scheme of the source file. You create this scheme beforehand using the CALENDAR instruction with the SAVE option.

For example, suppose the quarterly GDP data from the previous example were provided in a plain text file which we’ll call gdpnodates.txt. You know that the data are quarterly, and start in 1947Q1, but the text file doesn’t contain any date information. The following defines and saves a CALENDAR scheme for the quarterly data, then reads the data into a workspace with a monthly CALENDAR. It then uses @DISAGGREGATE to take the crudely expanded data to a better estimate of monthly GDP.

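* Save a quarterly scheme that describes the dates of the source file
calendar(q,save=q1947cal) 1947
open data gdpnodates.txt
* Set the workspace to the monthly target frequency
calendar(m) 1947
data(format=free,org=columns,calendar=q1947cal) 1947:1 2006:12 gdp
set trend = t
@disaggregate(factor=3,model=loglin,maintain=average) gdp / gdpm
# trend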