FORMAT=FORTRAN

This is an unlabeled text format for input and output that uses FORTRAN I/O descriptors. It is available on all versions of RATS, and is provided in case other methods for dealing with a poorly formatted text file are inconvenient. However, we recommend that you first try importing the file into a spreadsheet program and using its line-parsing tools.

RATS Instructions

data(format='Fortran format string')	read series using FORTRAN format
copy(format='Fortran format string')	write series using FORTRAN format
read(format='Fortran format string')	read data using FORTRAN format
write(format='Fortran format string')	write data using FORTRAN format

ORG options are required on DATA and COPY.

Interface Operations

None

Details

FORTRAN format allows you to use the standard FORTRAN I/O formatting codes to indicate the format of the data file. You probably shouldn’t try using it unless you are fairly comfortable with FORTRAN formats (described briefly below), and the other methods above are cumbersome.

With FORTRAN formats you can tell RATS exactly how the data are organized when using DATA or READ, or exactly how you want the data to be formatted when using COPY or WRITE. For example, when used with COPY, FORMAT=FREE uses a very wide field for each number so that it can reasonably handle very different magnitudes. The result is that you can sometimes get a lot of extra digits. This takes up space (so you see fewer numbers at a glance) and masses of digits can be hard to read.

If you are unfamiliar with FORTRAN, a brief description is in order. The basic format elements for real numbers take the forms Fw.d, Ew.d and Gw.d. RATS (which only simulates FORTRAN I/O) also allows you to use the Iw format for input.

Fw.d	For numbers with a fixed decimal place: a total of w positions (counting the digits to the left and the right of the decimal, the decimal point itself, and any minus sign) with d digits to the right of the decimal. Very large and very small numbers will not fit an F format. For example, the number 1000000. cannot be printed as F15.8, because it needs 16 positions.
Ew.d	For numbers in scientific notation. When writing files, this format is useful because it can handle a number of any magnitude, but it can be difficult to pick out large and small values at a single glance. For example, it takes a bit of work to see that the first of 1.343E-02 and 8.547E-03 is the larger.
Gw.d	This is a mixture of the F and E formats. If a number can reasonably be printed with the F format, that is used. All others (very large and very small) are displayed in the E format.
Iw	For integer numbers. If there is a fractional part to the number, it is ignored.
Aw	This is used (by COPY) only for the date strings. It prints a character string, left justified, in a field of width w.
wX	Indicates w spaces.
/	Indicates a skip to the next line.
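The width limit on the F format and the readability point about the E format can be checked with a few lines of Python. This is only an illustration using Python's own number formatting, which is not the RATS or FORTRAN formatter; the variable names are ours.

```python
# How many positions does 1000000. need with 8 digits after the decimal?
value = 1000000.0
text = f"{value:.8f}"        # fixed-point text with 8 decimals
needed = len(text)           # 7 digits + decimal point + 8 decimals
fits_f15_8 = needed <= 15    # F15.8 allows only 15 positions

print(text, needed, fits_f15_8)

# Comparing two values written in scientific notation takes some care:
# the exponents differ, so the mantissas alone are misleading.
first_is_larger = float("1.343E-02") > float("8.547E-03")
print(first_is_larger)
```

The first PRINT shows that the number needs 16 positions, one more than F15.8 provides.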

In the description of a line, the fields are separated by commas. You can prefix a field descriptor by a number of repetitions. For instance, F6.2,F6.2,F6.2 can be shortened to 3F6.2.
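To make the repetition rule concrete, here is a minimal Python sketch that expands repeat counts in a flat format string. The helper names expand_format and field_width are hypothetical, and the sketch assumes no nested parentheses and no slashes; it is not how RATS itself processes the string.

```python
import re

def expand_format(fmt):
    """Expand repeat counts, e.g. '(3F6.2)' -> ['F6.2', 'F6.2', 'F6.2']."""
    fields = []
    for item in fmt.strip().strip("()").split(","):
        item = item.strip().upper()
        # In wX the leading number is a width, not a repeat count.
        if re.fullmatch(r"\d+X", item):
            fields.append(item)
            continue
        m = re.fullmatch(r"(\d*)([FEGIA]\d+(?:\.\d+)?)", item)
        if m is None:
            raise ValueError(f"unsupported descriptor: {item}")
        count = int(m.group(1) or 1)
        fields.extend([m.group(2)] * count)
    return fields

def field_width(descriptor):
    """Width in characters of one expanded descriptor."""
    return int(re.search(r"\d+", descriptor).group())

fields = expand_format("(11X,4F15.7)")
print(fields)
print(sum(field_width(d) for d in fields))  # characters consumed per line
```

Summing the widths is a quick way to check a format against the actual line length in the file.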

The FORMAT Option

The DATA, COPY, READ, and WRITE instructions all support the FORTRAN format. The FORMAT option is the same for all four instructions:

FORMAT="( format string )"

Enclose the format string in quotes, for instance, FORMAT="(11X,4F15.7)". The format string must fit on a single line. If you need an extremely long string, you may want to put the string on its own line:

data(org=col,format= $
"(f8.5,2x,f8.4,2x,f8.5,2x,f8.3,2x,f8.5,2x,f8.5)" ) $
1947:1 1979:4 wage interest stocks mortgage charitable misc

With ORG=ROWS

Each entire series is read with the indicated format. Below is a RATS instruction and its FORTRAN language equivalent:

data(org=rows,format="(11x,4f15.7)") 1 100 gnp m1

read(n,1000) (gnp(i),i=1,100)
read(n,1000) (m1(i) ,i=1,100)
1000 format(11x,4f15.7)

If you have blank lines separating the series, you may have to read the first series with one format, then use a format such as (//(11X,4F15.7)) for the remaining series: this particular format will skip two lines before reading a series.
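The reading order with ORG=ROWS can be sketched in Python as follows. This is an illustration only: the function read_series, the sample values, and the "GNP" label area are all hypothetical, and the code simply slices fixed columns the way the (11X,4F15.7) format describes.

```python
def read_series(lines, n_obs, skip=11, width=15, per_line=4):
    """Collect n_obs values, reading up to per_line fixed-width fields per line."""
    values = []
    for line in lines:
        for k in range(per_line):
            if len(values) == n_obs:
                return values
            start = skip + k * width
            values.append(float(line[start:start + width]))
    return values

# Hypothetical file contents: an 11-character label area, then F15.7 fields.
gnp_values = [1051.2, 1067.4, 1086.1, 1088.6, 1135.2, 1156.3]
lines = []
for i in range(0, len(gnp_values), 4):
    chunk = gnp_values[i:i + 4]
    lines.append("GNP".ljust(11) + "".join(f"{v:15.7f}" for v in chunk))

gnp = read_series(lines, n_obs=6)
print(gnp)
```

The series runs across as many lines as it needs, four values per line, and reading stops once the requested number of entries has been collected.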

With ORG=COLS

The equivalent of a separate FORTRAN read statement is used for each observation. Below is a sample RATS instruction and its FORTRAN equivalent. (You must delete any header lines at the top of the file.)

data(org=col,format="(f9.4,2x,f10.3,2x,f9.4)") 1 100 gnp m1 ipd

do 100 i=1,100
read(n,1000) gnp(i),m1(i),ipd(i)
100 continue
1000 format(f9.4,2x,f10.3,2x,f9.4)
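For comparison, the same per-observation reading can be sketched in Python by slicing each line at the column positions implied by (f9.4,2x,f10.3,2x,f9.4). The sample lines and the function name read_observation are hypothetical; this is a sketch of the idea, not of RATS internals.

```python
def read_observation(line):
    """Split one line into three fields: F9.4, 2X, F10.3, 2X, F9.4."""
    gnp = float(line[0:9])     # columns 1-9
    m1 = float(line[11:21])    # columns 12-21, after 2 skipped positions
    ipd = float(line[23:32])   # columns 24-32, after 2 more skipped positions
    return gnp, m1, ipd

# Hypothetical data lines laid out to match the format.
lines = [
    f"{1203.7:9.4f}  {251.4:10.3f}  {1.0235:9.4f}",
    f"{1218.2:9.4f}  {255.9:10.3f}  {1.0312:9.4f}",
]

observations = [read_observation(line) for line in lines]
for obs in observations:
    print(obs)
```

Each line of the file supplies exactly one entry to each series, which is why any header lines have to be removed first.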

Mixed Character and Numeric Data

FORTRAN format codes can also be useful if you have a text file containing both numeric data and text labels. You cannot store character information in SERIES variables, or use DATA to read in character/label information, but if the data are well-organized, it may be possible to read the numeric data into series and the text data into LABEL variables. You will probably need to read the file twice: once to get the numeric data and once to get the character data.

Suppose you have the following data file:

1.045   ME     2.0   1.0
2.3210  MS     2.0   2.0
1.8930  MN     2.0   3.0

You could use the following program to read this data. First, a READ instruction using a FORMAT option skips the numeric data and reads only the two-character label. The data file is rewound (positioned to the top), and then the data are read using a DATA instruction which skips over the character information to get the numbers. Finally, a simple loop is used to display formatted output:

all 3
declare vect[labels] states(3)
open data mixed.dat
read(unit=data,format="(8x,a2)") states
rewind data
data(format="(f6.4,6x,2f6.1)",org=col) / x y z
do row=1,3
display states(row) @10 x(row) y(row) z(row)
end
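The two-pass idea can be mimicked in Python to see which columns each format picks up. The sketch assumes the file is laid out in fixed columns that match the two formats above (number in columns 1-6, label in columns 9-10, two numbers in columns 13-24); that layout and the variable names are our assumptions.

```python
# Assumed fixed-column layout of the three data lines.
lines = [
    "1.045   ME     2.0   1.0",
    "2.3210  MS     2.0   2.0",
    "1.8930  MN     2.0   3.0",
]

# First pass, like (8X,A2): skip 8 positions, take a 2-character label.
states = [line[8:10] for line in lines]

# Second pass, like (F6.4,6X,2F6.1): one number, skip 6, two more numbers.
x = [float(line[0:6]) for line in lines]
y = [float(line[12:18]) for line in lines]
z = [float(line[18:24]) for line in lines]

for row in range(3):
    print(f"{states[row]:<9}{x[row]:10.4f}{y[row]:6.1f}{z[row]:6.1f}")
```

Note that 6X in the second format covers the label and the blanks around it, so the character data never reach the numeric fields.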

Missing Data

You can use the missing-value codes (such as NA) used in other text files. You can also leave a blank area where a missing value should be and use BLANK=MISSING on the DATA instruction. This interprets the blank area as a missing value, rather than following the standard practice of reading it as zero.
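The difference between the two treatments of a blank field can be sketched in Python. The function parse_field and its blank_missing argument are hypothetical, and NaN is used here only as a stand-in for a missing value.

```python
import math

def parse_field(text, blank_missing=True):
    """Convert one fixed-width field to a number.

    A blank field becomes missing (NaN) when blank_missing is True,
    and zero otherwise. The code NA is always treated as missing.
    """
    text = text.strip()
    if text == "":
        return math.nan if blank_missing else 0.0
    if text.upper() == "NA":
        return math.nan
    return float(text)

print(parse_field("     12.5"))
print(parse_field("         "))                       # blank read as missing
print(parse_field("         ", blank_missing=False))  # blank read as zero
print(parse_field("       NA"))
```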

Using FORTRAN formats: An Example

The following data set is used in Spreadsheet and Delimited Text. We show here how to use FORTRAN format to simplify the process of reading the data.

FIRM A
1996             11.3     11.6     10.9     12.3
1997             13.0     12.8     11.5     12.5
1998             12.9     13.0     13.2     13.6

FIRM B
1996             22.5     21.9     24.3     25.6
1997             21.9     21.8     22.6     23.5
1998             22.5     25.0     24.5     25.4

The data rows can be read with (12X,4F9.1). The problem is the other rows: there is only one row preceding the first series (the row with “FIRM A”), but two (the blank plus “FIRM B”) preceding the second. The simplest way to handle this is to add a blank line at the beginning of the file and use

cal(q) 1996:1
all 1998:4
open data test.dat
data(format="(//(12x,4f9.1))",org=rows) / firma firmb
print / firma firmb

Note that we use PRINT to verify that the data have been read properly. How did we come up with '(//(12X,4F9.1))'? First, the two slashes tell RATS to skip two lines before each block of data. Next, for a data block where the numbers are regularly spaced (as they are here), just determine the position of the last digit in the first data value on the line (column 21 in this case) and the field width of the data (here, 9, which is the distance from the end of one number to the end of the next). The 12 in 12X is just the number of leading positions (21–9=12) to be skipped.

The 4F9.1 indicates that the fields are each nine characters wide (including leading blanks) with 1 digit after the decimal point, and that this format is repeated 4 times per row. After skipping two lines, RATS will read data into FIRMA using the 12X,4F9.1 format until it has read the requested number of data points (12 in this case). It then skips two more lines and again uses the 12X,4F9.1 format to read 12 data points into FIRMB.
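The column arithmetic and the skip-two-lines pattern can be checked with a short Python sketch. It rebuilds the file (with the extra blank line at the top) under the assumption that the first value on each row ends in column 21 and each field is 9 wide, as described above. The helper names data_row and read_blocks are hypothetical.

```python
LAST_COL_FIRST_VALUE = 21                    # column of the last digit of the first value
FIELD_WIDTH = 9                              # distance between the ends of adjacent values
SKIP = LAST_COL_FIRST_VALUE - FIELD_WIDTH    # leading positions covered by 12X

def data_row(year, values):
    """Build one data row: year in the skipped area, then four F9.1 fields."""
    return f"{year:<{SKIP}}" + "".join(f"{v:{FIELD_WIDTH}.1f}" for v in values)

file_lines = [
    "",                                      # blank line added at the top of the file
    "FIRM A",
    data_row(1996, [11.3, 11.6, 10.9, 12.3]),
    data_row(1997, [13.0, 12.8, 11.5, 12.5]),
    data_row(1998, [12.9, 13.0, 13.2, 13.6]),
    "",
    "FIRM B",
    data_row(1996, [22.5, 21.9, 24.3, 25.6]),
    data_row(1997, [21.9, 21.8, 22.6, 23.5]),
    data_row(1998, [22.5, 25.0, 24.5, 25.4]),
]

def read_blocks(lines, n_series, n_obs, skip_lines=2, per_line=4):
    """Mimic (//(12X,4F9.1)): skip two lines, then read n_obs values per series."""
    pos = 0
    result = []
    for _ in range(n_series):
        pos += skip_lines
        values = []
        while len(values) < n_obs:
            line = lines[pos]
            pos += 1
            for k in range(per_line):
                start = SKIP + k * FIELD_WIDTH
                values.append(float(line[start:start + FIELD_WIDTH]))
        result.append(values)
    return result

firma, firmb = read_blocks(file_lines, n_series=2, n_obs=12)
print(SKIP)
print(firma)
print(firmb)
```

Because both blocks are now preceded by exactly two non-data lines, one skip count works for both series.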

An alternative if you either can’t, or don’t want to, alter the original file is to use separate instructions so that you can apply different formats in succession:

data(format="(/(12x,4f9.1))",org=rows) / firma
data(format="(//(12x,4f9.1))",org=rows) / firmb