
How Input is Split into Records

The awk language divides its input into records and fields. Records are separated by a character called the record separator. By default, the record separator is the newline character, defining a record to be a single line of text.

To separate records with a different character, change the built-in variable RS. The value of RS is a string that says how to separate records; the default value is "\n", the string containing just a newline character. This is why records are, by default, single lines.

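RS can have any string as its value, but only the first character of the string is used as the record separator; the other characters are ignored. RS is exceptional in this regard; awk uses the full value of all its other built-in variables. (Some later awk implementations, including newer versions of gawk, instead treat a multi-character RS as a regular expression, so portable programs should give RS a single character.)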

You can change the value of RS in the awk program with the assignment operator, `=' (see section Assignment Expressions). The new record-separator character should be enclosed in quotation marks to make a string constant. Often the right time to do this is at the beginning of execution, before any input has been processed, so that the very first record will be read with the proper separator. To do this, use the special BEGIN pattern (see section BEGIN and END Special Patterns). For example:

awk 'BEGIN { RS = "/" } ; { print $0 }' BBS-list

changes the value of RS to "/", before reading any input. This is a string whose first character is a slash; as a result, records are separated by slashes. Then the input file is read, and the second rule in the awk program (the action with no pattern) prints each record. Since each print statement adds a newline at the end of its output, the effect of this awk program is to copy the input with each slash changed to a newline.
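
To try this without the `BBS-list' file, a minimal sketch is to pipe a short string into the same program. The input text here is made up for illustration, and the shell variable name `out' is only a convenience:

```shell
# Hypothetical inline input standing in for the BBS-list file.
# RS is set in BEGIN, so even the first record is split at slashes.
out=$(printf 'alpha/beta/gamma' | awk 'BEGIN { RS = "/" } { print $0 }')
printf '%s\n' "$out"
```

This should print `alpha', `beta' and `gamma' on three separate lines, one per record.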

Another way to change the record separator is on the command line, using the variable-assignment feature (see section Invoking awk):

awk '{ print $0 }' RS="/" BBS-list

This sets RS to `/' before processing `BBS-list'.
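
The same experiment can be sketched with a throwaway file. The temporary file created with `mktemp' and its contents are assumptions made for this illustration, not part of the original example:

```shell
# Hypothetical sample file standing in for BBS-list.
tmp=$(mktemp)
printf 'alpha/beta/gamma' > "$tmp"

# RS="/" is an ordinary command-line argument; awk performs the
# assignment when it reaches that argument, just before opening "$tmp".
cmdline_out=$(awk '{ print $0 }' RS="/" "$tmp")
printf '%s\n' "$cmdline_out"

rm -f "$tmp"
```

Because the assignment takes effect only when awk reaches it in the argument list, it must come before the name of the file it is meant to affect.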

Reaching the end of an input file terminates the current input record, even if the last character in the file is not the character in RS.
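
One way to see this is to count the records in input that does not end with the separator. The following is a small sketch with invented input; the final record, `three', has no trailing slash:

```shell
# The input has only two slashes and no trailing separator,
# yet the end of input closes the last record, so NR ends up at 3.
count=$(printf 'one/two/three' | awk 'BEGIN { RS = "/" } END { print NR }')
printf '%s\n' "$count"
```

This should print `3'.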

The empty string, "" (a string of no characters), has a special meaning as the value of RS: it means that records are separated only by blank lines. See section Multiple-Line Records, for more details.
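
As a brief illustration, the sketch below feeds awk four lines of invented text with one blank line in the middle. With RS set to the empty string, the two lines on each side of the blank line should form a single record:

```shell
# Two groups of lines separated by one blank line.
# With RS = "", each group is read as one record.
para_out=$(printf 'a\nb\n\nc\nd\n' | awk 'BEGIN { RS = "" } { print NR ": " $1 }')
printf '%s\n' "$para_out"
```

This should print `1: a' and `2: c', showing that only two records were read.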

The awk utility keeps track of the number of records that have been read so far from the current input file. This value is stored in a built-in variable called FNR. It is reset to zero when a new file is started. Another built-in variable, NR, is the total number of input records read so far from all files. It starts at zero but is never automatically reset to zero.
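
The difference between the two counters is easiest to see with more than one input file. This sketch uses two small temporary files whose names and contents are assumptions made for the example:

```shell
# Two hypothetical input files: two lines, then three lines.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/first"
printf 'c\nd\ne\n' > "$dir/second"

# Print both counters for every record.
counts=$(awk '{ print FNR, NR }' "$dir/first" "$dir/second")
printf '%s\n' "$counts"

rm -rf "$dir"
```

The output should be `1 1', `2 2', `1 3', `2 4', `3 5': FNR starts over at the second file, while NR keeps counting.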

If you change the value of RS in the middle of an awk run, the new value is used to delimit subsequent records, but the record currently being processed (and records already processed) are not affected.
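
A small sketch of this behavior, using invented input: the program below assigns RS in its only rule, so the first record is still read with the default newline separator and every later record is split at slashes.

```shell
# Record 1 is read while RS is still "\n", so it is the whole first line.
# The assignment then applies to the rest of the input.
mid_out=$(printf 'a b\nc/d/e' | awk '{ print NR ": " $0; RS = "/" }')
printf '%s\n' "$mid_out"
```

This should print `1: a b', `2: c', `3: d' and `4: e'. The first record is not split again after RS changes.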
