

Multi-dimensional Arrays

A multi-dimensional array is an array in which an element is identified by a sequence of indices, not a single index. For example, a two-dimensional array requires two indices. The usual way (in most languages, including awk) to refer to an element of a two-dimensional array named grid is with grid[x,y].

Multi-dimensional arrays are supported in awk through concatenation of indices into one string. awk converts the indices into strings (see section Conversion of Strings and Numbers) and concatenates them, with a separator between them. The result is a single string that describes the values of the separate indices, and this combined string is used as a single index into an ordinary, one-dimensional array. The separator used is the value of the built-in variable SUBSEP.

For example, suppose we evaluate the expression foo[5,12]="value" when the value of SUBSEP is "@". The numbers 5 and 12 are converted to strings and concatenated with an `@' between them, yielding "5@12"; thus, the array element foo["5@12"] is set to "value".
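The following is a minimal sketch of that example, not a definitive test. The shell variable result and the loop variable key are names chosen here for illustration. Scanning the array shows the single combined index that awk actually stored:

```shell
result=$(awk 'BEGIN {
     SUBSEP = "@"            # override the default separator for this demonstration
     foo[5, 12] = "value"    # stored under the combined index "5@12"
     for (key in foo)        # the array holds exactly one element
          print key, foo[key]
}')
echo "$result"
```

Run as written, this should print the combined index followed by the stored value.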

Once the element's value is stored, awk has no record of whether it was stored with a single index or a sequence of indices. The two expressions foo[5,12] and foo[5 SUBSEP 12] always have the same value.
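A short sketch can confirm this equivalence with the default SUBSEP. The names result, key, and n are illustrative assumptions. The program stores an element with a sequence of indices, reads it back with an explicitly concatenated index, and then counts the elements to show that only one exists:

```shell
result=$(awk 'BEGIN {
     foo[5, 12] = "value"
     # Both subscripts build the same combined string index.
     if (foo[5 SUBSEP 12] == foo[5, 12])
          print "same element"
     n = 0
     for (key in foo)        # count the elements in the array
          n++
     print n
}')
echo "$result"
```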

The default value of SUBSEP is the string "\034", which contains a nonprinting character that is unlikely to appear in an awk program or in the input data.

An unlikely character is a useful choice because index values that contain a string matching SUBSEP produce combined strings that are ambiguous. Suppose that SUBSEP were "@"; then foo["a@b", "c"] and foo["a", "b@c"] would be indistinguishable because both would actually be stored as foo["a@b@c"]. Because SUBSEP is "\034", such confusion can arise only when an index contains the character with ASCII code 034 (octal; decimal 28), which is a rare event.
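The collision described here can be reproduced with a small sketch. The names result, key, and n are illustrative assumptions. With SUBSEP set to "@", the second assignment overwrites the first because both index sequences combine to the same string:

```shell
result=$(awk 'BEGIN {
     SUBSEP = "@"
     foo["a@b", "c"] = "first"
     foo["a", "b@c"] = "second"   # same combined index "a@b@c", so this overwrites
     n = 0
     for (key in foo)             # count the elements in the array
          n++
     print n, foo["a@b@c"]
}')
echo "$result"
```

Only one element remains, and it holds the value from the second assignment.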

You can test whether a particular index sequence exists in a "multi-dimensional" array with the same operator, in, that is used for one-dimensional arrays. Instead of a single index as the left-hand operand, write the whole sequence of indices, separated by commas, in parentheses:

(subscript1, subscript2, ...) in array
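As a brief sketch of this test, the program below stores one element and then checks two index sequences. The array name grid and the variables result, key, and n are illustrative assumptions. The final count shows that the array still holds only the element that was explicitly stored:

```shell
result=$(awk 'BEGIN {
     grid[1, 2] = "x"
     if ((1, 2) in grid)
          print "(1,2) present"
     if (!((2, 1) in grid))
          print "(2,1) absent"
     n = 0
     for (key in grid)       # count the elements in the array
          n++
     print n
}')
echo "$result"
```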

The following example treats its input as a two-dimensional array of fields; it rotates this array 90 degrees clockwise and prints the result. It assumes that all lines have the same number of elements.

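awk '{
     # Track the widest record and the number of records read so far.
     if (max_nf < NF)
          max_nf = NF
     max_nr = NR
     # Store each field under the combined index (column, row).
     for (x = 1; x <= NF; x++)
          vector[x, NR] = $x
}

END {
     # Each input column becomes an output row, read from the last record up.
     for (x = 1; x <= max_nf; x++) {
          for (y = max_nr; y >= 1; --y)
               printf("%s ", vector[x, y])
          printf("\n")
     }
}'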

When given the input:

1 2 3 4 5 6
2 3 4 5 6 1
3 4 5 6 1 2
4 5 6 1 2 3

it produces:

4 3 2 1
5 4 3 2
6 5 4 3
1 6 5 4
2 1 6 5
3 2 1 6

