The awk language has one-dimensional arrays for storing groups
of related strings or numbers.
Every awk array must have a name. Array names have the same
syntax as variable names; any valid variable name would also be a valid
array name. But you cannot use one name in both ways (as an array and
as a variable) in one awk program.
Arrays in awk superficially resemble arrays in other programming
languages, but there are fundamental differences. In awk, you
don't need to specify the size of an array before you start to use it.
Additionally, any number or string in awk may be used as an
array index.
In most other languages, you have to declare an array and specify how many elements or components it contains. In such languages, the declaration causes a contiguous block of memory to be allocated for that many elements. An index in the array must be a nonnegative integer; for example, the index 0 specifies the first element in the array, which is actually stored at the beginning of the block of memory. Index 1 specifies the second element, which is stored in memory right after the first element, and so on. It is impossible to add more elements to the array, because it has room for only as many elements as you declared.
A contiguous array of four elements might look like this,
conceptually, if the element values are 8, "foo",
"" and 30:
+---------+---------+--------+---------+
|    8    |  "foo"  |   ""   |    30   |    value
+---------+---------+--------+---------+
     0         1         2         3        index
Only the values are stored; the indices are implicit from the order of
the values. 8 is the value at index 0, because 8 appears in the
position with 0 elements before it.
Arrays in awk are different: they are associative. This means
that each array is a collection of pairs: an index, and its corresponding
array element value:
Element 4     Value 30
Element 2     Value "foo"
Element 1     Value 8
Element 3     Value ""
We have shown the pairs in jumbled order because their order is irrelevant.
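The four pairs above can be built in awk in the same jumbled order and then read back by index. This is a minimal sketch, assuming a POSIX shell and any standard awk; the wrapper name `pairs_demo` and the array name `arr` are hypothetical and chosen only for illustration.

```shell
# Hypothetical wrapper; the awk program inside is the actual example.
pairs_demo() {
    awk 'BEGIN {
        # No declaration or size is needed; each assignment creates a pair.
        arr[4] = 30
        arr[2] = "foo"
        arr[1] = 8
        arr[3] = ""
        # Look the elements up by index, in an order we choose ourselves.
        for (i = 1; i <= 4; i++)
            printf "%d -> [%s]\n", i, arr[i]
    }'
}
pairs_demo
```

The order of the assignments has no effect on the lookups, which is the sense in which the order of the pairs is irrelevant.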
One advantage of an associative array is that new pairs can be added
at any time. For example, suppose we add to the above array a tenth element
whose value is "number ten". The result is this:
Element 10    Value "number ten"
Element 4     Value 30
Element 2     Value "foo"
Element 1     Value 8
Element 3     Value ""
Now the array is sparse (i.e., some indices are missing): it has elements 1 through 4 and 10, but doesn't have elements 5, 6, 7, 8, or 9.
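The sparseness can be checked directly. The sketch below assumes a POSIX shell and a standard awk, and uses awk's `in` operator, which this section does not otherwise introduce, to test whether an index exists without creating it. The names `sparse_demo`, `arr`, and `missing` are hypothetical.

```shell
# Hypothetical wrapper; the awk program inside is the actual example.
sparse_demo() {
    awk 'BEGIN {
        arr[1] = 8; arr[2] = "foo"; arr[3] = ""; arr[4] = 30
        arr[10] = "number ten"
        # "i in arr" is true only if index i exists; it does not create it.
        for (i = 1; i <= 10; i++)
            if (!(i in arr))
                missing = missing " " i
        print "missing:" missing
    }'
}
sparse_demo
```

This prints `missing: 5 6 7 8 9`. Element 3 is not reported as missing: it exists, and its value happens to be the empty string.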
Another consequence of associative arrays is that the indices don't have to be positive integers. Any number, or even a string, can be an index. For example, here is an array which translates words from English into French:
Element "dog"    Value "chien"
Element "cat"    Value "chat"
Element "one"    Value "un"
Element 1        Value "un"
Here we decided to translate the number 1 in both spelled-out and numeric form, which illustrates that a single array can have both numbers and strings as indices.
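The translation table can be written out as follows. This is a minimal sketch, assuming a POSIX shell and a standard awk; the wrapper name `translate_demo` and the array name `french` are hypothetical.

```shell
# Hypothetical wrapper; the awk program inside is the actual example.
translate_demo() {
    awk 'BEGIN {
        # String indices and a numeric index live in the same array.
        french["dog"] = "chien"
        french["cat"] = "chat"
        french["one"] = "un"
        french[1]     = "un"
        print french["dog"], french["cat"], french["one"], french[1]
    }'
}
translate_demo
```

This prints `chien chat un un`.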
When awk creates an array for you, e.g., with the split
built-in function,
that array's indices are consecutive integers starting at 1.
(See section Built-in Functions for String Manipulation.)
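As a small illustration of the indices that split produces, the sketch below splits a three-word string and walks the result from index 1. It assumes a POSIX shell and a standard awk; the wrapper name `split_demo`, the array name `parts`, and the sample string are hypothetical.

```shell
# Hypothetical wrapper; the awk program inside is the actual example.
split_demo() {
    awk 'BEGIN {
        # split returns the number of pieces and fills parts[1] .. parts[n].
        n = split("alpha beta gamma", parts, " ")
        for (i = 1; i <= n; i++)
            print i, parts[i]
    }'
}
split_demo
```

The first piece lands at index 1, not 0, so the loop runs from 1 through the count that split returns.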