Variables and Arrays
A variable name or array name in awk is a string of letters, digits, and
underscores that does not begin with a digit. An array entry has
the format arrayName[index]. The index in arrayName is simply a string, which is
a very flexible format, i.e., any array is by default an associative array and not
necessarily a linear array indexed by integers. A typical example for use of an
associative array showing the power of this concept can be found in Section 12.2.2
of this chapter (countFrequencies). Numbers are simultaneously understood as
strings in awk. All variables or entries of an array that are used are automatically
initialized to the empty string. The empty string, like any string that does not
begin with a number, has numerical value zero (0).
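The behaviour described above can be tried out with a minimal sketch such as the following. It is not the countFrequencies program of Section 12.2.2; the input words and the names count, word, freq and unset_demo are made up for illustration.

```shell
# Count how often each word occurs; the array "count" is indexed by strings.
# The input words are made up for illustration.
freq=$(printf 'the cat\nthe dog\n' | awk '
  { for (i = 1; i <= NF; i++) count[$i]++ }   # an unset entry starts out empty, i.e. 0
  END { for (word in count) print word, count[word] }
' | sort)                                     # sort, since the order of "for (word in count)" is unspecified
printf '%s\n' "$freq"

# An unset variable x is "" when used as a string and 0 when used as a number.
unset_demo=$(awk 'BEGIN { print "[" x "]", x + 0, "apples" + 0 }')
printf '%s\n' "$unset_demo"
```

Because no entry of count has to be declared or initialized, the counting itself takes a single statement.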
Built-In Variables
awk has a number of built-in variables, some of which have already been
introduced. In the next few paragraphs, we shall list the most useful ones. The reader is
referred to [4, 5, 40] or the UNIX manual pages for awk for a complete listing.
FILENAME: The built-in variable FILENAME contains the name of the current input
file. awk recognizes - as the name of the standard input when it is listed among
the input files. Using a pattern such as FILENAME==fName (cf. appendix A.2),
processing by awk can depend upon which of several input files is currently being
read. The input files follow the awk program as arguments and are processed in the
order listed from left to right (e.g., awk 'awkProgram'
fileOne fileTwo fileLast). See the listing of the program setIntersection below
in Section 12.4.2.3 for a typical use of FILENAME.
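As an illustration of a FILENAME pattern, the following sketch prints the lines of a second file that also occur in a first file. It is only a sketch under stated assumptions and not the setIntersection listing: the file contents, the temporary directory, and the names seen and common are hypothetical.

```shell
# Two small input files with made-up contents, created in a temporary directory.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/fileOne"
printf 'b\nc\n' > "$tmpdir/fileTwo"

# Lines of fileOne are remembered; lines of fileTwo are printed if already seen.
common=$(cd "$tmpdir" && awk '
  FILENAME == "fileOne" { seen[$0] = 1 }
  FILENAME == "fileTwo" && ($0 in seen) { print FILENAME ": " $0 }
' fileOne fileTwo)
printf '%s\n' "$common"
rm -r "$tmpdir"
```

The order of the file arguments matters here, since fileOne must be read completely before fileTwo is processed.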
FS: The built-in variable FS contains the field separator character. Default:
sequences of blanks and tabs. For example, the variable FS should be reset to &
(separator for tables in TeX) if one wants to partition the input line into fields
separated by &. Such a reset is often performed in the action B matched by the
BEGIN pattern at the start of processing in an awk program.
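A minimal sketch of resetting FS in the BEGIN action follows. The table row and the shell variable fields are made up for illustration.

```shell
# A made-up TeX table row; FS is reset in the BEGIN action before any input is read.
fields=$(printf 'alpha&beta&gamma\n' | awk '
  BEGIN { FS = "&" }
  { print NF, $2 }      # number of fields, then the second field
')
printf '%s\n' "$fields"
```

Setting FS in the BEGIN action ensures that the new separator already applies to the first input line.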
NF: The built-in variable NF contains the number of fields in the current pattern
space (input record). This is very important in order to loop over all fields in the
pattern space using the for-loop construct of awk. A typical loop is given by:
for (counter = 1; counter <= NF; counter++) { actionWith(counter) }
See the listing of the program findFourLetterWords below for a typical use of NF.
Note that NF can be increased to “make room” for more fields which can be filled
with results of the current computation in the cycle.
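The loop over all fields and the growth of NF can be sketched as follows. This is an illustrative example, not the findFourLetterWords program; the input numbers and the names sum and summed are assumptions. Here NF grows implicitly because a field beyond the current last one is assigned.

```shell
# Loop over all fields, sum them, and store the sum in a new last field.
# The input numbers are made up for illustration.
summed=$(printf '1 2 3\n10 20\n' | awk '
  {
    sum = 0
    for (counter = 1; counter <= NF; counter++) sum += $counter
    $(NF + 1) = sum      # assigning beyond NF creates the field and increases NF
    print NF ": " $0
  }
')
printf '%s\n' "$summed"
```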
NR: The built-in variable NR contains the number of the most recent input record.
Usually, this is the line number, provided that the record separator character RS
has not been reset and NR itself has not been assigned another value. See the
listing of the program context below for a typical use of NR.
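The following sketch shows NR used as a line number and as a record count. It is not the context program; the input lines and the names numbered and total are made up.

```shell
# Prefix each line with its record number; the input lines are made up.
numbered=$(printf 'first\nsecond\nthird\n' | awk '{ print NR ": " $0 }')
printf '%s\n' "$numbered"

# In the END action, NR holds the total number of records read.
total=$(printf 'first\nsecond\nthird\n' | awk 'END { print NR }')
printf '%s\n' "$total"
```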
OFS: The built-in variable OFS contains the output field separator used in print.
Default: blank. OFS is printed between two items that are separated by a comma “,”
in a print statement. See the listing of the program firstFiveFieldsPerLine below
for an application.
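The effect of the comma in a print statement can be seen in a short sketch such as this one. It is not the firstFiveFieldsPerLine program; the input line, the separator "-" and the name joined are chosen for illustration only.

```shell
# With a comma, print puts OFS between the items; without one, they are concatenated.
joined=$(printf 'a b c\n' | awk '
  BEGIN { OFS = "-" }
  { print $1, $2; print $1 $2 }
')
printf '%s\n' "$joined"
```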
ORS: The built-in variable ORS contains the output record separator string. It
is appended to the output after each print statement. Default: newline character.
ORS can be set to the empty string through ORS="". In that case, output lines are
concatenated. If one sets ORS="\n\n", i.e., two newline characters (see next section),
then the output is double-spaced. See the listing of the awk program in section 12.5.2
for an application.
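Both settings of ORS mentioned above can be tried with the following sketch. It is not the program of section 12.5.2; the input lines and the names concatenated and doubled are made up.

```shell
# ORS = "" concatenates the output lines.
concatenated=$(printf 'a\nb\nc\n' | awk 'BEGIN { ORS = "" } { print }')
printf '%s\n' "$concatenated"

# ORS = "\n\n" double-spaces the output. Note that the shell command
# substitution strips the trailing newlines from the captured text.
doubled=$(printf 'a\nb\n' | awk 'BEGIN { ORS = "\n\n" } { print }')
printf '%s\n' "$doubled"
```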