AWK(1) | General Commands Manual | AWK(1) |
awk
—
awk [-F fs] [-v var=value] [-safe] [-d [N]] [prog | -f progfile] file ...
awk -version
awk is the Bell Labs implementation of the AWK programming language as
described in The AWK Programming Language by A.V. Aho, B.W. Kernighan, and
P.J. Weinberger.
awk scans each input file for lines that match any of a set of patterns
specified literally in prog or in one or more files specified as -f progfile.
With each pattern there can be an associated action that will be performed
when a line of a file matches the pattern. Each line is matched against the
pattern portion of every pattern-action statement; the associated action is
performed for each matched pattern. The file name - means the standard input.
Any file of the form var=value is treated as an assignment, not a filename,
and is executed at the time it would have been opened if it were a filename.
The option -v followed by var=value is an assignment to be done before prog
is executed; any number of -v options may be present. The -F fs option
defines the input field separator to be the regular expression fs.
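The difference in timing between -v and a var=value operand can be seen in a small sketch. The function name assign_demo and the variable names a and b are illustrative assumptions, not part of awk itself.

```shell
# Hypothetical helper: one input line arrives on stdin (the "-" operand).
assign_demo() {
    printf 'x\n' | awk -v a=1 '
        BEGIN { print "BEGIN: a=" a, "b=" b }   # b has not been assigned yet
              { print "main: a=" a, "b=" b }    # b=2 was executed before "-" was read
    ' b=2 -
}
assign_demo
```

The BEGIN action should see only a, because the operand b=2 is executed at the point where it would otherwise have been opened as a file.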
The options are as follows:

-d [N]
-f filename
        Read the program from filename. Multiple -f options may be
        specified.
-F fs
        Define the input field separator to be the regular expression fs.
-mr NNN, -mf NNN
-safe
        Potentially unsafe functions such as system() make the program abort
        (with a warning message).
-v var=value
        Assign value to var before prog is executed. Any number of -v
        options may be present.
-version
        Print the awk version on standard output and exit.

An input line is normally made up of fields separated by white space, or by
the regular expression the built-in variable FS is set to. If FS is null, the
input line is split into one field per character. The fields are denoted $1,
$2, ..., while $0 refers to the entire line. Setting any other field causes
the re-evaluation of $0. Assigning to $0 resets the values of all other
fields and the NF built-in variable.
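As a minimal sketch of the two assignment rules above (the function name field_demo and the sample input are assumptions chosen for illustration):

```shell
field_demo() {
    echo 'a b c' | awk '{
        $2 = "X"        # assigning a field causes $0 to be re-evaluated
        print $0
        $0 = "p q"      # assigning $0 resets the other fields and NF
        print NF, $2
    }'
}
field_demo
```

With the default separators this should print the rebuilt line first and then the new field count followed by the new second field.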
A pattern-action statement has the form

      pattern { action }

A missing { action } means print the line; a missing pattern always matches.
Pattern-action statements are separated by newlines or semicolons.
An action is a sequence of statements. Statements are terminated by
semicolons, newlines or right braces. An empty expression-list stands for $0.
String constants are quoted "", with the usual C escapes recognized within.
Expressions take on string or numeric values as appropriate, and are built
using the operators (see next subsection).
Variables may be scalars, array elements (denoted x[i]) or fields. Variables
are initialized to the null string. Array subscripts may be any string, not
necessarily numeric; this allows for a form of associative memory. Multiple
subscripts such as [i, j, k] are permitted; the constituents are
concatenated, separated by the value of SUBSEP.
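A short sketch of how multiple subscripts behave in practice. The array name count, the keys, and the helper name subsep_demo are hypothetical.

```shell
subsep_demo() {
    awk 'BEGIN {
        count["web", "get"] = 3             # stored under "web" SUBSEP "get"
        for (key in count) {
            split(key, part, SUBSEP)        # recover the two constituents
            print part[1], part[2], count[key]
        }
        if (("web", "get") in count)        # parenthesized list tests membership
            print "present"
    }'
}
subsep_demo
```

Splitting the key on SUBSEP is one common way to get the original constituents back when iterating with for (var in array).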
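awk operators, in order of decreasing precedence, are:

( ... )               Grouping.
$                     Field reference.
++ --                 Increment and decrement, as either prefix or postfix.
^                     Exponentiation (the ** form is also supported, and **=
                      for the assignment operator).
+ - !                 Unary plus, unary minus and logical negation.
* / %                 Multiplication, division and modulus.
+ -                   Addition and subtraction.
space                 String concatenation.
< > <= >= != ==       Relational operators.
~ !~                  Regular expression match and non-match.
in                    Array membership.
&&                    Logical AND.
||                    Logical OR.
?:                    Conditional expression, used as expr1 ? expr2 : expr3.
                      If expr1 is true, the result value is expr2, otherwise
                      it is expr3. Only one of expr2 and expr3 is evaluated.
= += -= *= /= %= ^=   Assignment and operator-assignment.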
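A few expressions where operator precedence changes the result, as a rough illustration. The helper name precedence_demo and the sample input are assumptions.

```shell
precedence_demo() {
    echo 'a b c' | awk '{
        print 2 ^ 3 ^ 2       # ^ groups right to left: 2 ^ (3 ^ 2)
        print -2 ^ 2          # ^ binds tighter than unary minus: -(2 ^ 2)
        print 1 " " 2 + 3     # + binds tighter than concatenation
        print $NF - 1         # $ binds tighter than -: ($NF) - 1
        print $(NF - 1)       # parentheses select the next-to-last field
    }'
}
precedence_demo
```

The last two lines show a frequent slip: $NF - 1 subtracts one from the value of the last field, while $(NF - 1) selects a different field.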
if (expression) statement [else statement]
while (expression) statement
for (expression; expression; expression) statement
for (var in array) statement
do statement while (expression)
break
continue
{ [statement ...] }
var = expression
return [expression]
next
nextfile
delete array[expression]
delete array
exit [expression]

close(expr)
fflush(expr)
getline [var]
        Set var (or $0 if var is not specified) to the next input record from
        the current input file. getline returns 1 for a successful input, 0
        for end of file, and -1 for an error.
getline [var] < file
        Set var (or $0 if var is not specified) to the next input record from
        the specified file file.
expr | getline
        Each call of getline returns the next line of output from expr.
print [expr-list] [redirection]
printf format [, expr-list] [redirection]

Both print and printf statements write to standard output by default. The
output is written to the file or pipe specified by redirection if one is
supplied, as follows: > file, >> file, or | expr. Both file and expr may be
literal names or parenthesized expressions; identical string values in
different statements denote the same open file. For that purpose the file
names /dev/stdin, /dev/stdout, and /dev/stderr refer to the program's stdin,
stdout, and stderr respectively (and are unrelated to the fd(4) devices of
the same names).
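The pipe form of getline and an output redirection can be sketched together as follows. The command string held in cmd and the helper name io_demo are illustrative assumptions.

```shell
io_demo() {
    awk 'BEGIN {
        cmd = "echo one; echo two"          # illustrative command
        while ((cmd | getline line) > 0)    # 1 per line read, 0 at end of output
            print "got " line
        close(cmd)                          # same string value names the same pipe
        print "warning" > "/dev/stderr"     # written to stderr, not stdout
    }'
}
io_demo
```

Comparing the getline result with > 0 is a deliberate choice: a bare while (cmd | getline line) would loop forever if getline returned -1.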
atan2(x, y)
        Returns the arctangent of x/y in radians. See also atan2(3).
cos(expr)
exp(expr)
int(expr)
log(expr)
rand()
sin(expr)
sqrt(expr)
srand([expr])
        Sets the seed for the random number generator (rand()) and returns
        the previous seed.

gensub(r, s, h [, t])
        If h is a string beginning with ‘g’ or ‘G’, then replace all matches
        of r with s. Otherwise, h is a number indicating which match of r to
        replace. If no t is supplied, $0 is used instead. Unlike sub() and
        gsub(), the modified string is returned as the result of the
        function, and the original target is not changed. Note that the
        ‘\n’ sequences within replacement string s supported by GNU awk are
        not supported at this moment.
gsub(r, s [, t])
        Same as sub() except that all occurrences of the regular expression
        are replaced; sub() and gsub() return the number of replacements.
index(s, t)
length [([string])]
        The length of string, or of $0 if no argument.
match(s, r)
split(s, a [, fs])
        Splits the string s into array elements a[1], a[2], ..., a[n], and
        returns n. The separation is done with the regular expression fs or
        with the field separator FS if fs is not given. An empty string as
        field separator splits the string into one array element per
        character.
sprintf(fmt, expr, ...)
sub(r, s [, t])
        Substitutes s for the first occurrence of the regular expression r
        in the string t. If t is not given, $0 is used.
substr(s, m [, n])
tolower(str)
toupper(str)

awk provides the following two functions for obtaining time stamps and
formatting them:

systime()
strftime([format [, timestamp]])
        Formats timestamp according to format. timestamp should be in the
        same form as the value returned by systime(). If timestamp is
        missing, current time is used. If format is missing, a default
        format equivalent to the output of date(1) would be used. See the
        specification of ANSI C strftime(3) for the format conversions which
        are supported.

system(cmd)

Patterns are arbitrary Boolean combinations (with ! || &&) of regular
expressions and relational expressions. Regular expressions are as in
egrep(1). Isolated regular expressions in a pattern apply to the entire line.
Regular expressions may also occur in relational expressions, using the
operators ~ and !~. /re/ is a constant regular expression; any string
(constant or variable) may be used as a regular expression, except in the
position of an isolated regular expression in a pattern.

A pattern may consist of two patterns separated by a comma; in this case, the
action is performed for all lines from an occurrence of the first pattern
through an occurrence of the second.
A relational expression is one of the following:

      expression matchop regular-expression
      expression relop expression
      expression in array-name
      (expr, expr, ...) in array-name

where a relop is any of the six relational operators in C, and a matchop is
either ~ (matches) or !~ (does not match). A conditional is an arithmetic
expression, a relational expression, or a Boolean combination of these.
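The following sketch combines a relop, a matchop and the in operator as patterns. The input data, the array name vip, and the helper name pattern_demo are made up for illustration.

```shell
pattern_demo() {
    printf 'alice 30\nbob 17\ncarol 45\n' | awk '
        BEGIN { vip["carol"] }          # referencing an element creates it
        $2 >= 18 && $1 !~ /^a/ { print $1, "adult, name not starting with a" }
        $1 in vip               { print $1, "vip" }
    '
}
pattern_demo
```

Each input line is tested against both patterns, so a single line can trigger more than one action.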
The special patterns BEGIN and END may be used to capture control before the
first input line is read and after the last. BEGIN and END do not combine
with other patterns.
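A small sketch of when BEGIN and END run relative to input. It relies on the behaviour that a program containing only BEGIN actions (and no getline) exits without reading input, whereas a program with an END action reads all input first. The helper names are hypothetical.

```shell
# Only a BEGIN action: awk should exit without consuming the two input lines.
begin_only() { printf 'a\nb\n' | awk 'BEGIN { print "no input read" }'; }

# Only an END action: the input is read before the action runs, so NR is 2.
end_only() { printf 'a\nb\n' | awk 'END { print NR, "records read" }'; }

begin_only
end_only
```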
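If an awk program consists of only actions with the pattern BEGIN, and the
BEGIN action contains no getline statement, awk exits without reading its
input when the last statement in the last BEGIN action is executed. If an awk
program consists of only actions with the pattern END or only actions with
the patterns BEGIN and END, the input is read before the statements in the
END actions are executed.

CONVFMT   conversion format for numbers (default "%.6g")
FS        input field separator; also settable by option -F fs.
OFMT      output format for numbers (default "%.6g")
RSTART    position of the first character matched by match(); 0 if no match.
RLENGTH   length of the string matched by match(); -1 if no match.
SUBSEP    separates multiple subscripts (default 034)

function foo(a, b, c) { ...; return x }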
Parameters are passed by value if scalar and by reference if array name; functions may be called recursively. Parameters are local to the function; all other variables are global. Thus local variables may be created by providing excess parameters in the function definition.
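The parameter-passing rules can be illustrated with a short sketch. The function sum, its extra parameters i and total, and the helper name func_demo are all hypothetical; the extra spaces in the parameter list are only a convention for marking locals.

```shell
func_demo() {
    awk '
        # i and total are excess parameters, used here as local variables
        function sum(arr, n,    i, total) {
            for (i = 1; i <= n; i++) total += arr[i]
            arr[1] = 99         # arrays are passed by reference
            n = 0               # scalars are passed by value
            return total
        }
        BEGIN {
            i = "global"; n = 3
            v[1] = 1; v[2] = 2; v[3] = 3
            s = sum(v, n)
            print s, i, n, v[1]
        }
    '
}
func_demo
```

After the call, the global i and n should be unchanged, while the change to the array element remains visible to the caller.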
length() defaults to $0, and the empty parens can also be omitted in this
case:
length > 72
Print first two fields in opposite order:
{ print $2, $1 }
Same, with input fields separated by comma and/or blanks and tabs:
BEGIN { FS = ",[ \t]*|[ \t]+" } { print $2, $1 }
Add up first column, print sum and average:
{ s += $1 } END { print "sum is", s, "average is", s/NR }
Print all lines between start/stop pairs:
/start/, /stop/
Simulate echo(1):
BEGIN { for (i = 1; i < ARGC; ++i) printf("%s%s", ARGV[i], i==ARGC-1?"\n":" ") }
Another way to do the same that demonstrates field assignment and $0
re-evaluation:
BEGIN { for (i = 1; i < ARGC; ++i)
$i = ARGV[i]; print }
Print an error message to standard error:
{ print "error!" > "/dev/stderr" }
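In the same spirit as the examples above, a sketch exercising several of the string functions (gsub, sub, match, substr, split, toupper, index, length). The sample strings and the helper name string_demo are assumptions.

```shell
string_demo() {
    echo 'foo bar foo' | awk '{
        line = $0
        n = gsub(/foo/, "baz", line)    # replaces every match, returns the count
        print n, line
        sub(/bar/, "[&]", line)         # & stands for the matched text
        print line
        if (match(line, /\[.*\]/))      # sets RSTART and RLENGTH
            print RSTART, RLENGTH, substr(line, RSTART, RLENGTH)
        print split("a:b:c", part, ":"), part[3], toupper(part[1])
        print index("banana", "nan"), length("banana")
    }'
}
string_demo
```

Working on a copy in line rather than on $0 keeps the original record intact, since sub() and gsub() modify their target in place.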
A.V. Aho, B.W. Kernighan, P.J. Weinberger, The AWK Programming Language, Addison-Wesley, 1988. ISBN 0-201-07981-X
AWK Language Programming, Edition 1.0, published by the Free Software Foundation, 1995
nawk has been the default system awk since NetBSD 2.0, replacing the
previously used GNU awk.
The scope rules for variables in functions are a botch; the syntax is worse.
Only eight-bit character sets are handled correctly.
July 5, 2022 | NetBSD 10.1 |