
11.3 Patterns and Procedures

awk scripts consist of patterns and procedures:

pattern  { procedure }

Both are optional. If pattern is missing, { procedure } is applied to all lines; if { procedure } is missing, the matched line is printed.
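As a minimal sketch of the three forms, the following shell snippet runs a pattern alone, a procedure alone, and both together. The sample data and the shell variable names are hypothetical, chosen only for illustration.

```sh
# Hypothetical sample input: three lines of "name count".
data='apple 3
banana 12
cherry 7'

# Pattern only: each matching line is printed unchanged.
pattern_only=$(printf '%s\n' "$data" | awk '/an/')

# Procedure only: the procedure runs for every line.
procedure_only=$(printf '%s\n' "$data" | awk '{ print $1 }')

# Both: the procedure runs only for lines selected by the pattern.
both=$(printf '%s\n' "$data" | awk '$2 > 5 { print $1 }')

printf '%s\n' "$pattern_only" "$procedure_only" "$both"
```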

11.3.1 Patterns

A pattern can be any of the following:

/regular expression/
relational expression
pattern-matching expression
BEGIN
END

Expressions can be composed of quoted strings, numbers, operators, functions, defined variables, or any of the predefined variables described later in the section "Built-in Variables."
Regular expressions use the extended set of metacharacters and are described in Chapter 6, Pattern Matching.
^ and $ refer to the beginning and end of a string (such as a field or the whole record), respectively, rather than the beginning and end of a line. In particular, these metacharacters will not match at a newline embedded in the middle of a string.
Relational expressions use the relational operators listed in the section "Operators" later in this chapter. For example, $2 > $1 selects lines for which the second field is greater than the first. Comparisons can be either string or numeric. Thus, depending on the types of data in $1 and $2, awk does either a numeric or a string comparison. This can change from one record to the next.
Pattern-matching expressions use the operators ~ (match) and !~ (don't match). See the section "Operators" later in this chapter.
The BEGIN pattern lets you specify procedures that take place before the first input line is processed. (Generally, you set global variables here.)
The END pattern lets you specify procedures that take place after the last input record is read.
In nawk, BEGIN and END patterns may appear multiple times. The procedures are merged as if there had been one large procedure.
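Two of the points above are easy to misread, so here is a small sketch of each. It assumes a POSIX-style awk; the input lines and shell variable names are made up for illustration. The first command shows the comparison type changing from record to record, and the second shows that ^ and $ anchor to the whole string even when it contains a newline.

```sh
# Comparison type depends on the data in each record:
#   "10 9"    both fields look numeric, so 9 > 10 is tested numerically (false)
#   "10 9x"   "9x" is not numeric, so the fields are compared as strings (true)
#   "abc abd" plain strings, compared as strings (true)
cmp=$(printf '10 9\n10 9x\nabc abd\n' | awk '$2 > $1')

# With RS = "", the two input lines form a single record, "foo\nbar".
# ^ and $ anchor to the ends of that string, not to the embedded newline.
anchors=$(printf 'foo\nbar\n' | awk 'BEGIN { RS = "" }
{
    print "starts with foo:", ($0 ~ /^foo/)
    print "starts with bar:", ($0 ~ /^bar/)
    print "ends with bar:", ($0 ~ /bar$/)
}')

printf '%s\n' "$cmp" "$anchors"
```

A match expression evaluates to 1 when it succeeds and 0 when it fails, which is why the second command prints numbers.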

Except for BEGIN and END, patterns can be combined with the Boolean operators || (or), && (and), and ! (not). A range of lines can also be specified using comma-separated patterns:

pattern,pattern
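The following sketch exercises a range pattern and the Boolean operators on a small, hypothetical input. The START and END markers are ordinary text in the sample data, unrelated to the BEGIN and END patterns.

```sh
# Hypothetical input: a short file with a marked section.
log='header
START
alpha 1
beta 22
END
footer'

# Range: from the first line matching /START/ through the next line
# matching /END/, inclusive.
range=$(printf '%s\n' "$log" | awk '/START/,/END/')

# && and !: two-field lines whose second field is not greater than 10.
and_not=$(printf '%s\n' "$log" | awk 'NF == 2 && !($2 > 10)')

# ||: lines that begin with either word.
either=$(printf '%s\n' "$log" | awk '/^header/ || /^footer/')

printf '%s\n' "$range" "$and_not" "$either"
```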

11.3.2 Procedures

Procedures consist of one or more commands, functions, or variable assignments, separated by newlines or semicolons, and contained within curly braces. Commands fall into five groups:

Variable or array assignments
Printing commands
Built-in functions
Control-flow commands
User-defined functions (nawk only)
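A short sketch that touches each of the five groups in one script. The function name abs, the variable names, and the input are hypothetical, and the user-defined function assumes nawk or a later awk.

```sh
# Hypothetical input: one number per line.
out=$(printf '3\n-4\n5\n' | awk '
# User-defined function: absolute value.
function abs(n) { return n < 0 ? -n : n }
{
    v = abs($1)             # variable assignment
    total += v
    chars += length($0)     # built-in function
    if (v > 3)              # control-flow command
        big++
}
END {
    # printing command
    printf "%d values, total %d, %d above 3, %d characters\n", NR, total, big, chars
}')

printf '%s\n' "$out"
```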

11.3.3 Simple Pattern-Procedure Examples

Print first field of each line:
```
{ print $1 }
```
Print all lines that contain pattern:
```
/pattern/
```
Print first field of lines that contain pattern:
```
/pattern/ { print $1 }
```
Select records containing more than two fields:
```
NF > 2
```
Treat each group of lines up to a blank line as one input record, with each line as a single field:
```
BEGIN { FS = "\n"; RS = "" }
```
Print fields 2 and 3 in switched order, but only on lines whose first field contains the string "URGENT":
```
$1 ~ /URGENT/ { print $3, $2 }
```
Count and print the number of lines that contain pattern:
```
/pattern/ { ++x }
END { print x }
```

Add numbers in second column and print total:
```
{ total += $2 }
END { print "column total is", total }
```

Print lines that contain fewer than 20 characters:
```
length($0) < 20
```
Print each line that begins with Name: and that contains exactly seven fields:
```
NF == 7 && /^Name:/
```
Print the fields of each input record in reverse order, one per line:
```
{
	for (i = NF; i >= 1; i--)
		print $i
}
```
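The multi-line record example above only sets FS and RS, so on its own it prints nothing. As a sketch of how it might be used, the snippet below attaches a procedure that prints the first line of each record and its field count. The address data and variable names are hypothetical.

```sh
# Hypothetical address list: records are separated by a blank line.
addresses='Alice Smith
12 Elm St
Springfield

Bob Jones
9 Oak Ave
Shelbyville'

# Each blank-line-delimited group is one record; each line is one field.
names=$(printf '%s\n' "$addresses" | awk 'BEGIN { FS = "\n"; RS = "" }
{ print NR ": " $1 " (" NF " lines)" }')

printf '%s\n' "$names"
```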

