10.2 Rule Set 3

In its initial form, rule set 3 looks like this:

S3 # preprocessing for all rule sets
R$* < $* > $*   $2             basic RFC822 parsing

As with rule set 0, the definition of rule set 3 begins with the S configuration command. The S character must begin a line, and the 3 must follow with no intervening nonspace characters.

The only rule in rule set 3 is composed of three parts, each separated from the others by one or more tab characters:

R$* < $* > $*   $2             basic RFC822 parsing
 ^            ^ ^       ^      ^
 LHS       tabs RHS   tabs     comment

Note that we will now separate the tokens in the LHS with spaces (not tabs) to make the LHS easier to understand. Spaces always separate tokens, yet are never themselves tokens.
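
The three-part layout of a rule can be sketched in Python. This is only an illustration of the tab-separated structure described above, not how sendmail parses its configuration file; the function name split_rule is a hypothetical helper introduced here.

```python
import re

def split_rule(line):
    """Split an R line into (lhs, rhs, comment) on runs of tab characters."""
    if not line.startswith("R"):
        raise ValueError("a rule must begin with R")
    # Everything after the R is the LHS, then the RHS, then the comment.
    parts = re.split(r"\t+", line[1:], maxsplit=2)
    parts += [""] * (3 - len(parts))   # the RHS or comment may be absent
    return tuple(p.strip() for p in parts)

rule = "R$* < $* > $*\t\t$2\t\tbasic RFC822 parsing"
print(split_rule(rule))
# ('$* < $* > $*', '$2', 'basic RFC822 parsing')
```

Note that the spaces inside the LHS survive the split, because only tabs separate the three parts.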

10.2.1 The LHS

The wildcard operator in this LHS, the $*, is different from the $+ wildcard operator that you saw in rule set 0. Recall that the $+ wildcard operator matches one or more tokens in the workspace. To review, consider the LHS rule:

$+ @ $+

This LHS easily matches an address like you@here.us in the workspace:

workspace    LHS
you          $+    <- match one or more
@            @     <- match exactly
here         $+    <- match one
.                     or more
us

This same LHS, however, does not match an address like @here.us:

workspace    LHS
@            $+    <- match one
here                  or more
.
us
             @     <- match exactly, fails!

Because the $+ wildcard operator needs to match one or more tokens, it fails when there is nothing in front of the @.

The $* wildcard operator is just like the $+ wildcard operator, except that it can also match nothing (zero tokens). If the LHS had used $* instead of $+, an address like @here.us would be matched:

workspace    LHS
             $*    <- match zero or more (matches zero)
@            @     <- match exactly
here         $*    <- match zero
.                     or more
us
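
The difference between $+ and $* can be sketched in a few lines of Python. This is a minimal illustration, not sendmail's implementation: the function name match and the list-of-tokens representation are assumptions made here, and only these two wildcard operators are modeled. The shorter address forms you@here and @here are used to keep the token lists brief.

```python
def match(lhs, workspace):
    """Return the tokens bound to each wildcard, or None if the LHS fails.

    lhs and workspace are lists of tokens. Only $* (zero or more) and
    $+ (one or more) are modeled; any other token must match exactly.
    Shorter matches are tried first.
    """
    if not lhs:
        return [] if not workspace else None
    head, rest = lhs[0], lhs[1:]
    if head in ("$*", "$+"):
        start = 0 if head == "$*" else 1     # $+ needs at least one token
        for n in range(start, len(workspace) + 1):
            tail = match(rest, workspace[n:])
            if tail is not None:
                return [workspace[:n]] + tail
        return None
    if workspace and workspace[0] == head:   # literal token
        return match(rest, workspace[1:])
    return None

print(match(["$+", "@", "$+"], ["you", "@", "here"]))  # [['you'], ['here']]
print(match(["$+", "@", "$+"], ["@", "here"]))         # None
print(match(["$*", "@", "$*"], ["@", "here"]))         # [[], ['here']]
```

The second call fails because $+ cannot match zero tokens; the third succeeds because the first $* is allowed to match nothing.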

The LHS in rule set 3 matches anything or nothing, provided that there is a pair of angle brackets in the workspace somewhere:

R$* < $* > $*   $2             basic RFC822 parsing

For example, consider an address that might be given to sendmail by your MUA:

Your Fullname <you@here.us>

This address is tokenized and placed into the workspace. That workspace is then compared to the LHS:

workspace    LHS
Your         $*    <- match zero
Fullname              or more
<            <     <- match exactly
you          $*    <- match zero
@                     or more
here
.
us
>            >     <- match exactly
             $*    <- match zero or more

10.2.2 The RHS

Recall that the objective of rule set 3 is to strip everything but the address part (the text between the angle brackets). That stripping is accomplished by rewriting the workspace using the $2 positional operator in the RHS:

R$* < $* > $*   $2             basic RFC822 parsing
                ^
                strip all but the address

Remember, when a $digit appears in the RHS, that digit is used as a count into the wildcard operators of the LHS.

$* < $* > $*
^    ^    ^
$1   $2   $3

$1 refers to the first $*, $2 refers to the second, and $3 to the third. Comparing this ordering of operators to the test address, you see:

workspace    LHS                           RHS
Your         $*    <- match zero           $1
Fullname              or more
<            <     <- match exactly
you          $*    <- match zero           $2
@                     or more
here
.
us
>            >     <- match exactly
             $*    <- match zero or more   $3

This illustrates that the middle (second) $* matches the you@here.us part of the workspace. When the RHS rewrites the workspace, it does so by copying the tokens matched by the second wildcard operator (specified in the RHS with the $2 positional operator).
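
The copying done by $2 can be sketched as follows. This is a simplified model under stated assumptions, not sendmail's code: match and rewrite are hypothetical helper names, the workspace is a plain list of tokens, only $* and $+ are modeled, and the rule is applied a single time.

```python
def match(lhs, workspace):
    """Bind each wildcard in lhs to a run of tokens, shortest first."""
    if not lhs:
        return [] if not workspace else None
    head, rest = lhs[0], lhs[1:]
    if head in ("$*", "$+"):
        start = 0 if head == "$*" else 1
        for n in range(start, len(workspace) + 1):
            tail = match(rest, workspace[n:])
            if tail is not None:
                return [workspace[:n]] + tail
        return None
    if workspace and workspace[0] == head:
        return match(rest, workspace[1:])
    return None

def rewrite(lhs, rhs, workspace):
    """Apply one rule once; return the workspace unchanged if the LHS fails."""
    bound = match(lhs, workspace)
    if bound is None:
        return workspace
    out = []
    for tok in rhs:
        if len(tok) == 2 and tok[0] == "$" and tok[1].isdigit():
            out.extend(bound[int(tok[1]) - 1])   # $1 is the first wildcard
        else:
            out.append(tok)                      # literal RHS token
    return out

workspace = "Your Fullname < you @ here >".split()
print(rewrite("$* < $* > $*".split(), ["$2"], workspace))
# ['you', '@', 'here']
```

Here $1 is bound to the tokens Your and Fullname and $3 is bound to nothing; because neither appears in the RHS, both are discarded.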

10.2.3 Test Rule Set 3

Take a few moments to experiment. Observe the transformation of a user-specified address into one that sendmail can use. Add the following new rule set 3 to the rule sets in the file:

S3 # preprocessing for all rule sets                                  <- new
R$* < $* > $*       $2                      basic RFC822 parsing      <- new

Now run sendmail again. Up to now, you have been testing rule set 0, so you have specified a 0 following the > prompt. Instead, you will now specify a 3 because you are testing rule set 3:

% ./sendmail -bt
ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
Enter <ruleset> <address>
> 3 Your Fullname <you@here>
rewrite: ruleset  3   input: Your Fullname < you @ here >
rewrite: ruleset  3 returns: you @ here

As expected, the new rule causes everything except the "good" email address, the address between the angle brackets, to be thrown away.
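
The input line in the test output shows the address after tokenizing. A rough Python sketch of that step follows. The separator set in OPERATORS is an assumption chosen for this illustration only (a real configuration defines its own), the name tokenize is hypothetical, and quoting and parenthesized comments are ignored.

```python
import re

# Assumed, simplified set of separator characters for illustration only.
OPERATORS = "@.<>:%!"

def tokenize(address):
    """Split an address into tokens.

    Each separator character becomes a token of its own; runs of other
    non-space characters become tokens; spaces separate tokens but are
    never tokens themselves.
    """
    ops = re.escape(OPERATORS)
    pattern = "[" + ops + "]|[^\\s" + ops + "]+"
    return re.findall(pattern, address)

tokens = tokenize("Your Fullname <you@here>")
print(" ".join(tokens))
# Your Fullname < you @ here >
```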

Before we improve rule set 3, try a few variations. Put the full name last. Omit the email address between the angle brackets. Nest angle brackets in an address, like <a<b>c>.

As a closing note, recall that sendmail does the minimum matching possible when comparing operators to the workspace. Although $*, for example, can match zero or more tokens, it prefers to match zero if possible and, if not, to match the fewest tokens possible. An LHS of $*@$+, for example, will match as shown in Table 10.1.

Table 10.1: What $* in $*@$+ Matches for Different Addresses

Address    $* matches    @$+

Expecting operators to match more than they do can cause you to misunderstand the effect of rules.
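
Minimum matching can be explored with a small Python model. This is a sketch under stated assumptions rather than sendmail's algorithm: match is a hypothetical helper that tries the shortest binding for each wildcard first, and the sample addresses are made up for illustration and are not taken from Table 10.1.

```python
def match(lhs, workspace):
    """Bind each wildcard in lhs to the fewest tokens that let the LHS match."""
    if not lhs:
        return [] if not workspace else None
    head, rest = lhs[0], lhs[1:]
    if head in ("$*", "$+"):
        start = 0 if head == "$*" else 1
        for n in range(start, len(workspace) + 1):   # shortest first
            tail = match(rest, workspace[n:])
            if tail is not None:
                return [workspace[:n]] + tail
        return None
    if workspace and workspace[0] == head:
        return match(rest, workspace[1:])
    return None

lhs = "$* @ $+".split()
for address in ("a @ b", "a @ b @ c", "@ b"):
    star, plus = match(lhs, address.split())
    print(address, "->", star, plus)
# a @ b -> ['a'] ['b']
# a @ b @ c -> ['a'] ['b', '@', 'c']
# @ b -> [] ['b']
```

In the second address the $* stops at the first @, leaving the rest, including the second @, to the $+.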
