Sendmail Installation and Operation Guide

Eric Allman, Gregory Neil Shapiro, Claus Assmann - Sendmail Inc.

5.1. R and S -- Rewriting Rules

5.1.1. The left hand side
5.1.2. The right hand side
5.1.3. Semantics of rewriting rule sets
5.1.4. Ruleset hooks
5.1.5. IPC mailers

The core of address parsing is the set of rewriting rules. These form an ordered production system. Sendmail scans through the set of rewriting rules looking for a match on the left hand side (LHS) of the rule. When a rule matches, the address is replaced by the right hand side (RHS) of the rule.

There are several sets of rewriting rules. Some of the rewriting sets are used internally and must have specific semantics. Other rewriting sets do not have specifically assigned semantics, and may be referenced by the mailer definitions or by other rewriting sets.

The syntax of these two commands is:

Sn

Sets the current ruleset being collected to n.

If you begin a ruleset more than once it appends to the old definition.

Rlhs <TAB> rhs <TAB> optional comment

The fields must be separated by at least one tab character; there may be embedded spaces in the fields. The lhs is a pattern that is applied to the input. If it matches, the input is rewritten to the rhs. The comments are ignored.
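As a minimal sketch of the two commands used together, the fragment below collects rules into one ruleset, starts a second ruleset, and then reopens the first, which appends to it. The ruleset names ExampleA and ExampleB and the rules themselves are hypothetical and chosen only for illustration. The whitespace between the fields of each R line must be tab characters.

```
# Start collecting rules for a ruleset (hypothetical name)
SExampleA
R$* < $+ > $*	$: $2	keep only the text inside angle brackets
R$+ @ $+	$: $1 < @ $2 >	mark the host part

# Start a different ruleset
SExampleB
R$*	$@ $1	return the input unchanged

# Reopening ExampleA appends to the rules collected above
SExampleA
R$+ < @ $+ >	$@ $1 @ $2	remove the marks again
```

After this fragment is read, ExampleA holds three rules in the order in which they appear, and ExampleB holds one.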

Macro expansions of the form $x are performed when the configuration file is read. A literal $ can be included using $$. Expansions of the form $&x are performed at run time using a somewhat less general algorithm. This is intended only for referencing internally defined macros such as $h that are changed at runtime.
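The difference between the two kinds of expansion can be sketched as follows. The macro X, its value, and the ruleset name ExampleExpand are assumptions made for this illustration; ${client_name} is used only as an example of a macro whose value changes at run time.

```
# Define macro X as the file is read (hypothetical name and value)
DXexample.com

SExampleExpand
# $X is replaced by its value once, while this line is being read,
# so the stored rule contains the literal text example.com
R$+ < @ $X >	$@ $1
# $&{client_name} is looked up every time the rule is evaluated
R$*	$: $1 $| $&{client_name}
```

Because $X is expanded when the file is read, the D line has to appear before the rule that uses it.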

5.1.1. The left hand side

The left hand side of rewriting rules contains a pattern. Normal words are simply matched directly. Metasyntax is introduced using a dollar sign. The metasymbols are:

symbol    description
$*        Match zero or more tokens
$+        Match one or more tokens
$-        Match exactly one token
$=x       Match any phrase in class x
$~x       Match any word not in class x

If any of these match, they are assigned to the symbol $n for replacement on the right hand side, where n is the index in the LHS. For example, if the LHS:

$-:$+

is applied to the input:

UCBARPA:eric

the rule will match, and the values passed to the RHS will be:

$1    UCBARPA
$2    eric

Additionally, the LHS can include $@ to match zero tokens. This is not bound to a $n on the RHS, and is normally only used when it stands alone in order to match the null input.
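The metasymbols above might be combined as in the following sketch. The ruleset name ExampleLHS and the rewritten forms are hypothetical; class w is used only because it also appears in an example later in this section.

```
SExampleLHS
# $@ standing alone matches only the empty input
R$@	$@ < @ >
# $- must match exactly one token, $+ one or more;
# UCBARPA:eric gives $1 = UCBARPA and $2 = eric
R$- : $+	$: $2 < @ $1 >
# $=w matches any phrase in class w; here it is bound to $2
R$+ < @ $=w >	$@ $1
# $~w matches one word that is not in class w
R$+ < @ $~w >	$@ $1 < @ $2 >
```

The index n in $n counts only the metasymbols, from left to right; literal tokens such as ":" or "@" are not counted.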

5.1.2. The right hand side

When the left hand side of a rewriting rule matches, the input is deleted and replaced by the right hand side. Tokens are copied directly from the RHS unless they begin with a dollar sign. Metasymbols are:

symbol                                description
$n                                    Substitute indefinite token n from LHS
$[name$]                              Canonicalize name
$(map key $@arguments $:default $)    Generalized keyed mapping function
$>n                                   "Call" ruleset n
$#mailer                              Resolve to mailer
$@host                                Specify host
$:user                                Specify user

The $n syntax substitutes the corresponding value from a $+, $-, $*, $=, or $~ match on the LHS. It may be used anywhere.

A host name enclosed between $[ and $] is looked up in the host database(s) and replaced by the canonical name[14]. For example,

"$[ftp$]" might become "ftp.CS.Berkeley.EDU" and

"$[[128.32.130.2]$]" would become "vangogh.CS.Berkeley.EDU".

Sendmail recognizes its numeric IP address without calling the name server and replaces it with its canonical name.
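A rule that canonicalizes only the host part of an address could look like the sketch below. It assumes that an earlier rule has already put the address into a "user < @ host >" form; that form and the ruleset name ExampleCanon are assumptions of this example.

```
SExampleCanon
# Replace the host part with its canonical name; the $: prefix
# makes the rule run once instead of being retried
R$* < @ $+ > $*	$: $1 < @ $[ $2 $] > $3
```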

The $( ... $) syntax is a more general form of lookup; it uses a named map instead of an implicit map. If no lookup is found, the indicated default is inserted; if no default is specified and no lookup matches, the value is left unchanged. The arguments are passed to the map for possible use.
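The following sketch shows a lookup with and without a default. The map name examplemap, the hash class, the file name, and the ruleset names are all hypothetical; the hash class is available only if sendmail was built with the corresponding database support.

```
# Declare a map (hypothetical name, class and file)
Kexamplemap hash /etc/mail/examplemap

SExampleMapDefault
# Look up the host; if it is not found, insert the token UNKNOWN
R$+ < @ $+ >	$: $1 < @ $(examplemap $2 $: UNKNOWN $) >

SExampleMapKeep
# With no $: default, a key that is not found is left unchanged
R$+ < @ $+ >	$: $1 < @ $(examplemap $2 $) >
```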

The $>n syntax expands everything after the ruleset name to the end of the replacement string as usual and then passes the result as the initial input to ruleset n. The final value of ruleset n then becomes the substitution for this rule. Recursive calls are allowed. For example,

$>0 $>3 $1

expands $1, passes that to ruleset 3, and then passes the result of ruleset 3 to ruleset 0.

The $# syntax should only be used in ruleset zero, a subroutine of ruleset zero, or rulesets that return decisions (e.g., check_rcpt). It causes evaluation of the ruleset to terminate immediately, and signals to sendmail that the address has completely resolved. The complete syntax for ruleset 0 is:

$#mailer $@host $:user

This specifies the {mailer, host, user} 3-tuple necessary to direct the mailer. If the mailer is local the host part may be omitted[15]. The mailer must be a single word, but the host and user may be multi-part. If the mailer is the built-in IPC mailer, the host may be a colon-separated list of hosts that are searched in order for the first working address (exactly like MX records). The user is later rewritten by the mailer-specific envelope rewriting set and assigned to the $u macro. As a special case, if the mailer specified has the F=@ flag specified and the first character of the $: value is "@", the "@" is stripped off, and a flag is set in the address descriptor that causes sendmail to not do ruleset 5 processing.
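As a sketch, two resolving rules in ruleset zero might read as follows. The mailer names local and esmtp are assumed to be defined by M lines elsewhere in the configuration file, and the "user < @ host >" input form is assumed to have been produced by earlier rules; neither is given by the text above.

```
S0
# Addresses at this host: the host part may be omitted for local
R$+ < @ $=w >	$#local $: $1
# Everything else: hand the message to the named host
R$+ < @ $+ >	$#esmtp $@ $2 $: $1 < @ $2 >
```

In the second rule the user part keeps the full address, since it is later rewritten by the mailer-specific envelope rewriting set.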

Normally, a rule that matches is retried, that is, the rule loops until it fails. A RHS may also be preceded by a $@ or a $: to change this behavior. A $@ prefix causes the ruleset to return with the remainder of the RHS as the value. A $: prefix causes the rule to terminate immediately, but the ruleset to continue; this can be used to avoid continued application of a rule. The prefix is stripped before continuing.

The $@ and $: prefixes may precede a $> spec; for example:

R$+     $: $>7 $1

matches anything, passes that to ruleset seven, and continues; the $: is necessary to avoid an infinite loop.
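The three behaviors can be compared side by side in a small sketch. The ruleset name ExamplePrefixes is hypothetical, and the example assumes that "." is tokenized as a separate token.

```
SExamplePrefixes
# No prefix: the rule is retried until it no longer matches,
# so trailing "." tokens are removed one at a time
R$+ .	$1
# $: prefix: rewrite once, then go on to the next rule;
# without it this rule would match its own output forever
R$+	$: < $1 >
# $@ prefix: rewrite and return from the ruleset immediately
R< $+ >	$@ $1
```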

Substitution occurs in the order described, that is, parameters from the LHS are substituted, hostnames are canonicalized, "subroutines" are called, and finally $#, $@, and $: are processed.

5.1.3. Semantics of rewriting rule sets

There are six rewriting sets that have specific semantics. Five of these are related as depicted by figure 1.

                        +---+
                     -->| 0 |-->resolved address
                    /   +---+
                   /            +---+   +---+
                  /        ---->| 1 |-->| S |--
           +---+ / +---+  /     +---+   +---+  \    +---+
    addr-->| 3 |-->| D |--                      --->| 4 |-->msg
           +---+   +---+  \     +---+   +---+  /    +---+
                            --->| 2 |-->| R |--
                                +---+   +---+


                             
                Figure 1 -- Rewriting set semantics

              D -- sender domain addition
              S -- mailer-specific sender rewriting
              R -- mailer-specific recipient rewriting

Ruleset three should turn the address into "canonical form." This form should have the basic syntax:

local-part@host-domain-spec

Ruleset three is applied by sendmail before doing anything with any address.

If no "@" sign is specified, then the host-domain-spec may be appended (box "D" in Figure 1) from the sender address (if the C flag is set in the mailer definition corresponding to the sending mailer).

Ruleset zero is applied after ruleset three to addresses that are going to actually specify recipients. It must resolve to a {mailer, host, address} triple. The mailer must be defined in the mailer definitions from the configuration file. The host is defined into the $h macro for use in the argv expansion of the specified mailer.

Rulesets one and two are applied to all sender and recipient addresses respectively. They are applied before any specification in the mailer definition. They must never resolve.

Ruleset four is applied to all addresses in the message. It is typically used to translate internal to external form.

In addition, ruleset 5 is applied to all local addresses (specifically, those that resolve to a mailer with the F=5 flag set) that do not have aliases. This allows a last minute hook for local names.

5.1.4. Ruleset hooks

5.1.4.1. check_relay
5.1.4.2. check_mail
5.1.4.3. check_rcpt
5.1.4.4. check_data
5.1.4.5. check_compat
5.1.4.6. check_eoh
5.1.4.7. check_etrn
5.1.4.8. check_expn
5.1.4.9. check_vrfy
5.1.4.10. trust_auth
5.1.4.11. tls_client
5.1.4.12. tls_server
5.1.4.13. tls_rcpt
5.1.4.14. srv_features
5.1.4.15. try_tls
5.1.4.16. authinfo
5.1.4.17. queuegroup

A few extra rulesets serve as "hooks" that can be defined to enable special features. They are all named rulesets. The "check_*" forms all give accept/reject status; falling off the end or returning normally is an accept, and resolving to $#error is a reject. Many of these can also resolve to the special mailer name $#discard; this accepts the message as though it were successful but then discards it without delivery. Note that this mailer cannot be chosen as a mailer in ruleset 0. Note also that all "check_*" rulesets must themselves deal with temporary failures, especially for map lookups; that is, they should return a temporary error code or at least make a proper decision in those cases.

5.1.4.1. check_relay

(See also cf/README: Anti-Spam Configuration Control)

The check_relay ruleset is called after a connection is accepted by the daemon. It is not called when sendmail is started using the -bs option. It is passed

client.host.name $| client.host.address

where $| is a metacharacter separating the two parts. This ruleset can reject connections from various locations. Note that it only checks the connecting SMTP client IP address and hostname. It does not check for third party message relaying. The check_rcpt ruleset discussed below usually does third party message relay checking.
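A hand-written check_relay might be sketched as follows. The address 192.0.2.1, the host name, and the error text are placeholders chosen for illustration. In a configuration generated from the m4 files this ruleset is normally already defined, so the fragment is meant as an illustration of the input format rather than something to paste in.

```
Scheck_relay
# Reject one specific client address, whatever its host name
R$* $| 192.0.2.1	$#error $: 550 Access denied
# Reject one specific client host name, whatever its address
Rbad.example.net $| $*	$#error $: 550 Access denied
# Anything else falls off the end, which accepts the connection
```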

5.1.4.2. check_mail

(See also cf/README: Anti-Spam Configuration Control)

The check_mail ruleset is passed the user name parameter of the SMTP MAIL command. It can accept or reject the address.

5.1.4.3. check_rcpt

(See also cf/README: Anti-Spam Configuration Control)

The check_rcpt ruleset is passed the user name parameter of the SMTP RCPT command. It can accept or reject the address.

5.1.4.4. check_data

The check_data ruleset is called after the SMTP DATA command; its parameter is the number of recipients. It can accept or reject the command.
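One possible use is to limit the number of recipients per message, sketched below with the arith map. The limit of 20 and the error text are hypothetical, and the example assumes that the arith map class is compiled in and that its "l" operation compares its two arguments with "less than".

```
# Declare the arith map class under the name "arith"
Karith arith

Scheck_data
# TRUE if 20 is less than the number of recipients
R$+	$: $(arith l $@ 20 $@ $1 $)
RTRUE	$#error $: 452 Too many recipients
# Anything else falls off the end and the DATA command is accepted
```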

5.1.4.5. check_compat

(See also cf/README: Anti-Spam Configuration Control)

The check_compat ruleset is passed

sender-address $| recipient-address

where $| is a metacharacter separating the addresses. It can accept or reject mail transfer between these two addresses much like the checkcompat() function.

5.1.4.6. check_eoh

(See also cf/README: Anti-Spam Configuration Control)

The check_eoh ruleset is passed

number-of-headers $| size-of-headers

where $| is a metacharacter separating the numbers. These numbers can be used for size comparisons with the arith map. The ruleset is triggered after all of the headers have been read. It can be used to correlate information gathered from those headers using the macro storage map. One possible use is to check for a missing header. For example:

Kstorage macro
HMessage-Id: $>CheckMessageId

SCheckMessageId
# Record the presence of the header
R$*            $: $(storage {MessageIdCheck} $@ OK $) $1
R< $+ @ $+ >   $@ OK
R$*            $#error $: 553 Header Error

Scheck_eoh
# Check the macro
R$*            $: < $&{MessageIdCheck} >
# Clear the macro for the next message
R$*            $: $(storage {MessageIdCheck} $) $1
# Has a Message-Id: header
R< $+ >        $@ OK
# Allow missing Message-Id: from local mail
R$*            $: < $&{client_name} >
R< >           $@ OK
R< $=w >       $@ OK
# Otherwise, reject the mail
R$*            $#error $: 553 Header Error

Keep in mind the Message-Id: header is not a required header and is not a guaranteed spam indicator. This ruleset is an example and should probably not be used in production.

5.1.4.7. check_etrn

The check_etrn ruleset is passed the parameter of the SMTP ETRN command. It can accept or reject the command.

5.1.4.8. check_expn

The check_expn ruleset is passed the user name parameter of the SMTP EXPN command. It can accept or reject the address.

5.1.4.9. check_vrfy

The check_vrfy ruleset is passed the user name parameter of the SMTP VRFY command. It can accept or reject the command.

5.1.4.10. trust_auth

(See also cf/README: SMTP AUTH)

The trust_auth ruleset is passed the AUTH= parameter of the SMTP MAIL command. It is used to determine whether this value should be trusted. In order to make this decision, the ruleset may make use of the various ${auth_*} macros. If the ruleset does resolve to the $#error mailer the AUTH= parameter is not trusted and hence not passed on to the next relay.

5.1.4.11. tls_client

(See also cf/README: STARTTLS, Adding New Mailers Or Rulesets)

The tls_client ruleset is called when sendmail acts as server, after a STARTTLS command has been issued, and from check_mail. The parameter is the value of ${verify} and STARTTLS or MAIL, respectively. If the ruleset does resolve to the $#error mailer, the appropriate error code is returned to the client.

5.1.4.12. tls_server

(See also cf/README: STARTTLS, Adding New Mailers Or Rulesets)

The tls_server ruleset is called when sendmail acts as client, after a STARTTLS command has (or should have) been issued. The parameter is the value of ${verify}. If the ruleset does resolve to the $#error mailer, the connection is aborted (treated as non-deliverable with a permanent or temporary error).

5.1.4.13. tls_rcpt

(See also cf/README: STARTTLS, Adding New Mailers Or Rulesets)

The tls_rcpt ruleset is called each time before a RCPT TO command is sent. The parameter is the current recipient. If the ruleset does resolve to the $#error mailer, the RCPT TO command is suppressed (treated as non-deliverable with a permanent or temporary error). This ruleset makes it possible to require encryption or verification of the recipient's MTA even if the mail is somehow redirected to another host. For example, sending mail to luke@endmail.org may get redirected to a host named death.star, and hence the tls_server ruleset won't apply. By introducing per-recipient restrictions, such attacks (e.g., via DNS spoofing) can be made impossible.

5.1.4.14. srv_features

(See also cf/README: STARTTLS, Adding New Mailers Or Rulesets)

The srv_features ruleset is called with the connecting client's host name when a client connects to sendmail. This ruleset should return $# followed by a list of options (single characters delimited by white space). If the return value starts with anything else it is silently ignored. Generally upper case characters turn off a feature while lower case characters turn it on.

The option `S' causes the server not to offer STARTTLS. This is useful when interacting with MTAs/MUAs that have broken STARTTLS implementations: the feature is simply not offered.
`V' turns off the request for a client certificate during the TLS handshake.
Options `A' and `P' suppress SMTP AUTH and PIPELINING, respectively.

The ruleset may return $#temp to indicate that there is a temporary problem determining the correct features, e.g., if a map is unavailable. In that case, the SMTP server issues a temporary failure and does not accept email.
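A hand-written srv_features ruleset could be sketched as follows. Both host names are hypothetical, and in a configuration generated from the m4 files this ruleset is normally already defined.

```
Ssrv_features
# Hypothetical client known to mishandle STARTTLS: do not offer it
Rlegacy.example.com	$# S
# Hypothetical client: no AUTH, no PIPELINING, no certificate request
Rscanner.example.com	$# A P V
# Any other client: the result does not start with $#,
# so it is silently ignored
R$*	$@ OK
```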

5.1.4.15. try_tls

The try_tls ruleset is called when sendmail connects to another MTA. If the ruleset does resolve to the $#error mailer, sendmail does not try STARTTLS even if it is offered. This is useful to interact with MTAs that have broken STARTTLS implementations by simply not using it.
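As a sketch, the following ruleset avoids STARTTLS with a single server. The host name and error text are hypothetical, and the example assumes that the ${server_name} macro holds the name of the server being contacted.

```
Stry_tls
# Replace the input with the name of the server being contacted
R$*	$: $&{server_name}
# Hypothetical server with a broken STARTTLS implementation
Rbroken.example.org	$#error $: 550 do not try TLS with this host
# Anything else falls off the end and STARTTLS is tried as usual
```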

5.1.4.16. authinfo

(See also cf/README: SMTP AUTH)

The authinfo ruleset is called when sendmail tries to authenticate to another MTA. It should return $# followed by a list of tokens that are used for SMTP AUTH. If the return value starts with anything else it is silently ignored. Each token is a tagged string of the form:

"TDstring" (including the quotes), where

tag       description
T         Tag which describes the item
D         Delimiter: ':' simple text follows
                     '=' string is base64 encoded
string    Value of the item

Valid values for the tag are:

value    description
U        user (authorization) id
I        authentication id
P        password
R        realm
M        list of mechanisms delimited by spaces

If this ruleset is defined, the option DefaultAuthInfo is ignored (even if the ruleset does not return a "useful" result).

5.1.4.17. queuegroup

(See also cf/README: FEATURE(`queuegroup'))

The queuegroup ruleset is used to map a recipient address to a queue group name. The input for the ruleset is a recipient address as specified by the "SMTP RCPT" command. The ruleset should return $# followed by the name of a queue group. If the return value starts with anything else it is silently ignored. See the section about QueueGroups and Queue Directories for further information.
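A minimal sketch of such a mapping is shown below. The domain and the queue group name slowq are hypothetical, and the queue group is assumed to have been declared elsewhere in the configuration file.

```
Squeuegroup
# Recipients at a hypothetical slow domain go to queue group "slowq";
# the $* on either side allows for surrounding angle brackets
R$* @ slow.example.com $*	$# slowq
# Other recipients: nothing is returned with $#,
# so the return value is ignored
```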

5.1.5. IPC mailers

Some special processing occurs if ruleset zero resolves to an IPC mailer (that is, a mailer that has "[IPC]" listed as the Path in the M configuration line). The host name passed after $@ has MX expansion performed if not delivering via a named socket; this looks the name up in DNS to find alternate delivery sites.

The host name can also be provided as a dotted quad or an IPv6 address in square brackets; for example:

[128.32.149.78]

or

[IPv6:2002:c0a8:51d2::23f4]

This causes direct conversion of the numeric value to an IP host address.

The host name passed in after the $@ may also be a colon-separated list of hosts. Each is separately MX expanded and the results are concatenated to make (essentially) one long MX list. The intent here is to create fake MX records that are not published in DNS for private internal networks.

As a final special case, the host name can be passed in as a text string in square brackets:

[ucbvax.berkeley.edu]

This form avoids the MX mapping.

N.B.: This is intended only for situations where you have a network firewall or other host that will do special processing for all your mail, so that your MX record points to a gateway machine; this machine could then do direct delivery to machines within your local domain. Use of this feature directly violates RFC 1123 section 5.3.5: it should not be used lightly.
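The host forms described in this section can be put side by side in a sketch. The mailer name esmtp is assumed to be defined elsewhere with "[IPC]" as its Path, and all host names, the address 192.0.2.25, and the "user < @ host >" input form are placeholders chosen for illustration.

```
S0
# Plain host name: MX expansion is performed
R$+ < @ a.example.com >	$#esmtp $@ a.example.com $: $1 < @ a.example.com >
# Colon-separated list: each host is MX expanded separately
# and the results are concatenated
R$+ < @ b.example.com >	$#esmtp $@ mx1.example.net:mx2.example.net $: $1 < @ b.example.com >
# Numeric address in brackets: converted directly to an IP address
R$+ < @ c.example.com >	$#esmtp $@ [192.0.2.25] $: $1 < @ c.example.com >
# Host name in brackets: the MX mapping is avoided
R$+ < @ d.example.com >	$#esmtp $@ [gateway.example.com] $: $1 < @ d.example.com >
```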