From: Peter Rabbitson (no email)
Date: Sat May 05 2007 - 20:46:00 EDT
Wietse Venema wrote:
> Peter Rabbitson:
>> Wietse Venema wrote:
>>> Peter Rabbitson:
>>>> I am experiencing 421 errors between my secondary and primary MXes, and
>>>> it seems it is caused by a lack of connection caching.
>>> What is the error message?
>> There is no error message as such, see below.
> Any reasonable SMTP server sends 421 followed by some text that
> explains why it hangs up.
I apologize, I thought that 421 in the MTA world was as self-explanatory
as, say, 403 in the HTTP world.
> Having looked at the text below, I think your problem is that
> you are making an insane number of SIMULTANEOUS connections
> to the primary MX host.
This is correct.
>>>> fails to explain what a "high volume of mail in the active queue" is.
>>>> When exactly is connection caching activated?
>>> Roughly, it is activated when the active queue contains another
>>> message before the current delivery is completed.
>> If the primary MX is down for an extended period of time and a large
>> queue accumulates on the backup, all messages are rushed to the primary
>> MX in what seem to be separate SMTP connections. At least I was able to
>> count as many smtp processes in `ps` as 2/3 of the number of queued
>> messages, right after I issued `postfix flush`. If I specify explicit
>> caching for the particular mx host, things work as expected. I guess
>> there is not enough time for the caching on demand to activate when
>> doing a flush or having enough queued messages to simulate one.
> This is not a surprise.
> If the number of SIMULTANEOUS connections is 2/3 the number of
> queued messages, then most connections will never be reused because
> the mail is already delivered.
> I suggest that you revert to no more than 10-20 SIMULTANEOUS
> connections to the primary MX (or to any machine).
I never changed the defaults for those (the postconf -n output follows at
the end of the message).
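For reference, a sketch of the settings this refers to, as they would look
if written out in main.cf. The values are what the stock defaults are
believed to be (an assumption; `postconf -d` shows the real ones for a given
install), and none of these lines appear in the postconf -n output at the
end of this message. Mail for relay_domains normally goes through the
"relay" transport, so the relay_ variant is probably the one that matters
here:

```
# /etc/postfix/main.cf (illustrative only; assumed stock defaults)
default_destination_concurrency_limit = 20
smtp_destination_concurrency_limit = $default_destination_concurrency_limit
relay_destination_concurrency_limit = $default_destination_concurrency_limit
# on-demand connection caching is believed to be enabled by default
smtp_connection_cache_on_demand = yes
```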
> If you do that, not only will the primary MX perform better, you
> will also see connection reuse happen automatically.
I did more testing with an explicit smtp_connection_cache_destinations
setting and still saw the same behavior. Rereading the documentation for the
n-th time, I noticed the following in several places:
(in reference to *_destination_recipient_limit)
Setting this parameter to a value of 1 changes the meaning of
*_destination_concurrency_limit from concurrency per domain into
concurrency per recipient.
Does this by chance mean that *_destination_concurrency_limit refers to
individual _domains_ and not individual MTAs? I am relaying mail for 6
domains, all having the same primary MX (which is the one getting badly
hammered after being down for a while).
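If the limit really is counted per destination domain, then six relayed
domains at an assumed default limit of 20 could add up to as many as
6 x 20 = 120 simultaneous connections to the one primary MX. One possible
way around that, sketched here under that assumption, is to route all
relayed domains to a single next hop so that they share one concurrency
limit. The domain names and the MX host name below are placeholders, not
the real ones, and the limit of 10 is just a value from the suggested
10-20 range:

```
# /etc/postfix/transport (placeholder names)
# run "postmap /etc/postfix/transport" after editing
relayed-one.example    relay:[mx-primary.example]
relayed-two.example    relay:[mx-primary.example]
# ... one line per relayed domain, all pointing at the same next hop

# /etc/postfix/main.cf
transport_maps = hash:/etc/postfix/transport
relay_destination_concurrency_limit = 10
```

The square brackets tell Postfix to skip the MX lookup and connect to the
named host directly, so the backup would no longer fall back to other MX
records for these domains.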
Thanks for the help
Arx:/etc/postfix# postconf -n
address_verify_map = btree:/var/cache/postfix/verify.db
address_verify_negative_cache = yes
address_verify_negative_expire_time = 1d
address_verify_negative_refresh_time = 1h
address_verify_poll_count = 2
address_verify_poll_delay = 2s
address_verify_positive_expire_time = 31d
address_verify_positive_refresh_time = 7d
alias_database = $alias_maps
alias_maps = hash:/etc/aliases
append_dot_mydomain = no
backwards_bounce_logfile_compatibility = no
biff = no
bounce_queue_lifetime = 12h
bounce_size_limit = 20000
config_directory = /etc/postfix
hash_queue_depth = 1
hash_queue_names = ''
in_flow_delay = 0
inet_interfaces = all
inet_protocols = ipv4
mailbox_command = procmail -a "$EXTENSION"
mailbox_size_limit = 0
maximal_queue_lifetime = 7d
message_size_limit = 0
minimal_backoff_time = 15m
mydestination = $mydomain, localhost.$mydomain, localhost
mydomain = rabbit.us
myhostname = arx.rabbit.us
mynetworks = 127.0.0.0/8 192.168.13.0/24 10.0.13.0/24 $inet_interfaces
myorigin = $mydomain
queue_directory = /var/spool/postfix
queue_minfree = 1000000
recipient_delimiter = +
relay_domains = <6 relayed domains withheld, all with same primary MX>
smtp_bind_address = 18.104.22.168
smtp_connect_timeout = 5s
smtp_connection_reuse_time_limit = 5m
smtp_helo_timeout = 1m
smtp_mail_timeout = 1m
smtp_mx_address_limit = 0
smtp_quit_timeout = 10s
smtp_skip_quit_response = yes
smtpd_authorized_verp_clients = $mynetworks
smtpd_banner = $myhostname ESMTP $mail_name (Debian/GNU)
smtpd_delay_reject = yes
smtpd_error_sleep_time = 3s
smtpd_hard_error_limit = 20
smtpd_junk_command_limit = 20
smtpd_recipient_limit = 200
smtpd_recipient_restrictions = permit_mynetworks
smtpd_sender_restrictions = reject_unknown_sender_domain
smtpd_soft_error_limit = 5
smtpd_timeout = 30s
syslog_name = postfix