Re: Postfix problems when system spool has files

From: Quanah Gibson-Mount (no email)
Date: Wed Feb 25 2009 - 16:38:43 EST

  • Next message: LuKreme: "Moving from uw-imap and Courier to just Courier"

    --On Tuesday, February 24, 2009 9:26 AM -0500 Wietse Venema
    <> wrote:

    >> Further investigation tracks this down to something failing with DNS
    >> resolution after a while. Don't know why, but it does seem to be a
    >> problem with OS X and catastrophic failure.
    >
    > Since I don't maintain copies of every Postfix-enabled platform (*)
    > I will rely on you to provide accurate observations.
    >
    > Wietse
    >
    > (*) I have a couple representative platforms running in VMware, but
    > that is only for testing my own Postfix distribution.

    I'm now convinced this is an OSX 10.5 bug and not a postfix bug, but
    hopefully this can help others if they ever run into it. I don't have a
    solution yet. Here are the gory details. Two clients have had this occur
    in different circumstances, but in both cases OSX was forced to go down
    uncleanly.

    For Client A, it started after they had a power outage. For Client B, it
    happened after they had a HD failure. I don't know how Client B recovered
    the failed HD. In both cases, once postfix has been running for a while
    after the failure, it starts complaining that it can't do startTLS
    operations to LDAP. In addition, mail files start showing up in
    /var/spool/postfix/maildrop. Further investigation revealed that these
    mail files are being generated by sudo. The same sudo command never
    generated them before these servers crashed.

    I was finally able to get access to Client B's server while the startTLS
    failures were occurring. At that point I turned the debuglevel up to 7 in
    the LDAP map file postfix was attempting to use. This resulted in the
    following being logged:

    ldap_connect_to_host: getaddrinfo failed: Temporary failure in name
    resolution

    I then disabled startTLS and verified that connections still failed with
    the same issue. I.e., startTLS was never the problem (which is good. :P ).

    Further examination of the system logs showed that other processes were
    also having problems resolving the host via DNS:

    auth failed: curl_easy_perform: error(6): Couldn't resolve host 'domain.com'

    In both cases, the host in question is the local system, which has the
    correct entries in /etc/hosts, and the nslookup, dig, and host commands
    all worked fine for me when run as several different users.
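
    A minimal sketch for testing the same code path that the log lines
    above report as failing. It assumes (this is not established in the
    thread) that nslookup, dig, and host query DNS servers directly, while
    the LDAP and curl clients go through the system resolver via
    getaddrinfo(3). The function name check_resolution and the default
    port 389 are illustrative choices, not anything taken from postfix.

```python
import socket


def check_resolution(host, port=389):
    """Resolve `host` through the system resolver, as getaddrinfo(3) callers do.

    Returns (True, [addresses]) on success, or (False, error text) on failure.
    The port only shapes the returned socket addresses; nothing is connected.
    """
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        # "Temporary failure in name resolution" would surface here.
        return False, str(exc)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    addresses = sorted({info[4][0] for info in infos})
    return True, addresses
```

    Calling check_resolution() with the server's own hostname from a
    long-running process, and comparing the result against dig at the same
    moment, would show whether only the system resolver path is affected.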

    The files being generated by sudo show that it is failing to find users
    that don't exist in /etc/passwd (which, for OSX, is all users except the
    ones created by Apple for system use):

    T1235430448 195461Arewrite_context=localFSystem AdministratorSrootMTo:
    rootN
    From: 502N:Subject: *** SECURITY information for domain.com
    ***NNdomain.com : Feb 23 23:07:28 : 2 : uid 502 does not exist in the
    passwd file! ; TTY=unknown ; PWD=unknown ; USER=root ;
    COMMAND=/opt/zimbra/libexec/zmmtastatusNXRrootE
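
    The uid lookup that sudo complains about can be checked independently.
    This is a rough sketch, assuming sudo's check goes through the same
    getpwuid(3) path that Python's pwd module uses; on OSX that lookup is
    expected to consult the system's directory services and not just
    /etc/passwd. The function name lookup_uid is hypothetical.

```python
import pwd


def lookup_uid(uid):
    """Return the login name for `uid` via getpwuid(3), or None if not found."""
    try:
        entry = pwd.getpwuid(uid)
    except KeyError:
        # Raised when the system's user database has no entry for this uid,
        # the same condition sudo reports as "does not exist in the passwd file".
        return None
    return entry.pw_name
```

    If lookup_uid(502) returns a name right after postfix is restarted and
    None a few hours later, that would point at the user lookup service and
    not at sudo or postfix.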

    Apparently this has bitten other people:

    <http://discussions.apple.com/thread.jspa?threadID=1527762&tstart=421>

    If we ever get a solution from Apple, I will update further.

    It is interesting to note that stopping/restarting postfix resolves the
    issue for a few hours. Then it recurs and persists until postfix is
    restarted again.

    --Quanah

    --
    Quanah Gibson-Mount
    Principal Software Engineer
    Zimbra, Inc
    --------------------
    Zimbra ::  the leader in open source messaging and collaboration
    
