Re: Slow lmtpd

From: Andre Nathan (no email)
Date: Mon Mar 05 2007 - 10:19:24 EST

    On Sat, 2007-03-03 at 14:23 +1100, Rob Mueller wrote:
    > %util - Percentage of CPU time during which I/O requests were issued to the
    > device (bandwidth utilization for the device). Device saturation occurs when
    > this value is close to 100%.

    Can values way above 100% be trusted? If so, it's pretty bad (this is
    from a situation where there are 200 lmtp processes, which is the
    current limit I set):

    avg-cpu:  %user  %nice  %system  %iowait  %idle
               2.53   0.00     5.26    89.98   2.23

    Device:       rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s   rkB/s   wkB/s
    etherd/e0.0     0.00    0.00    5.87  235.02  225.10 2513.77  112.55 1256.88

    Device:       avgrq-sz  avgqu-sz     await     svctm     %util
    etherd/e0.0      11.37      0.00    750.32    750.32  18074.51

    avg-cpu:  %user  %nice  %system  %iowait  %idle
               1.72   0.00     3.73    94.45   0.10

    Device:       rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s   rkB/s   wkB/s
    etherd/e0.0     0.00    0.00    4.44  140.73  317.74 1125.00  158.87  562.50

    Device:       avgrq-sz  avgqu-sz     await     svctm     %util
    etherd/e0.0       9.94      0.00   2500.46   2500.46  36296.94
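
    One way to sanity-check the %util figures above is to recompute them
    from the other columns. The sketch below assumes (this is not confirmed
    anywhere in this thread) that this iostat version ties svctm and %util
    together as %util = (r/s + w/s) * svctm / 10, with svctm in
    milliseconds. The function name implied_util is made up for
    illustration. If the assumption holds, the values above 100% would
    simply mirror the very large svctm values rather than an independently
    measured busy time, so they should probably not be read as a real
    utilization percentage.

```python
def implied_util(reads_per_s, writes_per_s, svctm_ms):
    """Return the %util implied by the I/O rate and the service time.

    Assumed relationship: (I/Os per second) * (ms per I/O) gives ms of
    device time per second; dividing by 10 converts that to a percentage.
    """
    ios_per_s = reads_per_s + writes_per_s
    return ios_per_s * svctm_ms / 10.0


# (r/s, w/s, svctm, reported %util) copied from the two samples above.
samples = [
    (5.87, 235.02, 750.32, 18074.51),
    (4.44, 140.73, 2500.46, 36296.94),
]

for reads, writes, svctm, reported in samples:
    implied = implied_util(reads, writes, svctm)
    print(f"reported %util {reported:.2f}, implied %util {implied:.2f}")
```

    Both samples agree with the assumed formula to within rounding, which
    suggests %util and svctm here are two views of the same number. Note
    also that avgqu-sz is 0.00 and await equals svctm in both samples.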

    > The other thing of interest would be the load on the machine, and processes
    > in D state.

    Load average tends to get really high. It starts climbing quickly once
    the number of lmtpd processes reaches the limit set in cyrus.conf,
    and can easily reach 150 or 200. The problem becomes most noticeable
    when our MTAs run their deferred queues. We have around a dozen MTAs,
    and when they all run their queues, the number of connections to lmtpd
    increases. While these deliveries are very quick on our other mailbox
    servers, on this one they take a long time to finish. Most of the time
    I have to restart cyrus, because the number of processes never drops
    again, and so connections start being refused. The differences between
    the two kinds of servers are:

    - The ones that don't have the problem use local disks instead of AoE
    - The ones that don't have the problem are limited to 2000 domains
    (around 8000 accounts), while the one using the AoE storage serves 4000
    domains (around 20000 accounts).

    Anyone running cyrus with that many accounts?

    > ps auxw | grep -v ' S'

    root 1743 0.0 0.0 0 0 ? D Mar01 0:05 [xfssyncd]
    root 3116 0.0 0.0 0 0 ? D Mar01 0:01 [xfssyncd]
    cyrus 15593 0.0 0.3 36288 13660 ? D 11:48 0:00 imapd
    cyrus 16360 0.0 0.3 37752 14360 ? D 11:54 0:00 imapd
    cyrus 17161 0.0 0.3 36304 13648 ? D 11:59 0:00 imapd
    cyrus 17182 0.0 0.0 120736 3268 ? D 12:00 0:00 lmtpd
    cyrus 17891 0.0 0.0 120872 3108 ? D 12:04 0:00 lmtpd
    cyrus 17897 0.0 0.0 120696 3312 ? D 12:04 0:00 lmtpd
    cyrus 18265 0.0 0.0 120896 3540 ? D 12:07 0:00 lmtpd
    cyrus 18302 0.0 0.0 120760 3432 ? D 12:07 0:00 lmtpd
    cyrus 18336 0.0 0.0 120720 2684 ? D 12:07 0:00 lmtpd
    cyrus 18441 0.0 0.0 120684 2944 ? D 12:08 0:00 lmtpd
    cyrus 18590 0.0 0.0 120920 3156 ? D 12:09 0:00 lmtpd
    cyrus 18591 0.0 0.0 120724 2584 ? D 12:09 0:00 lmtpd
    cyrus 18592 0.0 0.0 121332 2796 ? D 12:09 0:00 lmtpd
    cyrus 18612 0.0 0.0 120716 3224 ? D 12:09 0:00 lmtpd
    cyrus 18613 0.0 0.0 120716 3140 ? D 12:09 0:00 lmtpd
    cyrus 18632 0.0 0.0 120696 3072 ? D 12:09 0:00 lmtpd
    cyrus 18641 0.0 0.0 120676 2864 ? D 12:09 0:00 lmtpd
    cyrus 18643 0.0 0.0 120720 2696 ? D 12:09 0:00 lmtpd
    cyrus 18656 0.0 0.0 120692 3340 ? D 12:09 0:00 lmtpd
    cyrus 18657 0.0 0.0 120676 2996 ? D 12:09 0:00 lmtpd
    cyrus 18658 0.0 0.0 120716 2804 ? D 12:09 0:00 lmtpd
    cyrus 18669 0.0 0.0 120680 2812 ? D 12:09 0:00 lmtpd
    cyrus 18671 0.0 0.0 120716 2712 ? D 12:09 0:00 lmtpd
    cyrus 18939 0.0 0.0 120692 2732 ? D 12:11 0:00 lmtpd
    cyrus 18941 0.0 0.0 120716 3148 ? D 12:11 0:00 lmtpd
    cyrus 18942 0.0 0.0 120752 2924 ? D 12:11 0:00 lmtpd
    cyrus 18944 0.0 0.0 120704 2612 ? D 12:11 0:00 lmtpd
    cyrus 18947 0.0 0.0 120688 2676 ? D 12:11 0:00 lmtpd
    cyrus 18948 0.0 0.0 120688 2336 ? D 12:11 0:00 lmtpd
    cyrus 18950 0.0 0.0 120684 2920 ? D 12:11 0:00 lmtpd
    cyrus 18951 0.0 0.0 124080 2764 ? D 12:11 0:00 lmtpd
    cyrus 18978 0.0 0.0 120712 3304 ? D 12:11 0:00 lmtpd
    cyrus 18979 0.0 0.0 120740 2872 ? D 12:11 0:00 lmtpd
    cyrus 19014 0.0 0.0 120712 2656 ? D 12:11 0:00 lmtpd
    cyrus 19016 0.0 0.0 120708 2880 ? D 12:11 0:00 lmtpd
    cyrus 19089 0.0 0.0 120692 2596 ? D 12:12 0:00 lmtpd
    cyrus 19123 0.0 0.3 36240 13540 ? D 12:12 0:00 imapd
    cyrus 19153 0.0 0.0 38012 3076 ? D 12:12 0:00 pop3d
    cyrus 19179 0.0 0.0 120812 2660 ? D 12:12 0:00 lmtpd
    cyrus 19183 0.0 0.0 120712 2924 ? D 12:12 0:00 lmtpd
    cyrus 19199 0.0 0.0 120696 2644 ? D 12:12 0:00 lmtpd
    cyrus 19200 0.0 0.0 120712 3236 ? D 12:12 0:00 lmtpd
    cyrus 19201 0.0 0.0 120692 2668 ? D 12:12 0:00 lmtpd
    cyrus 19263 0.0 0.0 122076 2836 ? D 12:13 0:00 lmtpd
    cyrus 19292 0.0 0.0 120712 2672 ? D 12:13 0:00 lmtpd
    cyrus 19298 0.0 0.0 121168 2764 ? D 12:13 0:00 lmtpd
    cyrus 19329 0.0 0.0 120796 2716 ? D 12:13 0:00 lmtpd
    cyrus 19338 0.0 0.0 120696 2524 ? D 12:13 0:00 lmtpd
    cyrus 19344 0.0 0.0 36536 3308 ? D 12:13 0:00 imapd
    cyrus 19372 0.0 0.0 121688 2640 ? D 12:13 0:00 lmtpd
    cyrus 20020 0.0 0.0 35940 2952 ? D 12:17 0:00 pop3d
    cyrus 20495 0.0 0.0 35936 2488 ? D 12:20 0:00 pop3d
    root 20629 0.0 0.0 2764 820 pts/0 R+ 12:21 0:00 ps auxw
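
    A listing like the one above is easier to compare across runs when it
    is reduced to a count per command. The following is a minimal sketch,
    assuming the standard eleven-column `ps auxw` layout (USER, PID, %CPU,
    %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND) with each process on
    a single line. The function name count_uninterruptible and the sample
    list are made up for illustration; the sample rows are copied from the
    output above.

```python
from collections import Counter


def count_uninterruptible(ps_lines):
    """Count processes in D (uninterruptible sleep) state per command."""
    counts = Counter()
    for line in ps_lines:
        # Split into at most 11 fields so COMMAND keeps its arguments.
        fields = line.split(None, 10)
        if len(fields) < 11 or fields[0] == "USER":
            continue  # skip the header row and malformed lines
        stat, command = fields[7], fields[10]
        if stat.startswith("D"):
            counts[command.split()[0]] += 1
    return counts


# A few rows taken from the listing above.
sample = [
    "root 1743 0.0 0.0 0 0 ? D Mar01 0:05 [xfssyncd]",
    "cyrus 15593 0.0 0.3 36288 13660 ? D 11:48 0:00 imapd",
    "cyrus 17182 0.0 0.0 120736 3268 ? D 12:00 0:00 lmtpd",
    "cyrus 17891 0.0 0.0 120872 3108 ? D 12:04 0:00 lmtpd",
    "cyrus 19153 0.0 0.0 38012 3076 ? D 12:12 0:00 pop3d",
    "root 20629 0.0 0.0 2764 820 pts/0 R+ 12:21 0:00 ps auxw",
]

for command, count in count_uninterruptible(sample).most_common():
    print(command, count)
```

    Feeding it the full output of `ps auxw` at intervals would show whether
    the lmtpd count in D state keeps growing or drains between queue runs.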

    Thanks for the help,
    Andre

    ----
    Cyrus Home Page: http://cyrusimap.web.cmu.edu/
    Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
    List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
    
