Re: cyrus murder, mupdate sucking up CPU

From: Patrick Radtke (no email)
Date: Tue Mar 07 2006 - 10:38:05 EST

    We have the same/similar problems with mupdate on RHEL4.

    Our problem usually shows up when we are creating new users or when
    users are creating new mailboxes. Mailbox creation may hang or become
    extremely slow (and eventually start hanging). This seems to be
    linked to a frontend restarting and synching its mailbox list.

    Mupdate uses 99% of the CPU while apparently doing nothing. If we
    attach strace -f -p, the process does go idle, but it also stops
    doing anything at all (nothing is logged to the log files from that
    point on).
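Since attaching strace appears to change mupdate's behaviour, it can help to look at per-thread CPU time without attaching a tracer. The following is only a minimal sketch, assuming a Linux /proc filesystem; the function name thread_cpu_ticks is hypothetical and is not part of Cyrus.

```shell
#!/bin/sh
# Sketch: print "<thread id> <cumulative CPU ticks>" for each thread of a
# process by reading /proc, so that no tracer is attached to the process.
# Assumption: Linux /proc layout; thread_cpu_ticks is a hypothetical helper.
thread_cpu_ticks() {
    pid="$1"
    for task in /proc/"$pid"/task/*; do
        [ -r "$task/stat" ] || continue
        # Drop "pid (comm) " so the remaining fields can be split on spaces.
        rest=$(sed 's/^.*) //' "$task/stat" 2>/dev/null) || continue
        [ -n "$rest" ] || continue
        set -- $rest
        # After the strip, field 3 (state) is $1, so utime (field 14) is
        # ${12} and stime (field 15) is ${13}.
        echo "${task##*/} $(( ${12} + ${13} ))"
    done
}
```

Running `thread_cpu_ticks <mupdate pid>` twice, a few seconds apart, and comparing the tick counts should show which thread is accumulating CPU time.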

    If we restart the murder master, all our frontends (10) and
    backends (14) reconnect and the murder master starts dropping
    connections; the frontends then connect again and get disconnected
    again (and so on). We're still investigating this one. The
    worker thread count keeps increasing as the frontends keep
    reconnecting. It seems our only way to restart the murder master is
    to use iptables to block connections from the backends and then
    slowly re-allow connections once the frontends have re-synched. It
    appears that frontends re-synching and backends creating mailboxes at
    the same time do not get along in our setup.
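The iptables workaround described above could be sketched roughly as follows. This is an illustration, not our exact rules: it assumes mupdate listens on its registered TCP port 3905, the helper names are hypothetical, and the backend address in the usage note is a documentation placeholder. By default the helpers only print the commands; set DRY_RUN=0 to actually run them (as root).

```shell
#!/bin/sh
# Sketch: temporarily reject backend connections to the mupdate master.
# Assumptions: mupdate on TCP port 3905; block_backend, allow_backend and
# run are hypothetical helpers, not part of Cyrus.
MUPDATE_PORT=3905

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Insert a rule rejecting mupdate connections from one backend address.
block_backend() {
    run iptables -I INPUT -p tcp -s "$1" --dport "$MUPDATE_PORT" -j REJECT
}

# Delete that same rule again, letting the backend reconnect.
allow_backend() {
    run iptables -D INPUT -p tcp -s "$1" --dport "$MUPDATE_PORT" -j REJECT
}
```

For example, call `block_backend 192.0.2.21` for every backend before restarting the master, wait until the frontends have re-synched, and then call `allow_backend` for one backend at a time with a pause in between.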

    -Patrick

    On Mar 3, 2006, at 2:53 PM, Aleksandar Milivojevic wrote:

    > I asked about this problem earlier while trying out version
    > 2.3.1. I've just compiled 2.3.3 (Simon's SRPM package) and I'm still
    > having the same problem. This is a show stopper for me for
    > upgrading from 2.2 to 2.3.
    >
    > The problem is that the mupdate process sucks up all the CPU cycles
    > it can get.
    >
    > Now for the weird stuff.
    >
    > Running strace -p 3990 (3990 being the PID of the mupdate process)
    > just shows it waiting in the accept system call.
    >
    > However, running strace -f -p 3990 showed this:
    >
    > [pid 3995] clock_gettime(CLOCK_REALTIME, <unfinished ...>
    > [pid 3998] futex(0x8122134, FUTEX_WAKE, 1 <unfinished ...>
    > [pid 3995] <... clock_gettime resumed> {1141412737, 901972000}) = 0
    > [pid 3994] <... futex resumed> ) = 0
    > [pid 3998] <... futex resumed> ) = 1
    > [pid 3995] futex(0x8119fe0, FUTEX_WAKE, 1 <unfinished ...>
    > [pid 3994] futex(0x8122134, FUTEX_WAKE, 1 <unfinished ...>
    > [pid 3998] gettimeofday( <unfinished ...>
    > [pid 3995] <... futex resumed> ) = 0
    > [pid 3994] <... futex resumed> ) = 0
    > [pid 3998] <... gettimeofday resumed> {1141412737, 902155}, NULL) = 0
    > [pid 3995] futex(0x8119fe4, FUTEX_WAIT, -106641967, {59, 994760000} <unfinished ...>
    > [pid 3994] time( <unfinished ...>
    > [pid 3998] clock_gettime(CLOCK_REALTIME, <unfinished ...>
    > [pid 3995] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
    > [pid 3994] <... time resumed> NULL) = 1141412737
    > [pid 3998] <... clock_gettime resumed> {1141412737, 902307000}) = 0
    > [pid 3995] futex(0x8119fe0, FUTEX_WAIT, 2, NULL <unfinished ...>
    > [pid 3994] select(7, [6], NULL, NULL, {0, 0} <unfinished ...>
    > [pid 3992] <... clock_gettime resumed> {1141412737, 903913000}) = 0
    >
    > Now the strange thing: after I exit strace, mupdate starts to
    > behave and goes idle. Attaching to it again with strace still
    > shows the same output, but it is consuming almost no CPU
    > cycles. However, it is still huge, around 170MB.
    >
    > Even stranger, if I restart it (stop Cyrus, start it
    > again), the new mupdate process also seems to work OK!? If I reboot
    > the system, I get the same problem again.
    >
    > Could it be that I'm hitting a bug somewhere else in the system
    > (like kernel)? Is anybody else running Cyrus 2.3.x in murder
    > configuration on CentOS4 or RHEL4 (update 2)?
    >
    > ----
    > Cyrus Home Page: http://asg.web.cmu.edu/cyrus
    > Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
    > List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
    >

