Replication is broken with modseq issue in 2.3.6

From: Bron Gondwana (no email)
Date: Wed Jul 05 2006 - 02:54:18 EDT

  • Next message: Andreas Hasenack: "Re: Mailstore filesystem"

    If I sound a little bitter, it's because I was up until 5:30am the other
    night after a hardware failure left us with corrupted filesystems on our
    master server, and fetching old messages from the replica returned
    blank responses. We eventually discovered that reconstruct could fix
    it, and 36(!!) hours later, all of our couple of terabytes of mailboxes
    have finished reconstructing and our users have largely stopped yelling
    at us.

    Unsurprisingly for those of us who lived through 2.3.4 and 2.3.5, it
    was related to the modseq field of the index file. Yet more proof that
    mixing unfinished new features with security updates in the 'stable'
    line is bad for your blood pressure.

    If you're running replication on any Cyrus later than 2.3.3, your
    replica contains indexes with '0' as the modseq value. This means that
    your messages will not be fetchable until you either fix that value or
    patch Cyrus. Running reconstruct from 2.3.6 or later is one option for
    fixing it.

    otherwise ...

    I've spent a fair bit of today tracking down and testing the location
    in the code where that issue was being caused. See the attached
    'cyrus-modseqrepl-cvs.diff' file.

    CAVEATS:
    a) won't fix any already replicated messages
    b) if you're using CONDSTORE, this won't replicate the actual modseq,
       it just sets it to '1' on the replica. I considered trying to
       replicate the actual modseq value, but it looks like it requires
       changing the replication wire protocol, and that's a can of worms
       I'm really not interested in diving into. Someone who's actually
       using CONDSTORE is in a lot better place to see what's needed and
       actually test it!
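
    To make caveat (b) concrete, here is a rough sketch of what "just sets
    it to '1' on the replica" amounts to. This is not the contents of
    cyrus-modseqrepl-cvs.diff: the struct, its fields, and the function
    name are hypothetical stand-ins, and where the real patch hooks in is
    not shown here.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real Cyrus types. */
typedef uint64_t modseq_t;

struct index_record_sketch {
    uint32_t uid;      /* message UID */
    modseq_t modseq;   /* 0 in a freshly zeroed record */
};

/*
 * Stamp a placeholder modseq on a record created on the replica.
 * The master's real modseq is not carried over the replication
 * protocol, so a constant non-zero value is used instead.
 */
static void stamp_replica_modseq(struct index_record_sketch *rec)
{
    rec->modseq = 1;
}
```

    The record stops carrying the problematic '0', but every replicated
    message ends up with the same modseq, which is why CONDSTORE queries
    on the replica cannot match the master.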

    That still sort of sucks, doesn't it? Mainly due to (a).

    Then inspiration hit, and I wish I'd thought of this back when I wrote
    the reconstruct and COPY patches that went into 2.3.6...

    See the attached 'cyrus-modseqfetch-cvs.diff' file. This implements the
    correct behaviour in all cases, all PRO no CON - and it's only one line!

    If fetchargs->changedsince is '0' then that means you want all messages
    regardless of the value of modseq.

    The other huge advantage of this approach is that it means that you
    don't have to be aware of modseq unless you're using it. The default
    '0' value of a freshly zeroed struct is no longer an accident waiting
    to happen.
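
    The idea can be sketched in a few lines of C. Treat this as an
    illustration only, not the attached diff: apart from the field name
    changedsince, every name here is hypothetical, and the "before" check
    is my assumption about the kind of comparison that hides messages
    whose modseq is '0'.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; only the field name changedsince is from the post. */
typedef uint64_t modseq_t;

struct fetch_args_sketch {
    modseq_t changedsince;   /* 0 means "no CHANGEDSINCE filter requested" */
};

/* Assumed pre-patch behaviour: return a message only if its modseq is
 * strictly greater than changedsince. With both values 0, nothing matches. */
static bool wanted_before_patch(const struct fetch_args_sketch *fa,
                                modseq_t msg_modseq)
{
    return msg_modseq > fa->changedsince;
}

/* Patched behaviour: changedsince == 0 matches every message,
 * regardless of the modseq stored in the index. */
static bool wanted_after_patch(const struct fetch_args_sketch *fa,
                               modseq_t msg_modseq)
{
    if (fa->changedsince == 0)
        return true;
    return msg_modseq > fa->changedsince;
}

/* A freshly zeroed argument struct, as a caller unaware of modseq would build. */
static struct fetch_args_sketch zeroed_args(void)
{
    struct fetch_args_sketch fa;
    memset(&fa, 0, sizeof fa);
    return fa;
}
```

    With zeroed_args(), a message whose modseq is 0 is skipped by the
    assumed old check and returned by the patched one, while a real
    CHANGEDSINCE value still filters as before.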

    Even folder indexes "corrupted" by 2.3.4-5 will work just fine with
    this patch applied. All your replicated messages will magically
    appear again.

    Replication with CONDSTORE is still broken, but then it was never
    unbroken (queries on the replica that use modseq won't return the same
    results as the same queries on the master).

    Ken - please consider cyrus-modseqfetch-cvs.diff for immediate
    inclusion and prompt release of a 2.3.7.

    I imagine you're going to want to do more work so modseq values actually
    replicate rather than using cyrus-modseqrepl-cvs.diff as is, though it
    certainly doesn't hurt any, and it means newly replicated messages will
    be readable by older Cyrus versions back to 2.3.4 (anything earlier
    can't read the newer index format).

    Regards,

    Bron.


    ----
    Cyrus Home Page: http://asg.web.cmu.edu/cyrus
    Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
    List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
    


