Re: Recommendations for a 15000 Cyrus Mailboxes

From: Robert Mueller (no email)
Date: Tue Apr 10 2007 - 19:28:43 EDT

    > 1. Linux LVM over a 600 GB RAID 10 ( 4 x 300 GB)
    > 2. Which filesystem seems to be the better ? ext3 ? xfs ? reiserfs ?
    > 3. Which options to format the filesystem ? according to the chosen
    > filesystem
    > 4. Which pop3 / imap proxy to use ?
    > 5. Single instance or multiple instances of cyrus ? taking in mind
    > that there should be the option to recover a mailbox or some mail
    > of a mailbox without having to shut down the whole cyrus system.
    > 6. Best way to perform backups ? LVM snapshots ? shutting down some
    > cyrus partitions ? RAID10 hot swap ?
    > 7. Any other suggestion will be welcome.

    Pointers to some previous posts:

    http://irbs.net/internet/info-cyrus/0702/0071.html
    http://www.irbs.net/internet/info-cyrus/0412/0042.html
    http://lists.andrew.cmu.edu/pipermail/info-cyrus/2006-October/024119.html
    http://blog.fastmail.fm/?p=592

    I'd summarise by saying:

    1. We don't need a global folder namespace, so we use completely
    separate cyrus stores & nginx as the frontend proxy rather than a murder
    setup
    2. Separate data (email files) & meta-data (cyrus.* files) onto separate
    spindles/drives. Not including squatter indexes, meta-data is about
    1/20th to 1/10th the size of the email data, so you can afford to use
    small + fast drives for the meta-data.
    3. If you want high availability, use replication
    4. We have a custom backup system, so can't comment on LVM I'm afraid
    5. At the size you're talking about, you probably don't need separate
    instances of cyrus. However if you plan to grow bigger in the future
    (number of users mostly), definitely think about it now
    6. Filesystems are always contentious. Copying from my comments in
    another post, which I still think are relevant:

    ---
    I'd rate the general pros/cons of *linux* filesystems as:
    * ext3
    pros: most widely used; excellent recovery tools; full data journaling
    available; best in the face of flaky hardware or disk caches that lie
    cons: performance just isn't that good with a large active user base
    * reiserfs
    pros: performs well with a large active user base; full data
    journaling available
    cons: recovery tools generally work, but have been known to crash and
    can be slow on large partitions; long mount times (will be fixed in
    2.6.19); apparently some concurrency issues from taking the BKL
    * xfs
    pros: fast on large files; good concurrency
    cons: no data journaling, only meta-data; not really "stable" when bugs
    like this one occur that even xfs_repair couldn't fix
    (http://oss.sgi.com/projects/xfs/faq.html#dir2)
    All the other filesystems I'd label as less used, which means bugs are
    more likely to appear, so I wouldn't recommend them for a production
    environment.
    ---
    In other words, I'd choose ext3 or reiserfs. We happen to use reiserfs
    because it does perform better from our tests and we push our hardware
    quite a bit.
    However, I'd agree that the recovery tools for reiserfs aren't as well
    tested as the ext3 tools. In general, if a problem occurs on a big
    partition, it'll take days at the very least to fsck it, and there's no
    guarantee the fsck will work.
    Having said that, we've found reiserfs to be very reliable assuming two
    golden rules:
    ---
    1. You MUST have hardware that doesn't lie about its write cache. When
    the filesystem tells the device driver to sync to disk, and the disk
    says it's done, it must be done
    (http://community.livejournal.com/lj_dev/670215.html - see the Disk
    cache issues)
    2. Your hardware must be IO reliable; it must never report any "write"
    or "read" IO errors at the sector level
    ---
    With good hardware, we've never had problems. Should there ever be
    problems, we use replication as our high-availability fallback which is
    always better than waiting for a fsck anyway.
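
    As a rough worked example of point 2 above (an illustration only, not
    a measurement): applying the 1/20th to 1/10th ratio to the 600 GB array
    from the original question suggests something like 30 to 60 GB of
    meta-data, not counting squatter indexes. The function name below is
    made up for this sketch.

```shell
# Estimate the meta-data (cyrus.* files) size range for a given amount of
# email data, using the 1/20 to 1/10 rule of thumb quoted above.
# Squatter indexes are NOT included in this estimate.
# meta_size_range is a hypothetical helper name, not a Cyrus tool.
meta_size_range() {
    data_gb=$1
    low=$(( data_gb / 20 ))    # 1/20th of the email data
    high=$(( data_gb / 10 ))   # 1/10th of the email data
    echo "${low}-${high} GB"
}

# The 600 GB RAID 10 from the original question
meta_size_range 600
```

    For 600 GB this prints "30-60 GB", which is why a pair of small, fast
    drives can be enough for the meta-data.
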
    > > Both partitions were formatted with the following commands:
    > >    mkfs -t ext3 -j -m 1 -O dir_index /dev/sdb1
    > 
    > Yep, "-O dir_index" is the important bit.  With that the performance
    > difference between ext3 and other filesystems is dramatically
    > diminished.
    That's what you'd think, but the fact is, it's just NOT true from our
    testing. We tried reiserfs, ext3 & ext3 + dir_index, and in a real world
    production system, dir_index made no noticeable difference (and yes, we
    didn't just tune2fs the bit on and test, we had a completely separate
    partition we copied all the data to and used lsattr to check the
    directories were actually indexed). I was surprised as well.
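
    For anyone wanting to repeat the lsattr check mentioned above, here is
    a minimal sketch. It assumes e2fsprogs' lsattr, which shows an "I" flag
    on directories that are indexed with hashed trees. The function names
    and the spool path in the usage comment are assumptions for the
    example, not something from our setup.

```shell
# Succeeds if one line of `lsattr -d` output carries the 'I' flag
# (directory indexed with hashed trees) in its attribute field.
# Only the first field is examined, so an 'I' in the path is ignored.
is_indexed() {
    flags=${1%% *}            # first whitespace-separated field
    case $flags in
        *I*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Walk a directory tree and list the directories that are NOT indexed.
# Directories where lsattr fails (e.g. not on ext2/ext3) are skipped.
report_unindexed() {
    find "$1" -type d | while IFS= read -r dir; do
        line=$(lsattr -d "$dir" 2>/dev/null) || continue
        is_indexed "$line" || echo "not indexed: $dir"
    done
}

# Usage (path is an assumption):
#   report_unindexed /var/spool/imap
```

    Very small directories may legitimately show up as unindexed, since a
    directory that fits in a single block has nothing to index.
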
    Rob
    
