Re: PostgreSQL to host e-mail?

From: Grega Bremec <gregab(at)p0f(dot)net>
To: "Charles A(dot) Landemaine" <landemaine(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: PostgreSQL to host e-mail?
Date: 2007-01-05 03:10:20
Message-ID: 459DC19C.20100@p0f.net
Lists: pgsql-performance

-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

Charles A. Landemaine wrote:
| I'm building an e-mail service that has two requirements: It should
| index messages on the fly to have lightning-fast search results, and it
| should be able to handle large amounts of space. The server is going
| to be dedicated only for e-mail with 250GB of storage in RAID-5. I'd
| like to know how PostgreSQL could handle such a large amount of data.
| How much RAM would I need? I expect my users to have a 10GB quota per
| e-mail account.
| Thanks for your advice,
|

Hello, Charles.

I'll second people's suggestions to stay away from RAID5; the workload of a
mail store is an approximately even mix of writes (in database terms,
INSERTs, UPDATEs and DELETEs) and reads, and RAID5 is known to perform
poorly under heavy writes, at least when you're building arrays with fewer
than 10-15 drives. I'd suggest you go for RAID10 for the database cluster
and an extra drive for WAL.

Another point of interest is one particular aspect of an e-mail user's
workflow: users typically touch the main inbox a lot and leave most of the
other folders pretty much untouched most of the time. This suggests a
per-inbox quota might be useful, maybe in addition to the overall quota,
because then you can calculate your database working set more easily, based
on usage statistics for a typical account. Namely, if the maximum size of an
inbox is x MB, with y% average utilization, and you plan for z users, of
which w% will typically be active in one day, your database working set will
be somewhere in the general area of (x * y%) * (z * w%) MB. Add to that the
size of the indexes you create, and you have a very approximate idea of the
amount of RAM you need to place in your machines to keep your performance
from becoming I/O-bound.
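
To make the arithmetic concrete, here is a minimal Python sketch of the
estimate above. The function name, the parameter names and the example
figures (100 MB inbox cap, 30% utilization, 2000 users, 25% active per day,
2048 MB of indexes) are all made up for illustration; plug in numbers from
your own usage statistics.

```python
def working_set_mb(inbox_max_mb, avg_utilization, users, active_fraction,
                   index_mb=0.0):
    """Rough working-set estimate in MB: (x * y) * (z * w) + index size.

    avg_utilization and active_fraction are fractions in [0, 1],
    i.e. the y% and w% of the formula divided by 100.
    """
    for name, frac in (("avg_utilization", avg_utilization),
                       ("active_fraction", active_fraction)):
        if not 0.0 <= frac <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {frac}")

    per_inbox_mb = inbox_max_mb * avg_utilization   # x * y%
    active_users = users * active_fraction          # z * w%
    return per_inbox_mb * active_users + index_mb


if __name__ == "__main__":
    # Hypothetical figures, not measurements from any real system.
    estimate = working_set_mb(100, 0.30, 2000, 0.25, index_mb=2048)
    print(f"Estimated working set: {estimate:.0f} MB "
          f"(about {estimate / 1024:.1f} GB)")
```

The result is only as good as the usage statistics behind it, so treat it as
a starting point for sizing RAM rather than a guarantee.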

The main reason I'm writing this mail, though, is to suggest you take a look
at Oryx, http://www.oryx.com/. They used to have a product called
Mailstore, which was designed to be a mail store using PostgreSQL as a
backend, and it has since evolved into a bit more than just that, it seems.
Perhaps it could be of help to you while building your system, and I'm sure
the people at Oryx will be glad to hear from you, both while you build your
system and after it is done.

Kind regards,
- --
~ Grega Bremec
~ gregab at p0f dot net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iD8DBQFFncGcfu4IwuB3+XoRA9Y9AJ0WA+0aooVvGMOpQXGStzkRNVDCjwCeNdfs
CArTFwo6geR1oRBFDzFRY/U=
=Y1Lf
-----END PGP SIGNATURE-----
