Re: Postgresql partitioning - single hot table or distributed

From: Vick Khera <vivek(at)khera(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Postgresql partitioning - single hot table or distributed
Date: 2010-06-30 18:19:08
Message-ID: AANLkTin65SfCKzT0T31TBdKObSLV7cMu1L3BUiiAdFhE@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Jun 30, 2010 at 10:20 AM, sam mulube <sam(dot)mulube(at)gmail(dot)com> wrote:
> Inserting directly into the specific partition is interesting, but if
> you're going to go down that route then aren't you starting to
> implement the partitioning yourself in application code. In that case
> what benefit does keeping the Postgresql partitioning in place
> actually give you?
>

Your benefit comes in very fast lookups when constraint exclusion (or
index query) are working for you. You also get simplified syntax for
doing joins against the entire data set. If however you always only
search on the ID, then there is really not much benefit other than
having smaller indexes. Smaller indexes are great when you have to
re-index. Instead of taking say 15 minutes to reindex the one big
table, you take 10-15seconds per partition, which means your
applications can move along with only a few seconds delay while you
reindex instead of being blocked.

For one of my tables I could have gone the trigger method for routing
inserts because those are onsies-twosies, but the main table I have it
is much more efficient to compute the partition up front since I do a
*lot* of inserts in a big batch.

The drawback to partitioning by an ID number using modulo is that for
constraint exclusion to work you have to actually add something like
"AND (my_id % 42) = 0" to match the constraint. The exclusion is not
done by executing the constraint, but by proving the constraint will
hold true for the WHERE clause. My reasoning was if I'm going to add
that to my select queries, I might as well just pick the table by
doing that arithmetic in my application up front. It is faster than
having the DB do the constraint proof for each of the 100 partitions.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Vick Khera 2010-06-30 18:22:39 Re: reloading dump produces errors
Previous Message Geoffrey 2010-06-30 18:16:23 reloading dump produces errors