Re: Recommendations for partitioning?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Dave Johansen <davejohansen(at)gmail(dot)com>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Recommendations for partitioning?
Date: 2013-12-07 20:29:09
Message-ID: CAMkU=1wWcL9rzMzvsaV_OrqtexieVe16dEZvd2Q70hmB=JqLug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Dec 5, 2013 at 7:36 AM, Dave Johansen <davejohansen(at)gmail(dot)com>wrote:

> I'm managing a database that is adding about 10-20M records per day to a
> table and time is a core part of most queries,
>

What is the nature of how the time column is used in the queries?
Depending on how it is used, you might not get much improvement at all, or
you might get N fold improvement, or you might find that re-designing your
indexes could get you the same query improvement that partitioning would,
but with less headache.

> so I've been looking into seeing if I need to start using partitioning
> based on the time column and I've found these general guidelines:
>
> Don't use more than about 50 paritions (
> http://www.postgresql.org/message-id/17147.1271696670@sss.pgh.pa.us )
> Use triggers to make the interface easier (
> https://wiki.postgresql.org/wiki/Table_partitioning#Trigger-based and
> http://stackoverflow.com/questions/16049396/postgres-partition-by-week )
>

Using triggers slows INSERTs down by a lot (unless they were already slow
due to the need to hit disk to maintain the indexes or something like
that). Are you sure you can handle that slow down, given your insertion
rate? You could get the best of both worlds by having your bulk loaders
target the correct partition directly, but also have the triggers on the
parent table for any programs that don't get the message.

>
> The only data I found fell inline with what you'd expect (i.e. speeds up
> selects but slows down inserts/updates
> http://www.if-not-true-then-false.com/2009/performance-testing-between-partitioned-and-non-partitioned-postgresql-tables-part-3/)
>

One of the big benefits of partitioning can be to speed up insertions
tremendously, by keeping the hot part of the indices that need to be
maintained upon insertion together in shared_buffers.

>
> So I was thinking that partitioning based on month to keep the number of
> partitions low, so that would mean about 0.5G records in each table. Does
> that seem like a reasonable number of records in each partition? Is there
> anything else that I should consider or be aware of?
>

How will data be expired? Hows does the size of one of your intended
partitions compare to your RAM and shared_buffers.

Cheers,

Jeff

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Marlowe 2013-12-07 20:37:12 Re: Recommendations for partitioning?
Previous Message desmodemone 2013-12-07 17:09:19 Re: Recommendations for partitioning?