Re: Should I partition this table?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Bill Moran <wmoran(at)potentialtech(dot)com>
Cc: AlexK <alkuzo(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Should I partition this table?
Date: 2014-07-10 20:16:05
Message-ID: CAMkU=1yJdOdoP+1UOWtc98m+iy6XLGFiqZBqQ4ncqpuBa+UE8w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, Jul 10, 2014 at 8:20 AM, Bill Moran <wmoran(at)potentialtech(dot)com>
wrote:

> On Thu, 10 Jul 2014 07:59:20 -0700 (PDT) AlexK <alkuzo(at)gmail(dot)com> wrote:
>
> > My table currently uses up 62 GB of storage, and it has 450 M rows. This
> > narrow table has a PK on (ParentID, ChildNumber), and it has between 20K
> and
> > 50K of child rows per parent.
> >
> > The data is inserted daily, rarely modified, never deleted. The
> performance
> > of modifications is not an issue. The only select from it is as follows:
> >
> > SELECT <column_lis> FROM MyChildTable WHERE ParentID=?
> > ORDER BY ChildNumber;
> >
> > The selects are frequent, and their performance is essential.
>

How is their performance currently?

> >
> > Would you advise me to partition this table?
>

No, not based on the current info. There is no reason to think
partitioning would improve the performance that matters to you. You said
that child rows for the same parent are all inserted at the same time, so
they should naturally be well-clustered. That will be important for
performance once the data exceeds what can be cached. If that clustering
did not occur naturally then you might benefit from imposing it, and
partitioning might be an import part of doing that. But it sounds like you
will not need to worry about that.

>
> In general, yes, given the information you provided. A parition on
> ParentID % $something should improve performance.

PostgresSQL's constraint exclusion logic is not smart enough to turn a
simple equality into a mod equality. Which means every select query would
have to include "AND ParentID % something = (:1 % somthing)" in addition to
the primary clause "ParentID=:1", in order to benefit from constraint
exclusion. That would be very unnatural, annoying, and error prone. Range
partitioning would be better, if any partitioning is needed at all.

Cheers,

Jeff

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jeff Janes 2014-07-10 20:23:11 Re: Should I partition this table?
Previous Message Yves Dorfsman 2014-07-10 19:16:10 Re: checkpoint