Re: Query performance problems with partitioned tables

From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Scott Marlowe" <smarlowe(at)g2switchworks(dot)com>
Cc: "Fei Liu" <fei(dot)liu(at)aepnetworks(dot)com>, "Andreas Haumer" <andreas(at)xss(dot)co(dot)at>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Query performance problems with partitioned tables
Date: 2007-05-04 17:10:16
Message-ID: b42b73150705041010nf91213ci7ef41ee07e4c79bf@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 5/4/07, Scott Marlowe <smarlowe(at)g2switchworks(dot)com> wrote:
> On Thu, 2007-05-03 at 21:37, Merlin Moncure wrote:
> > On 5/3/07, Fei Liu <fei(dot)liu(at)aepnetworks(dot)com> wrote:
> > > Hello, Andreas, I too am having exactly the same issue as you do.
> > > Comparing my partitioned and plain table performance, I've found that
> > > the plain tables perform about 25% faster than partitioned table. Using
> > > 'explain select ...', I see that constraints are being used so in
> > > partitioned tables fewer rows are examined. But still partitioned tables
> > > are 25% slower, what a let down.
> >
> > That's a little bit harsh. The main use of partitioning is not to
> > make the table faster but to make the maintenance easier. When
> > constraint exclusion works well for a particular query you can get a
> > small boost but many queries will break down in a really negative way.
> > So, you are sacrificing flexibility for easier maintenance. You have
> > to really be careful how you use it.
> >
> > The best case for partitioning is when you can logically divide up
> > your data so that you really only have to deal with one sliver of it
> > at a time...for joins and such. If the OP could force the constraint
> > exclusion (maybe by hashing the timestamp down to a period and using
> > that for where clause), his query would be fine. The problem is it's
> > not always easy to do that.
>
> Agree++
>
> I've been testing partitioning for a zip code lookup thing that was
> posted here earlier, and I partitioned a 10,000,000 row set into about
> 400 partitions. I found that selecting a range of areas defined by x/y
> coordinates was faster without any indexes. The same selection with one
> big table and one big (x,y) index took 3 to 10 seconds typically, same
> select against the partitions with no indexes took 0.2 to 0.5 seconds.

I was thinking about that problem....one approach I was playing with
was to normalize the 10mm table to zipcode (chopping off + 4) and then
doing bounding box ops on the zipcode (using earthdistance/gist) table
and also the detail table using tradictional tactics or gist. I think
this would give reasonable performance without partitioning (10mm
records doesn't scare me anymore!). If the records are frequently
updated you may want to TP anways though do to (pre-hot) vacuum
issues.

merlin

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2007-05-04 17:11:42 Re: Feature Request --- was: PostgreSQL Performance Tuning
Previous Message Sebastian Hennebrueder 2007-05-04 16:40:13 Re: Feature Request --- was: PostgreSQL Performance Tuning