Skip site navigation (1) Skip section navigation (2)

Re: Query performance problems with partitioned tables

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Guillaume Cottenceau" <gc(at)mnc(dot)ch>
Cc: "Andreas Haumer" <andreas(at)xss(dot)co(dot)at>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Query performance problems with partitioned tables
Date: 2007-04-30 15:35:55
Message-ID: 87slah98o4.fsf@oxford.xeocode.com (view raw or flat)
Thread:
Lists: pgsql-performance
"Guillaume Cottenceau" <gc(at)mnc(dot)ch> writes:

> I think this is the last claimed point which is incorrect. Pg has
> no general guarantee the partitions actually create a disjoint
> set, even with the CHECK constraints. Pg can only optimize by
> avoiding scanning the partitions inside which no satisfactory
> data could be found by the CHECK constraint, but I think it's not
> possible (too complicated) to infer that any found row in your
> other partitions would not be in the final resultset because of
> 1. the query's resultset order 2. the limit 3. the actual
> conditions in the CHECK constraints (there is no direct way to
> see that timestamps in your 200704 partition are greater than
> timsteamp in your 200601 partition).

I think the answer is that yes there are a number of query transformations
that could be done for partitioned tables that we're not doing currently.

Generally speaking we need to look at each type of plan node and figure out
whether it can usefully be pushed down below the Append node and how we can
determine when that can be done. 

So if each arm of the Append was already in order we could easily push the
LIMIT inside the Append (and duplicating the work above the Append). But that
wouldn't save us any work because we only generate rows from the partitions as
they're needed anyways.

In your example we could save work because the Sort needs all the rows before
it starts. So we could sort each arm and apply the limit before passing it up
to the outer Sort. That might save work in this case.

But figuring out when that's less work than just sorting all of them is
tricky. If the LIMIT is 1,000 and you have ten arms of 1,000 tuples each then
you aren't going to save any work doing this. But if the LIMIT is 1 and you
have few arms with many tuples then you could save a lot of work.

Actually I happen to have been reading up on algorithms related to this this
weekend. It's possible to implement a LimitUnsorted in linear-time which would
fetch the first n records according to some sort key without actually sorting
the records. That might make it more worthwhile.

In short. Yes, there are a lot of optimizations possible around partitioned
tables that we don't do either because it's not clear how to tell when they're
worthwhile or because the code just isn't there yet.

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com


In response to

pgsql-performance by date

Next:From: Craig A. JamesDate: 2007-04-30 16:18:51
Subject: Re: Feature Request --- was: PostgreSQL Performance Tuning
Previous:From: Tom LaneDate: 2007-04-30 15:06:47
Subject: Re: Query performance problems with partitioned tables

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group