Skip site navigation (1) Skip section navigation (2)

Re: Performance with sorting and LIMIT on partitioned table

From: Joe Uhl <joeuhl(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Michal Szymanski <mich20061(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance with sorting and LIMIT on partitioned table
Date: 2009-10-20 10:31:18
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance
On Mon, Oct 19, 2009 at 6:58 AM, Joe Uhl <joeuhl(at)gmail(dot)com> wrote:
>> I have a similar, recent thread titled Partitioned Tables and ORDER BY with
>> a decent break down.  I think I am hitting the same issue Michal is.
>> Essentially doing a SELECT against the parent with appropriate constraint
>> columns in the WHERE clause is very fast (uses index scans against correct
>> child table only) but the moment you add an ORDER BY it seems to be merging
>> the parent (an empty table) and the child, sorting the results, and
>> sequential scanning.  So it does still scan only the appropriate child table
>> in the end but indexes are useless.
>> Unfortunately the only workaround I can come up with is to query the
>> partitioned child tables directly.  In my case the partitions are rather
>> large so the timing difference is 522ms versus 149865ms.
> These questions are all solvable depending on what you define
> 'solution' as.  I would at this point be thinking in terms of wrapping
> the query in a function using dynamic sql in plpgsql...using some ad
> hoc method of determining which children to hit and awkwardly looping
> them and enforcing limit, ordering, etc at that level.  Yes, it sucks,
> but it only has to be done for classes of queries constraint exclusion
> can't handle and you will only handle a couple of cases most likely.
> For this reason, when I set up my partitioning strategies, I always
> try to divide the data such that you rarely if ever, have to fire
> queries that have to touch multiple partitions simultaneously.
> merlin
This definitely sounds like a workable approach.  I am doing something a 
little similar on the insert/update side to trick hibernate into writing 
data correctly into partitioned tables when it only knows about the parent.

For anyone else hitting this issue and using hibernate my solution on 
the select side ended up being session-specific hibernate interceptors 
that rewrite the from clause after hibernate prepares the statement.  
This seems to be working alright especially since in our case the code, 
while not aware of DB partitioning, has the context necessary to select 
the right partition under the hood.

Thankfully we haven't yet had queries that need to hit multiple 
partitions so this works okay without too much logic for now.  I suppose 
if I needed to go multi-partition on single queries and wanted to 
continue down the hibernate interceptor path I could get more creative 
with the from clause rewriting and start using UNIONs, or switch to a 
Postgres-level solution like you are describing.

In response to

pgsql-performance by date

Next:From: Merlin MoncureDate: 2009-10-20 11:55:34
Subject: Re: Domain vs table
Previous:From: Merlin MoncureDate: 2009-10-20 02:50:46
Subject: Re: Performance with sorting and LIMIT on partitioned table

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group