Re: Performance with sorting and LIMIT on partitioned table

From: Joe Uhl <joeuhl(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Michal Szymanski <mich20061(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance with sorting and LIMIT on partitioned table
Date: 2009-10-20 10:31:18
Message-ID: 4ADD9176.20301@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Oct 19, 2009 at 6:58 AM, Joe Uhl <joeuhl(at)gmail(dot)com> wrote:
>> I have a similar, recent thread titled Partitioned Tables and ORDER BY with
>> a decent break down. I think I am hitting the same issue Michal is.
>>
>> Essentially doing a SELECT against the parent with appropriate constraint
>> columns in the WHERE clause is very fast (uses index scans against correct
>> child table only) but the moment you add an ORDER BY it seems to be merging
>> the parent (an empty table) and the child, sorting the results, and
>> sequential scanning. So it does still scan only the appropriate child table
>> in the end but indexes are useless.
>>
>> Unfortunately the only workaround I can come up with is to query the
>> partitioned child tables directly. In my case the partitions are rather
>> large so the timing difference is 522ms versus 149865ms.
>>
>
> These questions are all solvable depending on what you define
> 'solution' as. I would at this point be thinking in terms of wrapping
> the query in a function using dynamic sql in plpgsql...using some ad
> hoc method of determining which children to hit and awkwardly looping
> them and enforcing limit, ordering, etc at that level. Yes, it sucks,
> but it only has to be done for classes of queries constraint exclusion
> can't handle and you will only handle a couple of cases most likely.
>
> For this reason, when I set up my partitioning strategies, I always
> try to divide the data such that you rarely if ever, have to fire
> queries that have to touch multiple partitions simultaneously.
>
> merlin
>
This definitely sounds like a workable approach. I am doing something a
little similar on the insert/update side to trick hibernate into writing
data correctly into partitioned tables when it only knows about the parent.

For anyone else hitting this issue and using hibernate my solution on
the select side ended up being session-specific hibernate interceptors
that rewrite the from clause after hibernate prepares the statement.
This seems to be working alright especially since in our case the code,
while not aware of DB partitioning, has the context necessary to select
the right partition under the hood.

Thankfully we haven't yet had queries that need to hit multiple
partitions so this works okay without too much logic for now. I suppose
if I needed to go multi-partition on single queries and wanted to
continue down the hibernate interceptor path I could get more creative
with the from clause rewriting and start using UNIONs, or switch to a
Postgres-level solution like you are describing.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Merlin Moncure 2009-10-20 11:55:34 Re: Domain vs table
Previous Message Merlin Moncure 2009-10-20 02:50:46 Re: Performance with sorting and LIMIT on partitioned table