Re: Merge Append Patch merged up to 85devel

From: Greg Stark <stark(at)mit(dot)edu>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Postgres <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Merge Append Patch merged up to 85devel
Date: 2009-07-06 00:23:50
Message-ID: 407d949e0907051723y76c417cdja43e6b4355d8aa5c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jul 5, 2009 at 10:34 PM, Robert Haas<robertmhaas(at)gmail(dot)com> wrote:
> On Jul 5, 2009, at 10:02 AM, Gregory Stark <stark(at)mit(dot)edu> wrote:
>>
>> Here's a copy of the merge-append patch that I sent months ago merged up to
>> head. I haven't really added any additional functionality since then.
>
> Can you provide some more details about the objective of this patch?  Or a
> link to previous discussion?

It's one piece of the puzzle of how to deal with partitioned tables
more completely.

The basic problem is that currently the planner assumes all Append
nodes produce unordered output. That means there's no way for the
planner to use any indexes on partitions to produce a desired
ordering. That means it's impossible to do a merge join or to satisfy
an ORDER BY clause without introducing a sort even if you have
matching indexes on every partition.

Lots of common cases arise with partitioned tables where this kicks
in. Many of them will be eliminated as our partitioning support gets
more intelligent and is able to recognize how the partitioning key
relates to the requested order or collapse singleton partition scans
in cases where the parent table currently interferes. However there
will always be cases where those mechanisms to simplify the plan
sufficiently and we end up with an Append node of unordered partitions
and we need this as a back-stop to avoid awful plans.

The merge-append node works by teaching the Append node what sort
order it's aiming to produce. The append node then keeps a heap of
slots and returns the tuple from the top child plan replacing it in
the heap with the next plan.

This produces plans like this:

QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.20..489.00 rows=9600 width=4)
-> Append (cost=0.20..489.00 rows=9600 width=4)
-> Index Scan using p_pkey on p (cost=0.00..80.25 rows=2400 width=4)
-> Index Scan using p1_pkey on p1 p (cost=0.00..80.25
rows=2400 width=4)
-> Index Scan using p2_pkey on p2 p (cost=0.00..80.25
rows=2400 width=4)
-> Index Scan using p3_pkey on p3 p (cost=0.00..80.25
rows=2400 width=4)
(6 rows)

Instead of plans like this which is the best we can do today:

QUERY PLAN
--------------------------------------------------------------------------
Sort (cost=770.98..794.98 rows=9600 width=4)
Sort Key: public.p.i
-> Result (cost=0.00..136.00 rows=9600 width=4)
-> Append (cost=0.00..136.00 rows=9600 width=4)
-> Seq Scan on p (cost=0.00..34.00 rows=2400 width=4)
-> Seq Scan on p1 p (cost=0.00..34.00 rows=2400 width=4)
-> Seq Scan on p2 p (cost=0.00..34.00 rows=2400 width=4)
-> Seq Scan on p3 p (cost=0.00..34.00 rows=2400 width=4)
(8 rows)

--
greg
http://mit.edu/~gsstark/resume.pdf

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2009-07-06 00:28:39 WIP: generalized index constraints
Previous Message Sergey Burladyan 2009-07-05 22:36:06 Re: 8.4, One-Time Filter and subquery ( ... FROM function() union all ... )