Re: Ordered Append Node

From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Florian Weimer <fweimer(at)bfk(dot)de>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Ordered Append Node
Date: 2007-11-23 13:07:06
Message-ID: 4746D07A.9010000@bluegap.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Gregory Stark wrote:
> Not quite the same since the Executor-based implementation would have a static
> tree structure based on the partitions. Even if the partitions are all empty
> except for one or two you would still have to push the result records through
> all the nodes for the empty partitions.
>
> A heap only has the next record from each input. If an input has reached eof
> then the heap doesn't have an entry for that input. So if one of the inputs is
> empty (often the parent of the inheritance tree) it doesn't require a test
> anywhere to propagate the record up past it.

Ah, so the binary tree (binary heap?) gets adjusted dynamically. Very
clever! (OTOH probably a micro optimization, but as code is already
there, use it, yeah!)

> I also did an optimization similar to the bounded-sort case where we check if
> the next tuple from the same input which last contributed the result record
> comes before the top element of the heap. That avoids having to do an insert
> and siftup only to pull out the same record you just inserted. In theory this
> is not an optimization but in practice I think partitioned tables will often
> contain contiguous blocks of key values and queries will often be joining
> against that key and therefore often want to order by it.

Hm... that assumes range partitioning, no? If you partition among three
partitions by "id modulo 3", tuples are most probably coming from one
partition after the other, i.e.:
1 2 3 1 2 3 1 2 3 ...

And with hash partitioning, you're completely unable to predict the
ordering.

> Ideally we would also be able to do this in the planner. If the planner could
> determine from the constraints that all the key values from each partition are
> non-overlapping and order them properly then it could generate a regular
> append node with a path order without the overhead of the run-time
> comparisons.

Agreed.

> But that requires a) dealing with the problem of the parent table which has no
> constraints and b) an efficient way to prove that constraints don't overlap
> and order them properly. The latter looks like an O(n^2) problem to me, though
> it's a planning problem which might be worth making slow in exchange for even
> a small speedup at run-time.

Well, I think someday, Postgres needs better support for vertical data
partitioning in general. Dealing with constraints and inheritance is way
too flexible and prone to error. I'll shortly start a new thread about
that, to outline my current thoughts about that topic.

Regards

Markus

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2007-11-23 13:11:44 Re: Ordered Append Node
Previous Message Csaba Nagy 2007-11-23 13:03:41 Re: Ordered Append Node