Quick Links

Re: Path question

From:	Greg Stark <gsstark(at)mit(dot)edu>
To:	Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Hans-Juergen Schoenig <hs(at)cybertec(dot)at>
Subject:	Re: Path question
Date:	2010-09-01 14:21:25
Message-ID:	AANLkTi=t9rC9Pj4ywBEkpgU2rJZ_-cKfA0zvaELpDi9r@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

2010/9/1 Boszormenyi Zoltan <zb(at)cybertec(dot)at>:
> we are experimenting with modifying table partitioning
> so the ORDER BY clause can be pushed down to
> child nodes on the grounds that:
> 1. smaller datasets are faster to sort, e.g. two datasets that almost
> spill out to disk are faster to sort in memory and later merge them
> than the union set that spills out to disk, or two larger sets
> that spill out to disk are faster to sort individually than the
> union dataset (because of the longer seeks, etc)

For what it's worth this logic doesn't necessarily hold. Sorting two
data sets in memory is only faster if you only need one result set at
a time. If you know the first set contains only records which come
before the second then you can do this but if not then you'll need all
the result sets to be available at the same time which will require
spilling them to disk. If you have to do that then you might as well
do a tapesort anyways.

> 2. individual child nodes can have indexes that produce
> the sorted output already

This is the big win. Currently Append nodes mean throwing out any
ordering from the child nodes and that's important information to be
losing.

> Currently I am abusing the AppendPath node but later I will add a new
> executor node that will merge the sorted input coming from the children.

You realize I already did this a few years ago. Search for "merge
append" on the lists. The hard part -- as usual -- ends up being
figuring out in the planner when to do this optimization and how to
avoid exponential growth in the number of plans considered and the
amount of data retained about each plan.

For what it's worth I disagree with Tom. I think this is a situation
where we need *both* types of solution. Ideally we will be able to use
a plain Append node for cases where we know the relative ordering of
the data in different partitions, but there will always be cases where
the structured partition data doesn't actually match up with the
ordering requested and we'll need to fall back to a merge-append node.

--
greg

In response to

Path question at 2010-09-01 13:57:12 from Boszormenyi Zoltan

Responses

Re: Path question at 2010-09-01 23:20:12 from Robert Haas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	PostgreSQL - Hans-Jürgen Schönig	2010-09-01 14:28:06	Re: Path question
Previous Message	Tom Lane	2010-09-01 14:10:21	Re: Path question