Re: Parallel tuplesort (for parallel B-Tree index creation)

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Parallel tuplesort (for parallel B-Tree index creation)
Date: 2016-09-11 13:28:38
Message-ID: 19f05353-9514-9139-619a-dbaa7a0e5e78@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09/10/2016 03:22 AM, Claudio Freire wrote:
> Overall, however, I believe the patch is in good shape. Only minor
> form issues need to be changed, the functionality seems both desirable
> and ready.

Pushed this "displace root" patch, with some changes:

* I renamed "tuplesort_heap_siftup()" to "tuplesort_delete_top()". I
realize that this is controversial, per the discussion on the "Is
tuplesort_heap_siftup() a misnomer?" thread. However, now that we have a
new function, "tuplesort_heap_replace_top()", which is exactly the same
algorithm as the "delete_top()" algorithm, calling one of them "siftup"
became just too confusing. If anything, the new "replace_top"
corresponds more closely to Knuth's siftup algorithm; delete-top is a
special case of it. I added a comment on that to replace_top. I hope
everyone can live with this.

* Instead of "root_displace", I used the name "replace_top", and
"delete_top" for the old siftup function. Because we use "top" to refer
to memtuples[0] more commonly than "root", in the existing comments.

* I shared the code between the delete-top and replace-top. Delete-top
now calls the replace-top function, with the last element of the heap.
Both functions have the same signature, i.e. they both take the
checkIndex argument. Peter's patch left that out for the "replace"
function, on performance grounds, but if that's worthwhile, that seems
like a separate optimization. Might be worth benchmarking that
separately, but I didn't want to conflate that with this patch.

* I replaced a few more siftup+insert calls with the new combined
replace-top operation. Because why not.

Thanks for the patch, Peter, and thanks for the review, Claudio!

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Christian Convey 2016-09-11 13:47:10 Re: C++ port of Postgres
Previous Message Michael Paquier 2016-09-11 13:06:34 Re: Useless dependency assumption libxml2 -> libxslt in MSVC scripts