Is tuplesort_heap_siftup() a misnomer?

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Is tuplesort_heap_siftup() a misnomer?
Date: 2016-08-12 23:30:51
Message-ID: CAM3SWZQ+2gJMNV7ChxwEXqXopLfb_FEW2RfEXHJ+GsYF39f6MQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Doesn't tuplesort_heap_siftup() actually shift-down?

The Wikipedia article on heaps [1] lists "shift-down" (never "sift
down", FWIW) as a common operation on a heap:

"shift-down: move a node down in the tree, similar to shift-up; used
to restore heap condition after deletion or replacement."

This seems to be what tuplesort_heap_siftup() actually does (among
other things; its job is to compact the heap following caller
returning the root, where caller leaves a "hole" at the root position
0). As it happens, the node that this tuplesort.c function "moves down
in the tree" is the memtuple that was previously positioned physically
last in the heap portion of the memtuples array. This node is then
considered as a "candidate root" (i) for each subheap as we actually
shift down, starting from i = 0. Conceptually, we immediately "fill
the hole" that caller left in the root with this other tuple (the
physically last tuple), with this new tuple/node, then shifted down as
needed. (I say "conceptually" because we don't actually bother with an
eager swap into the "hole" position, but that's just to avoid a
potentially useless swap.)

Now, tuplesort_heap_insert() actually does sift (or shift up, if you
prefer), which is necessary because it doesn't assume that there is an
existing "hole" anywhere in the heap (it just assumes the availability
of space to insert the caller's tuple, at the end of the heap's
memtuple portion).

Or, as Wikipedia describes it:

"shift-up: move a node up in the tree, as long as needed; used to
restore heap condition after insertion. Called "sift" because node
moves up the tree until it reaches the correct level, as in a sieve."

I think that tuplesort_heap_insert() is correct, because it doesn't
claim to shift at all (or sift, or whatever). I'm just contrasting it
with tuplesort_heap_siftup() to make a point.

I think that tuplesort_heap_siftup() should probably be renamed to
describe its purpose at a higher level, rather than how it shifts,
which in any case is an implementation detail. Specifically, I suggest
we rename tuplesort_heap_siftup() to tuplesort_heap_compact(), which
suggests that it's roughly the opposite of tuplesort_heap_insert(). I
think that the contrast between these two functions add clarity.
tuplesort_heap_siftup() comments will also need to be updated, since
"sift up" is mentioned there.

Should I write a patch?

[1] https://en.wikipedia.org/wiki/Heap_(data_structure)
--
Peter Geoghegan

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Stark 2016-08-12 23:59:51 Re: Add hint for function named "is"
Previous Message Alvaro Herrera 2016-08-12 23:15:27 Pluggable storage