Re: GiST kNN search queue (Re: KNN-GiST with recheck)

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Arthur Silva <arthurprs(at)gmail(dot)com>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GiST kNN search queue (Re: KNN-GiST with recheck)
Date: 2014-12-10 21:50:19
Message-ID: 5488C01B.20101@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12/10/2014 10:59 PM, Arthur Silva wrote:
> It may be better to replace the lib/binaryheap altogether as it offers
> comparable/better performance.

It's not always better. A binary heap is more memory-efficient, for
starters.

There are only two uses of lib/binaryheap: reorderbuffer.c and merge
append. Reorderbuffer isn't that performance critical, although a binary
heap may well be better there, because the comparison function is very
cheap. For merge append, it might be a win, especially if the comparison
function is expensive. (That's on the assumption that the overall number
of comparisons needed with a pairing heap is smaller - I'm not sure how
true that is). That would be worth testing.

I'd love to test some other heap implementation in in tuplesort.c. It
has a custom binary heap implementation that's used in the final merge
phase of an external sort, and also when doing a so-called bounded sort,
i.e. "ORDER BY x LIMIT Y". But that would be difficult to replace,
because tuplesort.c collects tuples in an array in memory, and then
turns that into a heap. An array is efficient to turn into a binary
heap, but to switch to another data structure, you'd suddenly need extra
memory. And we do the switch when we run out of work_mem, so allocating
more isn't really an option.

- Heikki

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabrízio de Royes Mello 2014-12-10 21:55:17 Fix typo um vacuumdb tests
Previous Message Matt Newell 2014-12-10 21:29:59 Re: libpq pipelining