Re: [HACKERS] [PATCH] kNN for SP-GiST

From: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: David Steele <david(at)pgmasters(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [PATCH] kNN for SP-GiST
Date: 2018-07-03 23:21:08
Message-ID: c6957e2c-3052-4fce-55ac-2cf7bcfd0dd9@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Attached 5th version of the patches, where minor refactoring of distance
handling was done (see below).

On 02.07.2018 19:06, Alexander Korotkov wrote:

> Hi!
>
> On Fri, Jun 29, 2018 at 5:37 PM Nikita Glukhov<n(dot)gluhov(at)postgrespro(dot)ru> wrote:
>> On 06.03.2018 17:30, David Steele wrote:
>>
>>> I agree with Andres. Pushing this patch to the next CF.
>> Attached 4th version of the patches rebased onto the current master.
>> Nothing interesting has changed from the previous version.
> I took a look to this patchset. In general, it looks good for me, but
> I've following notes about it.
>
> * We're going to replace scan stack with pairing heap not only for
> KNN-search, but for regular search as well. Did you verify that it
> doesn't cause regression for regular search case, because insertion
> into pairing heap might be slower than insertion into stack? One may
> say that it didn't caused regression in GiST, and that's why it
> shouldn't cause regression in SP-GiST. However, SP-GiST trees might
> be much higher and these performance aspects might be different.

I decided to bring back the List-based scan stack in the previous version
of the patches, and here is how spgAddSearchItemToQueue() looks like now:

static void
spgAddSearchItemToQueue(SpGistScanOpaque so, SpGistSearchItem *item)
{
if (so->queue)
pairingheap_add(so->queue, &item->phNode);
else
so->scanStack = lcons(item, so->scanStack);
}

so->queue is initialized in spgrescan() only for ordered searches.

> * I think null handling requires better explanation. Nulls are
> specially handled in pairingheap_SpGistSearchItem_cmp(). In the same
> time you explicitly set infinity distances for leaf nulls. You
> probably have reasons to implement it this way, but I think this
> should be explained. Also isnull property of SpGistSearchItem doesn't
> have comment.

Distances for NULL items are expected to be NULL (it would not be true for
non-strict the distance operators, so we do not seem to support them here),
and so NULL items are considered to be greater than any non-NULL items in
pairingheap_SpGistSearchItem_cmp(). Distances are copied into SpGistSearchItem
only if it is non-NULL item. Also in the last version of the patch I have
introduced spgAllocSearchItem() which conditionally allocates distance-array in
SpGistSearchItem. Now inifinity distances are used only in one place, but
if we require that innerConsistent() should always return distances, then
we can completely get rid of so->infDistances field.

> * I think KNN support should be briefly described in
> src/backed/access/spgist/README.

A minimal description written by the original author Vlad Sterzhanov
already present in README.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-Export-box_fill-v05.patch text/x-patch 1.6 KB
0002-Extract-index_store_float8_orderby_distances-v05.patch text/x-patch 5.6 KB
0003-Extract-get_index_column_opclass-and-get_opclass_opfamily_and_input_type-v05.patch text/x-patch 5.1 KB
0004-Add-kNN-support-to-SP-GiST-v05.patch text/x-patch 50.8 KB
0005-Add-ordering-operators-to-SP-GiST-kd_point_ops-and-quad_point_ops-v05.patch text/x-patch 28.2 KB
0006-Add-ordering-operators-to-SP-GiST-poly_ops-v05.patch text/x-patch 9.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Srinivas Karthik V 2018-07-03 23:34:31 Re: Bulk Insert into PostgreSQL
Previous Message Michael Paquier 2018-07-03 22:52:48 Re: Non-reserved replication slots and slot advancing