Re: [HACKERS] [PATCH] kNN for SP-GiST

From: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, David Steele <david(at)pgmasters(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [PATCH] kNN for SP-GiST
Date: 2018-07-17 11:42:36
Message-ID: 944a7113-1886-ee4e-b8e6-df3e4dbca02e@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Attached 7th version of the patches:
* renamed recheck fields and variables
* fixed formatting of the one if-statement

On 15.07.2018 14:54, Andrey Borodin wrote:

>> 14.07.2018, 1:31, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru> wrote:
>>
>> Attached 6th version of the patches.
>> ...
>>> 2. The patch leaves contribs intact. Do extensions with sp-gist opclasses
>>> need to update it's behavior somehow to be used as-is? Or to support new
>>> functionality?
>> There are no SP-GiST opclasses in contrib/, so there is nothing to change in
>> contrib/. Moreover, existing extensions need to be updated only for support of
>> new distance-ordered searches -- they need to implement distances[][] array
>> calculation in their spgInnerConsistent() and spgLeafConsistent() support
>> functions.
> So, if extension will not update anything - it will keep all preexisting functionality, right?

Yes, of course.

> I have some more nitpicks about naming
> + recheckQual = out.recheck;
> + recheckDist = out.recheckDistances;
> Can we call this things somehow in one fashion?

I would prefer to rename "spgLeafConsitentOut.recheck" to "recheckQual" but it
is not good to rename user-visible fields, so I decided to rename all fields
and variables to "recheck"/"recheckDistances".

> Also, this my be a stupid question, but do we really need to differ this two
> recheck cases? It is certain that we cannot merge them?

This recheck cases are absolutely different.

In the first case, opclass support functions can not accurately determine
whether the leaf value satisfies the boolean search operator (compressed
values can be stored in leaves).

In the second case, opclass returns only a approximate value (the lower bound)
of the leaf distances.

In the example below operator 'polygon >> polygon' does not need recheck because
bounding box (they are stored in the index instead of polygons) test gives exact
results, but the ordering operator 'polygon <-> point' needs recheck:

SELECT * FROM polygons
WHERE poly >> polygon(box '((0,0),(1,1))')
ORDER BY poly <-> point '(0,0)';

> This seems strange if-formatting
> + /* If allTheSame, they should all or none of 'em match */
> + if (innerTuple->allTheSame)
> + if (out.nNodes != 0 && out.nNodes != nNodes)
> + elog(ERROR, "inconsistent inner_consistent results for allTheSame inner tuple");
> +
> + if (out.nNodes) // few lines before you compare with 0

Fixed.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-Export-box_fill-v07.patch text/x-patch 1.6 KB
0002-Extract-index_store_float8_orderby_distances-v07.patch text/x-patch 5.5 KB
0003-Extract-get_index_column_opclass-and-get_opclass_opfamily_and_input_type-v07.patch text/x-patch 5.1 KB
0004-Add-kNN-support-to-SP-GiST-v07.patch text/x-patch 50.6 KB
0005-Add-ordering-operators-to-SP-GiST-kd_point_ops-and-quad_point_ops-v07.patch text/x-patch 28.2 KB
0006-Add-ordering-operators-to-SP-GiST-poly_ops-v07.patch text/x-patch 9.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey Klychkov 2018-07-17 11:48:36 Re[2]: Alter index rename concurrently to
Previous Message Robert Haas 2018-07-17 11:30:40 Re: Add SKIP LOCKED to VACUUM and ANALYZE