Re: [PATCH] kNN for btree

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] kNN for btree
Date: 2018-12-27 02:46:30
Message-ID: CAPpHfdt75MZncGBo5=ARVk4M2iGetwnJfO5Gs0dhhUto=TFOnA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

On Fri, Nov 30, 2018 at 3:02 PM Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru> wrote:
> On 29.11.2018 18:24, Dmitry Dolgov wrote:
> >> On Wed, Sep 26, 2018 at 5:41 PM Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru> wrote:
> >>
> >> Attached 3rd version of the patches rebased onto the current master.
> >>
> >> Changes from the previous version:
> >> - Added support of INCLUDE columns to get_index_column_opclass() (1st patch).
> >> - Added parallel kNN scan support.
> >> - amcanorderbyop() was transformed into ammatchorderby() which takes a List of
> >> PathKeys and checks each of them with new function match_orderbyop_pathkey()
> >> extracted from match_pathkeys_to_index(). I think that this design can be
> >> used in the future to support a mix of ordinary and order-by-op PathKeys,
> >> but I am not sure.
> > Hi,
> >
> > Unfortunately, the patch has some conflicts, could you rebase it? In the
> > meantime I'll move it to the next CF, hoping to have more reviewers for this
> > item.
>
> Attached 4th version of the patches rebased onto the current master.

I think this patchset in general has a good shape. After some rounds
of review, it might be committed during January commitfest.

For now, I have following notes.

* 0002-Introduce-ammatchorderby-function-v04.patch

I think match_orderbyop_pathkey() and match_orderbyop_pathkeys()
deserve some high-level commends describing what these functions are
expected to do.

* 0004-Add-kNN-support-to-btree-v04.patch

+ <para>
+ FIXME!!!
+ To implement the distance ordered (nearest-neighbor) search, we only need
+ to define a distance operator (usually it called &lt;-&gt;) with a
correpsonding
+ operator family for distance comparison in the index's operator class.
+ These operators must satisfy the following assumptions for all non-null
+ values A,B,C of the datatype:
+
+ A &lt;-&gt; B = B &lt;-&gt; A symmetric law
+ if A = B, then A &lt;-&gt; C = B &lt;-&gt; C distance equivalence
+ if (A &lt;= B and B &lt;= C) or (A &gt;= B and B &gt;= C),
+ then A &lt;-&gt; B &lt;= A &lt;-&gt; C monotonicity
+ </para>

What exactly you're going to fix here? I think you at least should
provide a proper formatting to this paragraph....

* 0006-Remove-distance-operators-from-btree_gist-v04.patch

I see you provide btree_gist--1.6.sql and remove btree_gist--1.2.sql.
Note, that in order to better checking of extension migrations, we're
now providing just migration script to new version. So, everybody
installing new version will go through the migration. However, in
this particular case we've mass deletion of former extension objects.
So, I think this case should be an exception to the rules. And it's
good to provide new version of extension script in this case. Other
opinions?

A see btree_gist--1.5--1.6.sql contains a sophisticated query
updating extension operators to builtin operators. However, what do
you think about just long sequence of ALTER OPERATOR FAMILY commands
removing old operators and adding new operators? It would be longer,
but more predictable and easier for understanding.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2018-12-27 02:55:08 Re: random() (was Re: New GUC to sample log queries)
Previous Message Tom Lane 2018-12-27 02:39:00 Re: random() (was Re: New GUC to sample log queries)