Re: [PATCH] kNN for btree

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] kNN for btree
Date: 2020-03-16 13:17:37
Message-ID: CAPpHfdsj1xq4h1t=PROTMacnr=iwgimqbSTjBUMBkmRv_GjR0g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 4, 2020 at 2:39 PM Alexander Korotkov
<a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> On Wed, Mar 4, 2020 at 4:58 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > On Mon, Mar 2, 2020 at 1:27 PM Alexander Korotkov
> > <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> > > I've rebased the patchset to the current master and made some
> > > refactoring. I hope it would be possible to bring it to committable
> > > shape during this CF. This need more refactoring though.
> >
> > This patch doesn't change anything about the B-Tree on-disk format -- right?
>
> Yes, this is correct. No on-disk format changes, just new scanning strategy.

After another try to polish this patch I figured out that the way it's
implemented is unnatural. I see the two reasonable ways to implement
knn for B-tree, but current implementation matches none of them.

1) Implement knn as two simultaneous scans over B-tree: forward and
backward. It's similar to what current patchset does. But putting
this logic to B-tree seems artificial. What B-tree does here is still
unidirectional scan. On top of that we merge results of two
unidirectional scans. The appropriate place to do this high-level
work is IndexScan node or even Optimizer/Executor (Merge node over to
IndexScan nodes), but not B-tree itself.
2) Implement arbitrary scans in B-tree using priority queue like GiST
and SP-GiST do. That would lead to much better support for KNN. We
can end up in supporting interesting cases like "ORDER BY col1 DESC,
col2 <> val1, col2 ASC" or something. But that's requires way more
hacking in B-tree core.

So, I'm marking this patch RWF. We should try re-implement this for v14.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2020-03-16 13:23:45 Re: Libpq support to connect to standby server as priority
Previous Message Julien Rouhaud 2020-03-16 13:15:22 Re: Online checksums verification in the backend