Re: [PATCH] kNN for btree

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] kNN for btree
Date: 2017-02-16 15:59:38
Message-ID: 8340.1487260778@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Feb 16, 2017 at 8:05 AM, Alexander Korotkov
> <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
>> My idea is that we need a more general redesign of how we specify the
>> orderings an index can produce. Ideally, we should replace amcanorder,
>> amcanbackward and amcanorderbyop with a single callback. Such a callback
>> should take a list of pathkeys and return the number of leading pathkeys
>> the index could satisfy (with corresponding information for the index
>> scan). I'm not sure that other hackers would agree with such a design, but
>> I'm very convinced that we need something at this level of extensibility.
>> Otherwise we would have to hack our planner <-> index_access_method
>> interface each time we decide to cover another index-produced ordering.

> Yeah. I'm not sure if that's exactly the right idea. But it seems
> like we need something.

That's definitely not exactly the right idea, because using it would
require the core planner to play twenty questions, trying to guess which
pathkeys the index can satisfy. ("Can you satisfy some prefix of this
pathkey list? How about that one?") It could be sensible to have a
callback that's called once per index and hands back a list of pathkey
lists that represent interesting orders the index could produce, which
could be informed by looking aside at the PlannerInfo contents to see
what is likely to be relevant to the query.
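
To make the contrast concrete, here is a rough, self-contained sketch of the
two callback shapes. Everything in it is hypothetical: ToyPathKey,
ToyOrdering, toy_am_probe and toy_am_enumerate are illustrative names, not
PostgreSQL's PathKey or any actual index AM API, and the toy AM is assumed to
behave like a btree that can be scanned forward or backward.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_KEYS 4

/* Hypothetical stand-in for a pathkey: a column name plus a direction. */
typedef struct ToyPathKey
{
    const char *column;
    bool        descending;
} ToyPathKey;

/* Hypothetical stand-in for a pathkey list, i.e. one sort ordering. */
typedef struct ToyOrdering
{
    int         nkeys;
    ToyPathKey  keys[TOY_MAX_KEYS];
} ToyOrdering;

/* Number of leading keys of "wanted" that "have" supplies, in order. */
int
toy_common_prefix(const ToyOrdering *have, const ToyOrdering *wanted)
{
    int         i = 0;

    while (i < have->nkeys && i < wanted->nkeys &&
           strcmp(have->keys[i].column, wanted->keys[i].column) == 0 &&
           have->keys[i].descending == wanted->keys[i].descending)
        i++;
    return i;
}

/* The ordering produced by scanning the same index backward. */
ToyOrdering
toy_reverse(const ToyOrdering *fwd)
{
    ToyOrdering rev = *fwd;

    for (int i = 0; i < rev.nkeys; i++)
        rev.keys[i].descending = !rev.keys[i].descending;
    return rev;
}

/*
 * Probe-style callback (the shape proposed in the quoted text): the caller
 * offers one candidate ordering per call and gets back the number of leading
 * keys the index can satisfy.  *ncalls counts how many questions were asked.
 */
int
toy_am_probe(const ToyOrdering *index_order, const ToyOrdering *wanted,
             int *ncalls)
{
    ToyOrdering backward = toy_reverse(index_order);
    int         fwd = toy_common_prefix(index_order, wanted);
    int         bwd = toy_common_prefix(&backward, wanted);

    (*ncalls)++;
    return fwd > bwd ? fwd : bwd;
}

/*
 * Enumerate-style callback: called once per index, it hands back every
 * ordering the index can produce.  The caller does its own matching.
 */
int
toy_am_enumerate(const ToyOrdering *index_order, ToyOrdering *out,
                 int max_out)
{
    assert(max_out >= 2);
    out[0] = *index_order;              /* forward scan */
    out[1] = toy_reverse(index_order);  /* backward scan */
    return 2;
}
```

With the probe shape, the caller has to invoke toy_am_probe once for every
ordering it can think of. With the enumerate shape, it calls toy_am_enumerate
once and then compares the returned orderings against whatever the query
needs, using something like toy_common_prefix on its own side.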

But even so, I'm not convinced that that is a better design or more
maintainable than the current approach. I fear that it will lead to
duplicating substantial amounts of code and knowledge into each index AM,
which is not an improvement; and if anything, that increases the risk of
breaking every index AM anytime you want to introduce some fundamentally
new capability in the area. Now that it's actually practical to have
out-of-core index AMs, that's a bigger concern than it might once have
been.

Also see the discussion that led up to commit ed0097e4f. Users objected
the last time we tried to make index capabilities opaque at the SQL level,
so they're not going to like a design that tries to hide that information
even from the core C code.

regards, tom lane
