Re: [PATCH] kNN for btree

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] kNN for btree
Date: 2017-02-16 16:20:19
Message-ID: CA+TgmobUh1=Dcft=EHr4syetaMopaeE8r7T=wMqUsOAxKduaaA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Feb 16, 2017 at 10:59 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Thu, Feb 16, 2017 at 8:05 AM, Alexander Korotkov
>> <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
>>> My idea is that we need more general redesign of specifying ordering which
>>> index can produce. Ideally, we should replace amcanorder, amcanbackward and
>>> amcanorderbyop with single callback. Such callback should take a list of
>>> pathkeys and return number of leading pathkeys index could satisfy (with
>>> corresponding information for index scan). I'm not sure that other hackers
>>> would agree with such design, but I'm very convinced that we need something
>>> of this level of extendability. Otherwise we would have to hack our planner
>>> <-> index_access_method interface each time we decide to cover another index
>>> produced ordering.
>
>> Yeah. I'm not sure if that's exactly the right idea. But it seems
>> like we need something.
>
> That's definitely not exactly the right idea, because using it would
> require the core planner to play twenty-questions trying to guess which
> pathkeys the index can satisfy. ("Can you satisfy some prefix of this
> pathkey list? How about that one?") It could be sensible to have a
> callback that's called once per index and hands back a list of pathkey
> lists that represent interesting orders the index could produce, which
> could be informed by looking aside at the PlannerInfo contents to see
> what is likely to be relevant to the query.
>
> But even so, I'm not convinced that that is a better design or more
> maintainable than the current approach. I fear that it will lead to
> duplicating substantial amounts of code and knowledge into each index AM,
> which is not an improvement; and if anything, that increases the risk of
> breaking every index AM anytime you want to introduce some fundamentally
> new capability in the area. Now that it's actually practical to have
> out-of-core index AMs, that's a bigger concern than it might once have
> been.

Yeah, that's all true. But I think Alexander is right that just
adding amcandoblah flags ad infinitum doesn't feel good either. The
interface isn't really arm's-length if every new thing somebody wants
to do requires another flag.
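
To make the two callback shapes under discussion concrete, here is a
minimal sketch in plain C. Everything in it is hypothetical and exists
only for illustration: the SketchPathKey and SketchOrdering types, the
toy_* functions, and the string encoding of orderings are not taken
from the patch or from the PostgreSQL source. The first callback type
follows Alexander's description (the planner hands over a pathkey list
and gets back the number of leading pathkeys the index can satisfy);
the second follows Tom's alternative (called once per index, it hands
back the orderings the index can produce).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a planner pathkey: just an ordering label. */
typedef const char *SketchPathKey;

/* One ordering an index can produce, expressed as a pathkey list. */
typedef struct SketchOrdering
{
    const SketchPathKey *keys;
    int         nkeys;
} SketchOrdering;

/* Shape A: planner asks "how many leading pathkeys of this list work?" */
typedef int (*am_match_pathkeys_fn) (const SketchPathKey *keys, int nkeys);

/* Shape B: called once per index; AM returns every ordering it offers. */
typedef int (*am_list_orderings_fn) (const SketchOrdering **out);

/* Toy index on (a, b): ordinary forward order, plus a kNN-style order. */
static const SketchPathKey toy_forward[] = {"a ASC", "b ASC"};
static const SketchPathKey toy_knn[] = {"a <-> const"};
static const SketchOrdering toy_orderings[] = {
    {toy_forward, 2},
    {toy_knn, 1},
};

#define TOY_NORDERINGS \
    ((int) (sizeof(toy_orderings) / sizeof(toy_orderings[0])))

/* Length of the common leading run between a request and one ordering. */
static int
common_prefix(const SketchPathKey *want, int nwant,
              const SketchOrdering *have)
{
    int         n = 0;

    while (n < nwant && n < have->nkeys &&
           strcmp(want[n], have->keys[n]) == 0)
        n++;
    return n;
}

/* Shape A for the toy index: best prefix match over all its orderings. */
int
toy_match_pathkeys(const SketchPathKey *keys, int nkeys)
{
    int         best = 0;

    assert(nkeys >= 0);
    for (int i = 0; i < TOY_NORDERINGS; i++)
    {
        int         n = common_prefix(keys, nkeys, &toy_orderings[i]);

        if (n > best)
            best = n;
    }
    return best;
}

/* Shape B for the toy index: expose the whole list in a single call. */
int
toy_list_orderings(const SketchOrdering **out)
{
    *out = toy_orderings;
    return TOY_NORDERINGS;
}
```

With shape A the caller has to probe with each candidate pathkey list
in turn, which is the "twenty-questions" problem Tom describes; with
shape B the caller gets the full set in one call and does the matching
itself. Neither sketch says anything about where the matching code
ought to live, which is the maintainability question raised above.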

> Also see the discussion that led up to commit ed0097e4f. Users objected
> the last time we tried to make index capabilities opaque at the SQL level,
> so they're not going to like a design that tries to hide that information
> even from the core C code.

Discoverability is definitely important, but first we have to figure
out how we're going to make it work, and then we can work out how to
let users see how it works.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
