Re: [PATCH] kNN for btree

From: David Steele <david(at)pgmasters(dot)net>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] kNN for btree
Date: 2017-03-02 14:57:50
Message-ID: e3287c0c-9fa6-7f9c-f0d4-51cc94bd30e5@pgmasters.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Alexander,

On 2/16/17 11:20 AM, Robert Haas wrote:
> On Thu, Feb 16, 2017 at 10:59 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> On Thu, Feb 16, 2017 at 8:05 AM, Alexander Korotkov
>>> <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
>>>> My idea is that we need more general redesign of specifying ordering which
>>>> index can produce. Ideally, we should replace amcanorder, amcanbackward and
>>>> amcanorderbyop with single callback. Such callback should take a list of
>>>> pathkeys and return number of leading pathkeys index could satisfy (with
>>>> corresponding information for index scan). I'm not sure that other hackers
>>>> would agree with such design, but I'm very convinced that we need something
>>>> of this level of extendability. Otherwise we would have to hack our planner
>>>> <-> index_access_method interface each time we decide to cover another index
>>>> produced ordering.
>>
>>> Yeah. I'm not sure if that's exactly the right idea. But it seems
>>> like we need something.
>>
>> That's definitely not exactly the right idea, because using it would
>> require the core planner to play twenty-questions trying to guess which
>> pathkeys the index can satisfy. ("Can you satisfy some prefix of this
>> pathkey list? How about that one?") It could be sensible to have a
>> callback that's called once per index and hands back a list of pathkey
>> lists that represent interesting orders the index could produce, which
>> could be informed by looking aside at the PlannerInfo contents to see
>> what is likely to be relevant to the query.
>>
>> But even so, I'm not convinced that that is a better design or more
>> maintainable than the current approach. I fear that it will lead to
>> duplicating substantial amounts of code and knowledge into each index AM,
>> which is not an improvement; and if anything, that increases the risk of
>> breaking every index AM anytime you want to introduce some fundamentally
>> new capability in the area. Now that it's actually practical to have
>> out-of-core index AMs, that's a bigger concern than it might once have
>> been.
>
> Yeah, that's all true. But I think Alexander is right that just
> adding amcandoblah flags ad infinitum doesn't feel good either. The
> interface isn't really arm's-length if every new thing somebody wants
> to do something new requires another flag.
>
>> Also see the discussion that led up to commit ed0097e4f. Users objected
>> the last time we tried to make index capabilities opaque at the SQL level,
>> so they're not going to like a design that tries to hide that information
>> even from the core C code.
>
> Discoverability is definitely important, but first we have to figure
> out how we're going to make it work, and then we can work out how to
> let users see how it works.

Reading through this thread I'm concerned that this appears to be a big
change making its first appearance in the last CF. There is also the
need for a new patch and a general consensus of how to proceed.

I recommend moving this patch to 2017-07 or marking it RWF.

Thanks,
--
-David
david(at)pgmasters(dot)net

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2017-03-02 15:04:44 Re: delta relations in AFTER triggers
Previous Message Tomas Vondra 2017-03-02 14:53:45 Re: multivariate statistics (v25)