Re: knngist patch support

From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 21:20:25
Message-ID: m28waw4yue.fsf@hi-media.com
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>> I see your point. May be it's better to introduce new system table? pg_amorderop
>> to store ordering operations for index.
>
> We could, but that approach doesn't scale to wanting more categories
> in the future --- you're essentially decreeing that every new category
> of opclass-associated operator will require a new system catalog,
> along with all the infrastructure needed for that. That guarantees
> that the temptation to take shortcuts will remain high.

On the other hand, here's how the fine manual defines an operator class:

Operator classes are so called because one thing they specify is the
set of WHERE-clause operators that can be used with an index (i.e.,
can be converted into an index-scan qualification). An operator class
can also specify some support procedures that are needed by the
internal operations of the index method, but do not directly
correspond to any WHERE-clause operator that can be used with the
index.

> If we didn't already have the plus/minus-for-WINDOW-RANGE example
> staring us in the face, I might think that an extensible solution
> wasn't needed here ... but we do so I think we really need to allow
> for multiple categories in some form.

Agreed.

And we're talking about the basic operators + and -, and about a
distance or metric operator. Those remind me of groups and related
structures:

http://en.wikipedia.org/wiki/Group_(mathematics)
http://en.wikipedia.org/wiki/Abelian_group
http://en.wikipedia.org/wiki/Ring_(mathematics)
http://en.wikipedia.org/wiki/Metric_space

A group is defined by a data type, an operator (+), and an identity
element; the operation is associative and each element of the data type
has an inverse. If the group is abelian, the operation is also
commutative.
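
To make the axioms concrete, here is a minimal sketch that brute-force
checks them on a small finite set. The helper names (is_group,
is_abelian) and the choice of integers modulo 5 are illustrative
assumptions, not anything from PostgreSQL:

```python
from itertools import product


def is_group(elements, op, identity):
    """Brute-force check of the group axioms on a small finite set."""
    elements = list(elements)
    # Closure: combining two elements stays inside the set.
    closed = all(op(a, b) in elements
                 for a, b in product(elements, repeat=2))
    # Associativity: (a + b) + c == a + (b + c).
    associative = all(op(op(a, b), c) == op(a, op(b, c))
                      for a, b, c in product(elements, repeat=3))
    # Identity: e + a == a + e == a.
    has_identity = all(op(identity, a) == a and op(a, identity) == a
                       for a in elements)
    # Inverses: every a has some b with a + b == b + a == e.
    has_inverses = all(
        any(op(a, b) == identity and op(b, a) == identity for b in elements)
        for a in elements)
    return closed and associative and has_identity and has_inverses


def is_abelian(elements, op):
    """Commutativity on top of the group axioms: a + b == b + a."""
    return all(op(a, b) == op(b, a)
               for a, b in product(elements, repeat=2))


def add_mod5(a, b):
    return (a + b) % 5


def sub_mod5(a, b):
    return (a - b) % 5


Z5 = range(5)
print(is_group(Z5, add_mod5, 0), is_abelian(Z5, add_mod5))
print(is_group(Z5, sub_mod5, 0))  # subtraction is not associative
```

Addition modulo 5 passes every check, while subtraction on the same
set fails associativity, which is why - is better seen as derived from
+ than as a group operation of its own.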

Then there's the metric space, which is a data type with a distance
function. This function must be non-negative, symmetric, zero only
between identical elements, and must satisfy the triangle inequality.
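
As a rough illustration of those requirements, the sketch below tests
a candidate distance function against the usual metric axioms on a
handful of sample points. The function names and the sample data are
assumptions made up for this example, and passing on a sample is of
course no proof for the whole type:

```python
from itertools import product
from math import hypot, isclose


def euclidean(p, q):
    """Euclidean distance between two 2-D points given as tuples."""
    return hypot(p[0] - q[0], p[1] - q[1])


def squared(p, q):
    """Squared Euclidean distance: symmetric, but not a metric."""
    return euclidean(p, q) ** 2


def looks_like_metric(points, dist, tol=1e-9):
    """Check the metric axioms on the given sample points only."""
    for p, q in product(points, repeat=2):
        d = dist(p, q)
        if d < 0:                       # non-negativity
            return False
        if (d == 0) != (p == q):        # zero exactly when p == q
            return False
        if not isclose(d, dist(q, p), abs_tol=tol):  # symmetry
            return False
    for p, q, r in product(points, repeat=3):
        # Triangle inequality: d(p, r) <= d(p, q) + d(q, r).
        if dist(p, r) > dist(p, q) + dist(q, r) + tol:
            return False
    return True


sample = [(0, 0), (1, 0), (2, 0), (3, 4)]
print(looks_like_metric(sample, euclidean))
print(looks_like_metric(sample, squared))
```

The squared distance fails on the collinear points (0,0), (1,0),
(2,0): going direct costs 4 while going through the middle costs 2.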

So I guess what we need here is an Operator Group to define our plus and
minus operators, and the fact that it's a group says (by convention,
like the total ordering of a BTree) that the + is commutative and the -
is its inverse. Or we have an "option" called abelian for specifying the
commutativity?

Then I'm for bending the maths to say that our notion of a metric space
will apply only to our notion of an Operator Group, unless someone
really insists on separating the concepts. So an Operator Group defines
2 strategies that you have to attach to your + and - operators, and an
optional support function, which is the distance.
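
To picture the shape of such an object, here is a purely hypothetical
sketch. Nothing in it is an existing PostgreSQL catalog or API: the
class name, the strategy numbers 1 and 2, and the int4 example are all
assumptions made up for illustration:

```python
import operator
from dataclasses import dataclass
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")

# Hypothetical strategy numbers, chosen only for this sketch.
PLUS_STRATEGY = 1
MINUS_STRATEGY = 2


@dataclass(frozen=True)
class OperatorGroup(Generic[T]):
    """Two operator strategies plus an optional distance support function."""
    name: str
    plus: Callable[[T, T], T]
    minus: Callable[[T, T], T]
    identity: T
    distance: Optional[Callable[[T, T], float]] = None

    def operator_for(self, strategy: int) -> Callable[[T, T], T]:
        """Look up the operator attached to a strategy number."""
        if strategy == PLUS_STRATEGY:
            return self.plus
        if strategy == MINUS_STRATEGY:
            return self.minus
        raise ValueError(f"unknown strategy {strategy}")

    def supports_distance_ordering(self) -> bool:
        """True when the optional distance support function is present."""
        return self.distance is not None


# Hypothetical group over plain integers.
int4_group = OperatorGroup(
    name="int4_group",
    plus=operator.add,
    minus=operator.sub,
    identity=0,
    distance=lambda a, b: float(abs(a - b)),
)

print(int4_group.operator_for(PLUS_STRATEGY)(2, 3))
print(int4_group.supports_distance_ordering())
```

Keeping the distance optional mirrors the idea that a type can have +
and - for things like window ranges without necessarily being usable
for distance-ordered scans.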

Now, we want to have Groups able to work on more than one datatype, for
example talking about the distance of a point to a circle or a box. So
we need an Operator Group Family, don't we?
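
As a hedged illustration of the cross-type case, the sketch below keys
distance functions by a pair of types, the way a family would group
them. The Point and Circle classes, the FAMILY table and
family_distance are hypothetical names for this example only:

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Point:
    x: float
    y: float


@dataclass(frozen=True)
class Circle:
    center: Point
    radius: float


def point_point(a, b):
    """Distance between two points."""
    return hypot(a.x - b.x, a.y - b.y)


def point_circle(p, c):
    """Distance from a point to a circle; 0 if the point is inside it."""
    return max(0.0, point_point(p, c.center) - c.radius)


# Hypothetical family: one distance function per (left, right) type pair.
FAMILY = {
    (Point, Point): point_point,
    (Point, Circle): point_circle,
}


def family_distance(left, right):
    """Dispatch on the argument types, trying the swapped order as well."""
    key = (type(left), type(right))
    if key in FAMILY:
        return FAMILY[key](left, right)
    swapped = (type(right), type(left))
    if swapped in FAMILY:
        return FAMILY[swapped](right, left)
    raise TypeError(f"no distance registered for {key}")


print(family_distance(Point(0, 0), Circle(Point(3, 4), 2)))
```

Trying the swapped pair relies on the distance being symmetric, which
is the same convention the group is assumed to carry.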

How much does all that help in this case?
--
dim

PS: I'm sad to be having this discussion after reading that it's too
late for 9.0. The KNN-Gist stuff with extended ORDER BY indexing was too
good to be true.