Quick Links

Re: KNNGiST for knn-search

From:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: KNNGiST for knn-search
Date:	2009-11-24 12:37:29
Message-ID:	4B0BD389.5000509@enterprisedb.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Teodor Sigaev wrote:
>>> 1. KNNGiST is about 5% slower than GiST on non-knn search queries, like
>>> contains or contained by, because of some overhead of new algorithm of
>>> tree traversal
>>
>> Is it possible to use the regular GiST traversal algorithm on a
>> KNNGiST-tree, when performing regular GiST searches that don't require a
>> particular order?
> New algorithm works much more with memory for allocation/free to manage
> lists and it's a single reason of performance loss. Choosing of
> algorithm could not be done by consistent function, it should be done at
> least in amrescan method or even earlier - in planner.

Ok, that sounds good. The bottom line is that you can use the same
on-disk tree with both algorithms. No need for a separate indexam in
that case.

> One idea:
> SELECT p FROM pt WHERE p << '5.0,5.0'::point ORDER BY (p <->
> '5.0,5.0'::point) DESC LIMIT 10;
> And add <-> to opclass (but for now any indexable operation should
> return boolean type).

You really shouldn't need to have a WHERE clause.

> Of course, KNNGiST should be modified to support
> not only k-nearest search but k-"farest" search and NULLS LAST/FIRST.

Well, as long as the planner knows the capabilities of the indexam, it
can just fall back to a seqscan+sort if the query can't be sped up with
the index.

> And now you can specify p >< 'one point' AND p >< 'another
> point', but it's impossible to do that by ORDER BY clause.

Huh, what does that mean? Is it like "ORDER BY (min( p >< 'one point', p
>< 'another point')" ?

> Second idea with non-standard syntax.
> SELECT ... ORDER BY PROXIMITY OF expression[, expression [..]] TO
> expression[, expression [..]] USING [operator [, operator [..]]
> and operator is distance operator, i.e. it's not a member of btree
> opclass, but returns non-negative float8 value.
>
> Without index it will be essentially the same as
> ORDER BY expression operator expression[ + ..] DESC NULLS LAST

We already have the syntax to represent the query, using ORDER BY. IMHO
we just need to teach the planner that when it sees a query like that,
it can use a GiST index to speed it up. A number of indexam and operator
class API changes are probably required, but it should be invisible to
the user.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Re: KNNGiST for knn-search at 2009-11-24 11:49:40 from Teodor Sigaev

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Daniel Farina	2009-11-24 13:00:30	Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
Previous Message	Pavel Stehule	2009-11-24 12:37:18	Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION