Skip site navigation (1) Skip section navigation (2)

Re: KNNGiST for knn-search (WIP)

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: KNNGiST for knn-search (WIP)
Date: 2009-11-26 16:34:21
Message-ID: 4B0EAE0D.606@sigaev.ru (view raw or flat)
Thread:
Lists: pgsql-hackers
Hi!

Contrib module is reworked as a patch for current GiST. Now GiST supports 
KNN-search, the query looks like
SELECT * FROM point_tbl ORDER BY f1 <-> '0,1';
or
SELECT * FROM point_tbl WHERE f1 <@ '(-10,-10),(10,10)':: box ORDER BY f1 <-> '0,1';
Plans are:
EXPLAIN (COSTS OFF)
SELECT * FROM point_tbl ORDER BY f1 <-> '0,1';
                QUERY PLAN
-----------------------------------------
  Index Scan using gpointind on point_tbl
    Index Cond: (f1 <-> '(0,1)'::point)
EXPLAIN (COSTS OFF)
SELECT * FROM point_tbl WHERE f1 <@ '(-10,-10),(10,10)':: box ORDER BY f1 <-> '0,1';
                                   QUERY PLAN
------------------------------------------------------------------------------
  Index Scan using gpointind on point_tbl
    Index Cond: ((f1 <@ '(10,10),(-10,-10)'::box) AND (f1 <-> '(0,1)'::point))


pg_am now has new column amcanorderbyop (can order by operation), indexes with 
this flag enabled can be used to speedup operations, which returns non-boolean 
value, currently type of returned value should have default btree operator class 
  to perform sort.

Planner (find_usable_indexes function, actually) could push pathkey expression 
into restriction clauses of index. I'm not fully satisfied with this piece of 
code, but, may be, someone has a better idea. I though about adding separate 
indexorderquals in IndexPath structure...

Both GiST's get methods are optimized and there is no overhead, since gistrescan 
method can choose what traversal algorithm to use using information about types 
of values returned by operations. If at least one of them returns non-boolean 
result, then KNN-search will be performed.

The only change in interface of supporting functions is: consistentFn function 
could return float8 non-negative value and it's mandatory to perform KNN-search. 
Old-style consistent functions are supported.

Patch contains (it still requires rbtree-0.5 and point_ops-0.4 patches):
- GiST changes
- changes in point_ops to support knn-search
- contrib/pg_trgm now has new operation <-> returns distance between texts. This 
operation is supported in KNN-search
- contrib/btree_gist provides <-> and its support for GiST for types int2, int4, 
int8, float4, float8, money, oid, interval, time, date, timestamp and timestamptz

TODO:
- selectivity of ordering operation should be 1.0
- current patch remove support of  IndexScanDesc->kill_prior_tuple, it's needed 
to restore support if it will not create too big overhead
- documentation
-- 
Teodor Sigaev                                   E-mail: teodor(at)sigaev(dot)ru
                                                    WWW: http://www.sigaev.ru/

Attachment: builtin_knngist-0.4.gz
Description: application/x-tar (25.7 KB)

In response to

Responses

pgsql-hackers by date

Next:From: Chris BrowneDate: 2009-11-26 16:36:01
Subject: Re: cvs chapters in our docs
Previous:From: Tom LaneDate: 2009-11-26 16:14:54
Subject: Re: ecpg.addons

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group