Re: [PATCH] kNN for btree

From: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] kNN for btree
Date: 2019-01-11 13:01:51
Message-ID: 146f4c47-cc52-a06d-343f-d152a20524e3@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Attached 5th version of the patches.

On 11.01.2019 2:21, Alexander Korotkov wrote:

> Hi!
>
> I've couple more notes regarding this patch.
> 1) There are two loops over scan key determining scan strategy:
> existing loop in _bt_first(), and in new function
> _bt_select_knn_search_strategy(). It's kind of redundant that we've
> to process scan keys twice for knn searches. I think scan keys
> processing should be merged into one loop.

This redundant loop was removed and code from _bt_select_knn_search_strategy()
was moved into the new function _bt_emit_scan_key() extracted from
_bt_preprocess_keys().

> 2) We're not supporting knn ordering only using the first key.
> Supporting multiple knn keys would require significant reword of
> B-tree traversal algorithm making it closer to GiST and SP-GiST. So,
> I think supporting only one knn key is reasonable restriction, at
> least for now. But we could support single-key knn ordering in more
> cases. For instance, knn search in "SELECT * FROM tbl WHERE a =
> const1 ORDER BY b <-> const2" could benefit from (a, b) B-tree index.
> So, we can support knn search on n-th key if there are equality scan
> keys for [1, n-1] index keys.
I will try to implement this in the next version of the patch.

> I also note that you've removed implementation of distance functions
> from btree_gist. But during pg_upgrade extensions are moved "as is".
> Not just CREATE EXTENSION command is dumped, but the whole extension
> content. pg_upgrade'd instances would have old version of extension
> metadata with new .so until ALTER EXTENSION UPDATE. So, user would get
> errors about missed function in .so until updates the extension.
>
> We're typically evade this by inclusion of old functions into new .so.
> Then user can work normally before extension update. In this
> particular case, we can leave the distance functions in the .so, but
> make them just wrappers over core functions.

Wrappers over core functions were left in btree_gist.

> I've run regression tests with patch applied and opr_sanity showed some errors:
>
> 1) date_dist_timestamptz(), timestamp_dist_timestamptz(),
> timestamptz_dist_date(), timestamptz_dist_timestamp() should be
> stable, not immutable. These functions use timezone during
> conversion.
>
Fixed.
> 2) date_dist_timestamp(), date_dist_timestamptz(),
> timestamp_dist_date(), timestamp_dist_timestamptz(),
> timestamptz_dist_date(), timestamptz_dist_timestamp() should be not
> leafproof. These functions perform conversion, which might fail in
> corner case. So, this error should be considered as a leak.

All new distance functions except oiddist() are not leakproof,
so I had to relax condition in opr_sanity.sql test:

- pp.proleakproof != po.proleakproof
+ (NOT pp.proleakproof AND po.proleakproof))

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-Fix-get_index_column_opclass-v05.patch text/x-patch 1.6 KB
0002-Introduce-ammatchorderby-function-v05.patch text/x-patch 21.8 KB
0003-Extract-structure-BTScanState-v05.patch text/x-patch 43.7 KB
0004-Add-kNN-support-to-btree-v05.patch text/x-patch 64.5 KB
0005-Add-btree-distance-operators-v05.patch text/x-patch 81.4 KB
0006-Remove-distance-operators-from-btree_gist-v05.patch text/x-patch 102.1 KB
0007-Add-regression-tests-for-kNN-btree-v05.patch text/x-patch 29.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2019-01-11 14:09:45 Re: port of INSTALL file generation to XSLT
Previous Message Amit Langote 2019-01-11 13:00:24 Re: speeding up planning with partitions