Re: knngist - 0.8

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Subject: Re: knngist - 0.8
Date: 2011-03-01 17:58:36
Message-ID: 4D6D33CC.9060802@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> I did a quick look at this patch. The major problem with it is of
> course that it needs to be fixed for the recent extension-related
> changes. I transposed the .sql.in changes into additions to
> btree_gist--1.0.sql (attached), but haven't really sanity-checked
> them beyond checking that the regression tests pass. The same mods
> would need to be made in btree_gist--unpackaged--1.0.sql.

Fixed

> 1. oid_dist() returns oid ... really? Oid is unsigned. I'd be inclined
> to argue though that distance between Oids is a meaningless concept, so
Hmm, oid is often used as unsigned int.

> you should remove this not just mess with the result type. Anybody who
> actually wants to form a distance between Oids should have to cast them
> to an arithmetic type first. Let the user figure out how wraparound
> cases should be handled.

Distance between unsigned 32-bit integers could not be more than 2^32.

>
> 2. Beyond that, none of the distance routines have given any thought to
> avoiding overflow. For instance, dist_int2 had better return something
> wider than int2, and so on up. It looks to me like the internal gist
Just like other operations:
# select 32000::smallint + 32000::smallint;
ERROR: smallint out of range

> distance functions also suffer overflow risks, in that they tend to form
> the difference first (in the source datatype) and only afterwards cast
> to float8.
fixed

> 3. I was surprised that there wasn't a distance implementation for
> numeric. I suppose that this might be difficult to do without risking
> overflow in conversion to float8, though.
Exactly

> 4. I didn't much care for changing the result type of gbt_num_consistent
> from bool to float8; that's just messy, and I don't see any compensating
> advantage. I suggest you leave gbt_num_consistent and its callers
> alone, and add a separate gbt_num_distance routine that only handles the
> KNNDistance case.
Done

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
builtin_knngist_contrib_btree_gist-0.12.gz application/x-tar 10.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-03-01 18:15:34 Is the attribute options cache actually worth anything?
Previous Message Kevin Grittner 2011-03-01 17:43:58 Re: SSI bug?