Re: [PATCH] we have added support for box type in SP-GiST index

From: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>
To: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, emre(at)hasegeli(dot)com, Oleg Bartunov <obartunov(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Lebedev <a(dot)lebedev(at)postgrespro(dot)ru>
Subject: Re: [PATCH] we have added support for box type in SP-GiST index
Date: 2016-03-22 00:41:58
Message-ID: D4148745-6912-4DFC-9278-F0B7666E37A2@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> On 22 Mar 2016, at 01:47, Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> wrote:
>
> On 3/21/16 11:57 AM, Teodor Sigaev wrote:
>> A and B are points of intersection of lines. So, box PBCAis a bounding
>> box for points contained in 3-rd (see labeling above). For example X
>> labeled point is not a descendace of child node with centroid C because
>> it must be in branch of 1-st quad of parent node. So, each node (except
>> root) will have a limitation in its quadrant. To transfer that
>> limitation the traversalValue is used.
>
> Isn't this basically the same thing that the cube contrib module does? (Which has the added benefit of kNN-capable operators).

Cube and postgres own geometric types are indexed by building R-tree. While this is one of the most popular solutions it
degrades almost to linear search when bounding boxes for each index node overlaps a lot. This problem will arise when one will
try to index streets in the city with grid system — a lot of street's bounding boxes will just coincide with bounding box for whole city.

But that patch use totally different strategy for building index and do not suffer from above problem.

>
> If that's true then ISTM it'd be better to work on pulling cube's features into box?
>
> If it's not true, I'm still wondering if there's enough commonality here that we should pull cube into core…

There is also intarray module with very common functionality, but it also addressing different data pattern.

Optimal indexing and kNN strategy are very sensible to the data itself and if we want some general multidimensional search routines,
then I think none of the mentioned extensions is general enough. Cube barely applicable when dimensions number is higher than 10-20,
intarray performs bad on data with big sets of possible coordinates, this patch is also intended to help with specific, niche problem.

While people tends to use machine learning and regressions models more and more it is interesting to have some general n-dim indexing with kNN,
but I think it is different problem and should be solved in a different way.

Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2016-03-22 00:44:41 Re: Missing rows with index scan when collation is not "C" (PostgreSQL 9.5)
Previous Message Peter Geoghegan 2016-03-22 00:34:17 Re: Missing rows with index scan when collation is not "C" (PostgreSQL 9.5)