Re[2]: [HACKERS] GSoC 2015: SP-GIST for geometrical objects

From: Dima Ivanovskiy <dima-iv(at)mail(dot)ru>
To: Arthur Silva <arthurprs(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re[2]: [HACKERS] GSoC 2015: SP-GIST for geometrical objects
Date: 2015-03-27 21:41:42
Message-ID: 1427492502.579728668@f337.i.mail.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


>On Mar 27, 2015 11:08 AM, "Dima Ivanovskiy" < dima-iv(at)mail(dot)ru > wrote:
>>
>> Hello, I am Dmitrii, student of Moscow Institute of Physics and Technology
>>
>> Abstract:
>>
>> I chose project "Indexing prolonged geometrical objects (i.e. boxes, circles, polygons, not points) with SP-GiST by mapping to 4d-space".
>> According to the presentation
>> https://www.pgcon.org/2011/schedule/attachments/197_pgcon-2011.pdf
>> SP-GIST 3 times faster than GiST in some cases. But GIST supports geometrical data types:
>> box, circle, polygon with operators: && &> &< &<| >> << <<| <@ @> @ |&> |>> ~ ~=
>> Popular spatial extension PostGIS doesn't include SP-GIST, but has a lot of geometrical features.
>>
>> Project details:
>>
>> After meeting with Alexander Korotkov, I wrote some plan.
>> Using of K-D-tree and Quadtree in building index for geometrical data types can increase speed of search in some cases.
>> The main idea is representing 2-D geometrical objects in their bounding box. Set of 2-D boxes is 4-D space.
>> New _ops will work with points from 4-D space, for example kd_box_ops, quad_circle_ops and will support all geometrical operators.
>> After conversion object to their bounding box algo has set of tuples (x1, y1, x2, y2).
>> Our goal is separate this space the most equally. If we talk about K-D-tree, on first step K-D-tree algorithm will split space in 2 parts by the first coordinate, in next step by the second coordinate etc., after 4-th coordinate we repeat this procedure.
>> At the end we have index at geometrical objects and use traversal tree for every search operator.
>>
>> Postgresql has already has realization ideas of MBR in gist/gistproc.c. So I will transfer this realization to other type of tree.
>>
>> Of cource, I assume that SP-GIST can be not the best decision of this problem. So after testing this clear methods, I will try to find more effective way. Maybe with using combination of different spatial tree structures.
>>
>> Project Schedule:
>>
>> until May 25
>>
>> Read documentation and source code, clarify details of implementation.
>>
>> 1st month
>>
>> Implement new '_ops' with all geometrical operators for box, circle, polygon
>>
>> 2nd month
>>
>> Research new methods for increase speed of geometrical query
>>
>> 3rd month
>>
>> Final refactoring, testing and submitting a patch.
>>
>>
>> Links:
>>
>> http://www.sai.msu.su/~megera/postgres/talks/gist_tutorial.html - about GIST
>> https://toster.ru/q/27135#answer_110197 - people need SP-GIST for cubes
>> http://www.slideshare.net/profyclub_ru/o-lt - presentation about indexes
>> http://pgconf.ru/static/presentations/2015/korotkov_spatial.pdf - working with geo objects
>>
>Nice proposal.
>Dynamic Kdtrees can perform badly as the splitting median can get way off as updates are coming. What are your thoughts about that?
>Also what's up with the 4d space? I don't quite get it. These types are 2 or 3 dimensions.
I read spgist README one more time . I didn't find the mechanism for maintaining good balance after updates.
I think we can use Bkd-Tree, https://www.cs.duke.edu/~pankaj/publications/papers/bkd-sstd.pdf . But It can be not the best solving.
I include Research time in 2nd month of timeline.

About 4d space. All these types are 2 dimensional.
Just as i n R-tree object is approximated by MBR. MBR for 2d-objects can be mapped to 4d-point. More general, nd-object MBR can be mapped into 2nd-point.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2015-03-27 21:44:40 Rounding to even for numeric data type
Previous Message Alexander Korotkov 2015-03-27 19:49:53 Re: GSoC 2015 proposal. Bitmap Index-only Count