Re: [QUESTION/PROPOSAL] loose quadtree in spgist

From: Peter Griggs <petergriggs33(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [QUESTION/PROPOSAL] loose quadtree in spgist
Date: 2020-01-16 05:32:34
Message-ID: CACEwj4rq3X2PjV5_6gveAhj0QSso=C64kEwprzg--Vq7OCf0nw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

As an update, you were totally right Tom, SPGIST loads all of the tuples
onto the root page which doesn't call picksplit until there's a page's
worth of data. I just wasn't inserting enough tuples to see my elog values
appear in the log, but now I am and they do!

The hint from Tomas before to use the CREATE OPERATOR CLASS command was
spot on. That documentation lead me to this page (
https://www.postgresql.org/docs/11/xindex.html), which looked like the sql
I need to include in the extension to get the new loose quadtree index to
build. What I did was create a "loose_quadtree" folder for my extension in
the /contrib folder and followed the format of another extension by
including a Makefile, loose_quadtree.control file, loose_quadtree.sql, and
my loose_quadtree.c file, in which I just put copies of the box quadtree
functions from geo_spgist.c with a bit of extra logging code. Now, I can
build the extension with make and then run 'CREATE EXTENSION
loose_quadtree;', then index using 'spgist(b loose_quadtree_ops)' operator
class! Now, I have to actually figure out how to change the logic within
the functions from geo_spgist.c.

I was wondering if you had advice on how to handle implementing insertion
for the loose quadtree index. For some background, a loose quadtree is
similar to a quadtree over boxes, except that the length of a side becomes
k*L where k>1. Throughout this, I assume that our space is a square (take
the minimum bounding square of all of the boxes). Usually, a value of K=2
is used. Since, each loose quadtree cell is 2x its normal size, a given
level can hold any object that has a radius of <=1/4 of the cell side
length, regardless of the object's position. We can do a bit of math and
figure out what level an object should be inserted into the tree in O(1)
time. I'm including a picture of the level selection algorithm below, but
its just a re-formulation of what i've said above. My overall question is
how to do this in spgist. From what I understand in the spgist insertion
algorithm, the level selection would should done in the choose() function
because choose() is called when we are trying to insert a leaf tuple into a
inner tuple that has one or more levels under it. Currently, it seems like
the spg_box_quad_choose() function descends recursively into a quadtree
node. What I would like to do is to have it jump straight to the level it
wants to insert into using the loose quadtree level selection algorithm and
then find which cell it should add to by comparing its center coordinates.

[Following image from: Thatcher Ulrich. Loose octrees. In Mark DeLoura,
editor, Game Programming Gems, pages 444–453. Charles River Media, 2000]

[image: Screen Shot 2020-01-14 at 6.15.09 PM.png]

Best,
Peter

On Wed, Jan 8, 2020 at 3:07 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Peter Griggs <petergriggs33(at)gmail(dot)com> writes:
> > In the getQuadrant function in the file
> src/backend/utils/adt/geo_spgist.c,
> > I only added some elog statements to see the quadrant that a box is
> placed
> > into using the current code. getQuadrant is called several times by the
> > spg_box_quad_picksplit function, which is used when inserting into the
> > quadtree. With this change, I can still build postgres but when I try to
> > trigger the code, nothing gets printed to my logfile.
>
> Perhaps you're looking in the wrong logfile. elog(LOG) should definitely
> produce output unless you're using very strange settings.
>
> Another possibility is that the specific test case you're using doesn't
> actually reach this function. I'm not totally sure, but I think that
> SPGiST might not call the datatype-specific choose or picksplit functions
> until it's got more than one index page's worth of data.
>
> > elog(LOG, "BOX (minx, miny) = (%d, %d)\n", centroid->low.x,
> centroid->low.y);
>
> A couple thoughts here:
>
> * From memory, the x and y values of a BOX are float8, so don't you want
> to use %g or %f instead of %d?
>
> * You don't want to end an elog with \n, that'll just add extra blank
> lines to the log.
>
> regards, tom lane
>

--
Peter Griggs
Masters of Engineering (Meng) in Computer Science
Massachusetts Institute of Technology | 2020

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-01-16 05:40:31 Re: TRUNCATE on foreign tables
Previous Message Kyotaro Horiguchi 2020-01-16 05:20:57 Re: [HACKERS] WAL logging problem in 9.4.3?