Re: [PATCH] we have added support for box type in SP-GiST index

From: Emre Hasegeli <emre(at)hasegeli(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Lebedev <a(dot)lebedev(at)postgrespro(dot)ru>
Subject: Re: [PATCH] we have added support for box type in SP-GiST index
Date: 2016-03-24 14:56:59
Message-ID: CAE2gYzxwWbTcVrAgz_BGw2iXGWtWveJ-fZURwgmF8GMis0uTDg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> + * boxtype_spgist.c

The names on the file header need to be changed, too.

> I'll try to explain with two-dimensional example over points. ASCII-art:
> |
> |
> 1 | 2
> |
> -----------+-------------
> |P
> 3 | 4
> |
> |
>
> '+' with 'A' represents a centroid or, other words, point which splits plane
> for 4 quadrants and it strorend in parent node. 1,2,3,4 are labels of
> quadrants, each labeling will be the same for all pictures and all
> centriods, and i will not show them in pictures later to prevent too
> complicated images. Let we add C - child node (and again, it will split
> plane for 4 quads):
>
>
> | |
> ----+---------+---
> X |B |C
> | |
> -----------+---------+---
> |P |A
> | |
> |
> |
>
> A and B are points of intersection of lines. So, box PBCAis a bounding box
> for points contained in 3-rd (see labeling above). For example X labeled
> point is not a descendace of child node with centroid C because it must be
> in branch of 1-st quad of parent node. So, each node (except root) will have
> a limitation in its quadrant. To transfer that limitation the traversalValue
> is used.
>
> To keep formatting I attached a txt file with this fragment.

Thank you for the explanation. Should we incorporate this with the patch.

>>> + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development
>>> Group
>>
>> 2016.
>
> fixed

Not on the patch.

> + cmp_double(const double a, const double b)

Does this function necessary? We can just compare the doubles.

> + boxPointerToRangeBox(BOX *box, RangeBox * rectangle)

The "rectangle" variable in here should be renamed.

> + Assert(is_infinite(b) == 0);

This is failing on my test. You can reproduce with the script I have sent.

>> Wouldn't it be simpler to palloc and return the value on those functions?
>
> evalRangeBox() initializes part of RectBox, so, it could not palloc its
> result.
> Other methods use the same signature just for consistency.

I couldn't understand it. evalRangeBox() can palloc and return the
result. evalRectBox() can return the result palloc'ed by
evalRangeBox(). The caller wouldn't need to palloc anything.

>> Many variables are defined with "const". I am not sure they are all
>> that doesn't change. If it applies to the pointer, "out" should also
>> not change in here. Am I wrong?
>
> No, changes

I now read about "const". I am sorry for not being experienced in C.
The new version of the patch marks them as "const" by mistake.

I went through all other variables:

> + int r = is_infinite(a);
>
> + double x = *(double *) a;
> + double y = *(double *) b;
>
> + spgInnerConsistentIn *in = (spgInnerConsistentIn *) PG_GETARG_POINTER(0);
>
> + spgLeafConsistentIn *in = (spgLeafConsistentIn *) PG_GETARG_POINTER(0);
>
> + BOX *leafBox = DatumGetBoxP(in->leafDatum);

Shouldn't they be "const", too?

>>> + /*
>>> + * Begin. This block evaluates the median of coordinates of boxes
>>> + */
>>
>> I would rather explain what the function does on the function header.
>
> fixed

The "end" part of it is still there.

>> Do we really need to copy the traversalValues on allTheSame case.
>> Wouldn't it work if just the same value is passed for all of them.
>> The search shouldn't continue after allTheSame case.
>
> Seems, yes. spgist tree could contain a locng branches with allTheSame.

Does SP-GiST allows any node under allTheSame to not being allTheSame?
Not setting traversalValues for allTheSame worked fine with my test.

> + if (in->allTheSame)

Most of the things happening before this check is not used in there.
Shouldn't we move this to the top of the function?

> + out->nodeNumbers = (int *) palloc(sizeof(int) * in->nNodes);

This could go before allTheSame block.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2016-03-24 15:04:31 Re: NOT EXIST for PREPARE
Previous Message Tom Lane 2016-03-24 14:31:57 Re: PostgreSQL 9.6 behavior change with set returning (funct).*