Re: rangetypes spgist questions/refactoring

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: rangetypes spgist questions/refactoring
Date: 2014-05-20 18:18:29
Message-ID: 1400609909.3741.61.camel@jdavis-laptop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 2014-05-20 at 09:52 -0700, Jeff Davis wrote:
> 2. Why would any tuple have 2 nodes? If there are some non-empty ranges,
> why not make a centroid and have 4 or 5 nodes?

This is slightly more complicated than I thought, because we need to do
something about the root node if a bunch of empty ranges are inserted
first.

SpgSplitTuple seems to offer a way to handle the first non-empty range
inserted. Unfortunately, this limitation seems to kill that idea:

"This new prefix value must be sufficiently less restrictive than the
original to accept the new value to be indexed, and it should be no
longer than the original prefix."

because there's no good way to know how large the root's prefix might
eventually be.

So we might be better off (not a proposal; this would break upgrade)
saying that the root always has two nodes:
* node0 points to all empty ranges, which all live at level 1 and have
allTheSame set.
* node1 points to all non-empty ranges, and every tuple in that
subtree has a prefix (centroid) and 4 nodes (one for each quadrant)

I am starting to see the current implementation as an optimization this
idea where the root can also have a centroid if you can find one, which
can save an extra level in the tree search.

If my analysis is correct so far, and the assertions are correct, my
proposal is something like:

* remove dead code
* refactor to make the invariants a little more clear
* make the special case of the root tuple more clear
* improve comments describing tree structure and add assertions

I think this can be done without breaking upgrade compatibility, because
I think the structure already satisfies the invariants I mentioned in
the other email (aside from the special case of a root tuple with two
nodes and no prefix). That being said, it's a little scary to refactor
indexing code while trying to keep it upgrade-compatible.

Thoughts?

Regards,
Jeff Davis

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2014-05-20 18:23:30 jsonb failed assertions
Previous Message Bruce Momjian 2014-05-20 18:09:39 Re: 9.4 release notes