Re: SP-GiST support for inet datatypes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: emre(at)hasegeli(dot)com
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SP-GiST support for inet datatypes
Date: 2016-08-24 14:11:07
Message-ID: 15201.1472047867@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> I've pushed this patch with mostly (not entirely) cosmetic adjustments.
> I still think it'd be worth looking into why the produced index is larger
> than the GiST equivalent, and for that matter exactly why the GiST
> equivalent is so much slower to search.

I spent some time poking at this, and noticed that although the picksplit
rule is guaranteed to produce a non-degenerate split unless the inputs
are all identical, it's entirely capable of producing a very bad split.
Here's some instrumentation printout showing the numbers of items assigned
to the four subnodes, for the test case we've been working with:

58 inet picksplit (227 tuples): 0 0 127 100
65 inet picksplit (227 tuples): 0 0 223 4
69 inet picksplit (225 tuples): 2 0 126 97
69 inet picksplit (227 tuples): 1 0 226 0
72 inet picksplit (227 tuples): 0 0 224 3
72 inet picksplit (227 tuples): 1 0 132 94
82 inet picksplit (226 tuples): 1 0 127 98
86 inet picksplit (227 tuples): 0 0 130 97
95 inet picksplit (227 tuples): 0 0 2 225
99 inet picksplit (227 tuples): 0 0 225 2
117 inet picksplit (227 tuples): 2 0 126 99
118 inet picksplit (227 tuples): 0 0 128 99
137 inet picksplit (227 tuples): 0 0 226 1
151 inet picksplit (227 tuples): 0 0 1 226
270 inet picksplit (227 tuples): 1 0 127 99
499 inet picksplit (122 tuples): 0 0 64 58

(This is from "sort | uniq -c" output, so the first column is the number
of occurrences of identical split counts.)

Aside from the disturbing frequency of 100-to-1 split ratios, it also
looks like the inclusion of the masklen bit is hardly pulling its weight,
though that might be a artifact of this data set.

The bad splits seem to be a direct contributor to the index's relatively
poor space efficiency; they lead to SPGiST having to move nearly all of
a long leaf chain to another page, and then soon having to split it again,
resulting in another mostly-empty page, lather rinse repeat. They can't
be very helpful for search speed either.

I think that it'd be worth investigating changing to a split rule that
uses the next two or three or four bits of the address, not just the
single next bit, to index into more than four subnodes. It's pretty clear
how to adjust the insertion functions to support that, and a crude hack in
that line suggested that the index does get noticeably smaller. However,
I didn't quite grok how to adjust the search (consistent) functions.
Obviously we could apply the same rules in a loop, considering each
successive address bit, but there may be a better way.

Even though I already committed the code, we're a very long way from
v10 release, so I see no reason not to consider incompatible changes
in the way this opclass works.

I also noticed that the index build wastes some space because SPGIST
remembers only one partly-full page within each of its page categories.
A prototype patch to enlarge the size of that cache helped a good bit
too. I'll clean that up and post it later, but I was hoping you'd
work on improving this opclass's split rules.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2016-08-24 14:15:59 Re: pg_dump with tables created in schemas created by extensions
Previous Message Tom Lane 2016-08-24 13:42:56 Re: Showing parallel status in \df+