Re: SP-GiST failing to complete SP-GiST index build

From: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SP-GiST failing to complete SP-GiST index build
Date: 2018-05-28 04:38:55
Message-ID: 6A4336E2-2DC9-403E-84F9-FBE3AABCA19D@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> On May 27, 2018, at 8:24 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Sun, May 27, 2018 at 5:10 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Instrumenting the test case suggests that getQuadrant pretty much always
>> returns 1, resulting in a worst-case unbalanced SPGiST tree. I think this
>> is related to the fact that the test case inserts the values in increasing
>> order, so that new values are always greater than existing values in the
>> index.
>
> I suspected the same. It reminded me of the weird behavior that the
> Postgres qsort() sometimes exhibits.
>
>> SPGiST is unable to rebalance its tree on the fly, so it's pretty
>> well screwed in this situation. It does finish eventually, but in about
>> 50x longer than GiST. I imagine the index's query performance would be
>> equally awful.
>
> Can you think of some way of side-stepping the issue? It's unfortunate
> that SP-GiST is potentially so sensitive to input order.

To help with the testing, I’ve attached two more scenarios, labeled
“good2” and “bad2” below. The premise is similar, except that I start
with empty tables with indexes already created.

The workload in “bad2” is what you may see in the real world with
proper DBA planning (i.e. I have my indexes in place before I start
collecting data) with scheduling applications or anything with an increasing
time series.

The timing results I found were similar to the initial example posted, with
me giving up on the last scenario (I do not have the same patience as
Peter).

FWIW I have used SP-GiST indexes before with datasets similar to how
“bad2” is generated (though not nearly as dramatic as the upward increase
seen in the range) and have not run across this issue.

Jonathan

Attachment Content-Type Size
bad2.sql application/octet-stream 1.6 KB
good2.sql application/octet-stream 1.6 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2018-05-28 07:06:05 [PATCH] We install pg_regress and isolationtester but not pg_isolation_regress
Previous Message James Sewell 2018-05-28 04:36:49 Re: Undo logs