Re: WIP: SP-GiST, Space-Partitioned GiST

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: SP-GiST, Space-Partitioned GiST
Date: 2011-09-22 10:05:33
Message-ID: 4E7B086D.5050602@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 06.09.2011 20:34, Oleg Bartunov wrote:
> Here is the latest spgist patch, which has all planned features as well as
> all overhead, introduced by concurrency and recovery, so performance
> measurement should be realistic now.

I'm ignoring the text suffix-tree part of this for now, because of the
issue with non-C locales that Alexander pointer out.

Regarding the quadtree, have you compared the performance of that with
Alexander's improved split algorithm? I ran some tests using the test
harness I still had lying around from the fast GiST index build tests:

testname | time | accesses | indexsize
-------------------------+-----------------+----------+-----------
points unordered auto | 00:03:58.188866 | 378779 | 522 MB
points ordered auto | 00:07:14.362355 | 177534 | 670 MB
points unordered auto | 00:02:59.130176 | 46561 | 532 MB
points ordered auto | 00:04:00.50756 | 45066 | 662 MB
points unordered spgist | 00:03:05.569259 | 78871 | 394 MB
points ordered spgist | 00:01:46.06855 | 422104 | 417 MB
(8 rows)

These tests were with a table with 7500000 random points. In the
ordered-tests, the table is sorted by x,y coordinates. 'time' is the
time used to build the index on it, and 'accesses' is the total number
of index blocks hit by a series of 10000 bounding box queries, measured
from pg_statio_user_indexes.idx_blks_hit + idx_blks_read.

The first two tests in the list are with a GiST index on unpatched
PostgreSQL. The next six tests are with Alexander's double-sorting split
patch. The last two tests are with an SP-GiST index.

It looks like the query performance with GiST using the double-sorting
split is better than SP-GiST, although the SP-GiST index is somewhat
smaller. The ordered case seems pathologically bad, is that some sort of
a worst-case scenario for quadtrees?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message MUHAMMAD ASIF 2011-09-22 10:51:11 Re: PostgreSQL X/Open Socket / BSD Socket Issue on HP-UX
Previous Message Gunnlaugur Þór Briem 2011-09-22 09:43:25 Re: Constraint exclusion on UNION ALL subqueries with WHERE conditions