Re: WIP: Fast GiST index build

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Fast GiST index build
Date: 2011-06-06 12:00:28
Message-ID: BANLkTimnkr-ECezSXDF6ZNkt1r1KNnajPA@mail.gmail.com

Hi!

On Mon, Jun 6, 2011 at 2:51 PM, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:

> On 06.06.2011 10:42, Heikki Linnakangas wrote:
>>
>> I ran another test with a simple table generated with:
>>
>> CREATE TABLE pointtest (p point);
>> INSERT INTO pointtest SELECT point(random(), random()) FROM
>> generate_series(1,50000000);
>>
>> Generating a gist index with:
>>
>> CREATE INDEX i_pointtest ON pointtest USING gist (p);
>>
>> took about 15 hours without the patch, and 2 hours with it. That's quite
>> dramatic.
>>
>
> Oops, that was a rounding error, sorry. The run took about 2.7 hours with
> the patch, which of course should be rounded to 3 hours, not 2. Anyway, it
> is still a very impressive improvement.
>
I have similar results on 100 million rows: 21.6 hours without the patch and
2 hours with it. But I found a problem: index quality is worse. See the
following query plans. In these tests, test is the relation where the index was
created in the ordinary way, and test2 is the relation where the patch was used.
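
For reference, the plans below can be reproduced with queries of roughly this
form (the column name v and the box literal are taken from the Index Cond
lines; the exact statements may have differed slightly):

EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM test  WHERE v <@ '(0.903,0.203),(0.9,0.2)'::box;
EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM test2 WHERE v <@ '(0.903,0.203),(0.9,0.2)'::box;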

                                                       QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=4391.01..270397.31 rows=100000 width=20) (actual time=1.257..2.147 rows=838 loops=1)
   Recheck Cond: (v <@ '(0.903,0.203),(0.9,0.2)'::box)
   Buffers: shared hit=968
   ->  Bitmap Index Scan on test_idx  (cost=0.00..4366.01 rows=100000 width=0) (actual time=1.162..1.162 rows=838 loops=1)
         Index Cond: (v <@ '(0.903,0.203),(0.9,0.2)'::box)
         Buffers: shared hit=131
 Total runtime: 2.214 ms
(7 rows)

                                                       QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test2  (cost=4370.84..270377.13 rows=100000 width=20) (actual time=5.252..6.056 rows=838 loops=1)
   Recheck Cond: (v <@ '(0.903,0.203),(0.9,0.2)'::box)
   Buffers: shared hit=1458
   ->  Bitmap Index Scan on test2_idx  (cost=0.00..4345.84 rows=100000 width=0) (actual time=5.155..5.155 rows=838 loops=1)
         Index Cond: (v <@ '(0.903,0.203),(0.9,0.2)'::box)
         Buffers: shared hit=621
 Total runtime: 6.121 ms
(7 rows)

                                                       QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=4391.01..270397.31 rows=100000 width=20) (actual time=2.148..2.977 rows=850 loops=1)
   Recheck Cond: (v <@ '(0.503,0.503),(0.5,0.5)'::box)
   Buffers: shared hit=1099
   ->  Bitmap Index Scan on test_idx  (cost=0.00..4366.01 rows=100000 width=0) (actual time=2.052..2.052 rows=850 loops=1)
         Index Cond: (v <@ '(0.503,0.503),(0.5,0.5)'::box)
         Buffers: shared hit=249
 Total runtime: 3.033 ms
(7 rows)

                                                       QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test2  (cost=4370.84..270377.13 rows=100000 width=20) (actual time=6.806..7.602 rows=850 loops=1)
   Recheck Cond: (v <@ '(0.503,0.503),(0.5,0.5)'::box)
   Buffers: shared hit=1615
   ->  Bitmap Index Scan on test2_idx  (cost=0.00..4345.84 rows=100000 width=0) (actual time=6.709..6.709 rows=850 loops=1)
         Index Cond: (v <@ '(0.503,0.503),(0.5,0.5)'::box)
         Buffers: shared hit=773
 Total runtime: 7.667 ms
(7 rows)

We can see that the index scan on test2 reads several times more pages. The
original paper notes this effect and attributes it to the routing rectangles
being arranged in a less optimal way. But the effect wasn't so dramatic in the
tests provided in the paper. So, I have the following thoughts about this problem:

1) The number of pages read from the index is too large even with the ordinary
index build. Querying a small area requires reading hundreds of pages. This is
probably caused by the picksplit implementation. I have a version of the
picksplit algorithm which seems to be much more efficient. I'll do some
benchmarks with my picksplit algorithm (one way to measure index quality in
bulk is sketched after this list). I hope the difference in index quality will
then not be so dramatic.

2) I can try to make some enhancements to the fast build algorithm which could
improve tree quality. In the original paper the Hilbert heuristic was used to
achieve even better tree quality than a tree created in the ordinary way. But
since we use GiST we are restricted by its interface (or we would have to
create new interface function(s), which I would like to avoid). I would like to
try ordering tuples by penalty value during the buffer emptying process, and
relocating buffers on split.

3) Probably, there is some bug which affects tree quality.
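
As mentioned in point 1, index quality could also be compared in bulk rather
than with a couple of hand-picked boxes. This is only a sketch; it assumes the
same test/test2 tables and the v column from the plans above, and it counts
index blocks touched via pg_statio_user_indexes:

-- reset block counters (requires appropriate privileges)
SELECT pg_stat_reset();

-- probe the ordinary-build index with 1000 random small boxes
SELECT sum(cnt) FROM (
    SELECT (SELECT count(*) FROM test
            WHERE v <@ box(point(x, y), point(x + 0.003, y + 0.003))) AS cnt
    FROM (SELECT random() AS x, random() AS y
          FROM generate_series(1, 1000)) q
) s;

-- repeat the same probes against test2, then compare index blocks touched
SELECT indexrelname, idx_blks_hit + idx_blks_read AS blocks_touched
FROM pg_statio_user_indexes
WHERE relname IN ('test', 'test2');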

> Could you please create a TODO list on the wiki page, listing all the
> missing features, known bugs etc. that will need to be fixed? That'll make
> it easier to see how much work there is left. It'll also help anyone looking
> at the patch to know which issues are known issues.
>
Sure, I'll create such a list on the wiki page. I believe that currently the
most important issue is index quality.

> Meanwhile, it would still be very valuable if others could test this with
> different workloads. And Alexander, it would be good if at some point you
> could write some benchmark scripts too, and put them on the wiki page, just
> to see what kind of workloads have been taken into consideration and tested
> already. Do you think there's some worst-case data distributions where this
> algorithm would perform particularly badly?

I don't expect bad cases in terms of I/O. My main worry is about index
quality, which is strongly related to data distribution.
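
One candidate for a bad distribution could be heavily clustered, pre-sorted
input; a sketch along the lines of Heikki's earlier example (the table and
index names here are just illustrative):

CREATE TABLE pointtest_clustered (p point);
-- monotonically increasing x, y squeezed into a thin band
INSERT INTO pointtest_clustered
    SELECT point(g::float8 / 50000000, random() * 0.0001)
    FROM generate_series(1, 50000000) g;
CREATE INDEX i_pointtest_clustered ON pointtest_clustered USING gist (p);

Sorted input stresses the split and buffer-relocation logic differently than
uniform random data, so it may show the quality problem more clearly.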

------
With best regards,
Alexander Korotkov.
