Skip site navigation (1) Skip section navigation (2)

Double sorting split patch

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Double sorting split patch
Date: 2011-09-11 19:30:11
Message-ID: CAPpHfdtkUfgdMwTsoNUiaKUM3HBQre1iZFiHjR2umqgTf5D_dQ@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-hackers
Hackers,

I've got my patch with double sorting picksplit impementation for GiST into
more acceptable form. A little of testing is below. Index creation time is
slightly higher, but search is much faster. The testing datasets were
following:
1) uniform dataset - 10M rows
2) geonames points - 7.6M rows

test=# create index uniform_new_linear_idx on uniform using gist (point);
CREATE INDEX
Time: 397362,915 ms

test=# explain (analyze, buffers) select * from uniform where point <@
box(point(0.5,0.5),point(0.501,0.501));
                                                             QUERY PLAN


-------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on uniform  (cost=433.27..25873.19 rows=10000 width=16)
(actual time=1.407..1.448
rows=8 loops=1)
   Recheck Cond: (point <@ '(0.501,0.501),(0.5,0.5)'::box)
   Buffers: shared hit=39
   ->  Bitmap Index Scan on uniform_new_linear_idx  (cost=0.00..430.77
rows=10000 width=0) (actual time=1.388..1.388 rows=8 loops=1)
         Index Cond: (point <@ '(0.501,0.501),(0.5,0.5)'::box)
         Buffers: shared hit=31
 Total runtime: 1.527 ms
(7 rows)

test=# explain (analyze, buffers) select * from uniform where point <@
box(point(0.3,0.4),point(0.301,0.401));
                                                              QUERY PLAN


--------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on uniform  (cost=433.27..25873.19 rows=10000 width=16)
(actual time=0.715..0.795
rows=15 loops=1)
   Recheck Cond: (point <@ '(0.301,0.401),(0.3,0.4)'::box)
   Buffers: shared hit=30
   ->  Bitmap Index Scan on uniform_new_linear_idx  (cost=0.00..430.77
rows=10000 width=0) (actual time=0.695..0.695 rows=15 loops=1)
         Index Cond: (point <@ '(0.301,0.401),(0.3,0.4)'::box)
         Buffers: shared hit=15
 Total runtime: 0.892 ms
(7 rows)

test=# create index uniform_double_sorting_idx on uniform using gist
(point);
CREATE INDEX
Time: 492796,671 ms

test=# explain (analyze, buffers) select * from uniform where point <@
box(point(0.5,0.5),point(0.501,0.501));
                                                               QUERY PLAN


-----------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on uniform  (cost=445.39..25885.31 rows=10000 width=16)
(actual time=0.376..0.417
rows=8 loops=1)
   Recheck Cond: (point <@ '(0.501,0.501),(0.5,0.5)'::box)
   Buffers: shared hit=15
   ->  Bitmap Index Scan on uniform_double_sorting_idx  (cost=0.00..442.89
rows=10000 width=0) (actual time=0.357..0.357 rows=8 loops=1)
         Index Cond: (point <@ '(0.501,0.501),(0.5,0.5)'::box)
         Buffers: shared hit=7
 Total runtime: 0.490 ms
(7 rows)

test=# explain (analyze, buffers) select * from uniform where point <@
box(point(0.3,0.4),point(0.301,0.401));
                                                                QUERY PLAN


------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on uniform  (cost=445.39..25885.31 rows=10000 width=16)
(actual time=0.189..0.270
rows=15 loops=1)
   Recheck Cond: (point <@ '(0.301,0.401),(0.3,0.4)'::box)
   Buffers: shared hit=19
   ->  Bitmap Index Scan on uniform_double_sorting_idx  (cost=0.00..442.89
rows=10000 width=0) (actual time=0.168..0.168 rows=15 loops=1)
         Index Cond: (point <@ '(0.301,0.401),(0.3,0.4)'::box)
         Buffers: shared hit=4
 Total runtime: 0.358 ms
(7 rows)

test=# create index geonames_new_linear_idx on geonames using gist (point);
CREATE INDEX
Time: 279922,518 ms


test=# explain (analyze, buffers) select * from geonames where point <@
box(point(34.4671,126.631),point(34.5023,126.667));
                                                              QUERY PLAN


--------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on geonames  (cost=341.98..19686.88 rows=7604 width=16)
(actual time=0.905..0.948
rows=11 loops=1)
   Recheck Cond: (point <@ '(34.5023,126.667),(34.4671,126.631)'::box)
   Buffers: shared hit=25
   ->  Bitmap Index Scan on geonames_new_linear_idx  (cost=0.00..340.07
rows=7604 width=0) (actual time=0.889..0.889 rows=11 loops=1)
         Index Cond: (point <@ '(34.5023,126.667),(34.4671,126.631)'::box)
         Buffers: shared hit=20
 Total runtime: 1.029 ms
(7 rows)

test=# explain (analyze, buffers) select * from geonames where point <@
box(point(46.1384,-104.72), point(46.2088,-104.65));
                                                              QUERY PLAN


--------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on geonames  (cost=341.98..19686.88 rows=7604 width=16)
(actual time=0.644..0.776
rows=10 loops=1)
   Recheck Cond: (point <@ '(46.2088,-104.65),(46.1384,-104.72)'::box)
   Buffers: shared hit=13 read=6
   ->  Bitmap Index Scan on geonames_new_linear_idx  (cost=0.00..340.07
rows=7604 width=0) (actual time=0.595..0.595 rows=10 loops=1)
         Index Cond: (point <@ '(46.2088,-104.65),(46.1384,-104.72)'::box)
         Buffers: shared hit=13
 Total runtime: 0.857 ms
(7 rows)

test=# create index geonames_double_sorting_idx on geonames using gist
(point);
CREATE INDEX
Time: 294580,774 ms

test=# explain (analyze, buffers) select * from geonames where point <@
box(point(34.4671,126.631),point(34.5023,126.667));
                                                                QUERY PLAN


------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on geonames  (cost=346.01..19690.92 rows=7604 width=16)
(actual time=0.240..0.282
rows=11 loops=1)
   Recheck Cond: (point <@ '(34.5023,126.667),(34.4671,126.631)'::box)
   Buffers: shared hit=11
   ->  Bitmap Index Scan on geonames_double_sorting_idx  (cost=0.00..344.11
rows=7604 width=0) (actual time=0.209..0.209 rows=11 loops=1)
         Index Cond: (point <@ '(34.5023,126.667),(34.4671,126.631)'::box)
         Buffers: shared hit=6
 Total runtime: 0.372 ms
(7 rows)

test=# explain (analyze, buffers) select * from geonames where point <@
box(point(46.1384,-104.72), point(46.2088,-104.65));
                                                                QUERY PLAN


------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on geonames  (cost=346.01..19690.92 rows=7604 width=16)
(actual time=0.311..0.352
rows=10 loops=1)
   Recheck Cond: (point <@ '(46.2088,-104.65),(46.1384,-104.72)'::box)
   Buffers: shared hit=13
   ->  Bitmap Index Scan on geonames_double_sorting_idx  (cost=0.00..344.11
rows=7604 width=0) (actual time=0.293..0.293 rows=10 loops=1)
         Index Cond: (point <@ '(46.2088,-104.65),(46.1384,-104.72)'::box)
         Buffers: shared hit=7
 Total runtime: 0.429 ms
(7 rows)

------
With best regards,
Alexander Korotkov.

Attachment: double-sorting-split-0.1.patch.gz
Description: application/x-gzip (6.8 KB)

Responses

pgsql-hackers by date

Next:From: Rob WultschDate: 2011-09-11 19:33:28
Subject: Re: pg_dump.c
Previous:From: Andrew DunstanDate: 2011-09-11 19:18:13
Subject: Re: pg_dump.c

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group