Double sorting split patch

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Double sorting split patch
Date: 2011-09-11 19:30:11
Message-ID: CAPpHfdtkUfgdMwTsoNUiaKUM3HBQre1iZFiHjR2umqgTf5D_dQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hackers,

I've got my patch with double sorting picksplit impementation for GiST into
more acceptable form. A little of testing is below. Index creation time is
slightly higher, but search is much faster. The testing datasets were
following:
1) uniform dataset - 10M rows
2) geonames points - 7.6M rows

test=# create index uniform_new_linear_idx on uniform using gist (point);
CREATE INDEX
Time: 397362,915 ms

test=# explain (analyze, buffers) select * from uniform where point <@
box(point(0.5,0.5),point(0.501,0.501));
QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on uniform (cost=433.27..25873.19 rows=10000 width=16)
(actual time=1.407..1.448
rows=8 loops=1)
Recheck Cond: (point <@ '(0.501,0.501),(0.5,0.5)'::box)
Buffers: shared hit=39
-> Bitmap Index Scan on uniform_new_linear_idx (cost=0.00..430.77
rows=10000 width=0) (actual time=1.388..1.388 rows=8 loops=1)
Index Cond: (point <@ '(0.501,0.501),(0.5,0.5)'::box)
Buffers: shared hit=31
Total runtime: 1.527 ms
(7 rows)

test=# explain (analyze, buffers) select * from uniform where point <@
box(point(0.3,0.4),point(0.301,0.401));
QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on uniform (cost=433.27..25873.19 rows=10000 width=16)
(actual time=0.715..0.795
rows=15 loops=1)
Recheck Cond: (point <@ '(0.301,0.401),(0.3,0.4)'::box)
Buffers: shared hit=30
-> Bitmap Index Scan on uniform_new_linear_idx (cost=0.00..430.77
rows=10000 width=0) (actual time=0.695..0.695 rows=15 loops=1)
Index Cond: (point <@ '(0.301,0.401),(0.3,0.4)'::box)
Buffers: shared hit=15
Total runtime: 0.892 ms
(7 rows)

test=# create index uniform_double_sorting_idx on uniform using gist
(point);
CREATE INDEX
Time: 492796,671 ms

test=# explain (analyze, buffers) select * from uniform where point <@
box(point(0.5,0.5),point(0.501,0.501));
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on uniform (cost=445.39..25885.31 rows=10000 width=16)
(actual time=0.376..0.417
rows=8 loops=1)
Recheck Cond: (point <@ '(0.501,0.501),(0.5,0.5)'::box)
Buffers: shared hit=15
-> Bitmap Index Scan on uniform_double_sorting_idx (cost=0.00..442.89
rows=10000 width=0) (actual time=0.357..0.357 rows=8 loops=1)
Index Cond: (point <@ '(0.501,0.501),(0.5,0.5)'::box)
Buffers: shared hit=7
Total runtime: 0.490 ms
(7 rows)

test=# explain (analyze, buffers) select * from uniform where point <@
box(point(0.3,0.4),point(0.301,0.401));
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on uniform (cost=445.39..25885.31 rows=10000 width=16)
(actual time=0.189..0.270
rows=15 loops=1)
Recheck Cond: (point <@ '(0.301,0.401),(0.3,0.4)'::box)
Buffers: shared hit=19
-> Bitmap Index Scan on uniform_double_sorting_idx (cost=0.00..442.89
rows=10000 width=0) (actual time=0.168..0.168 rows=15 loops=1)
Index Cond: (point <@ '(0.301,0.401),(0.3,0.4)'::box)
Buffers: shared hit=4
Total runtime: 0.358 ms
(7 rows)

test=# create index geonames_new_linear_idx on geonames using gist (point);
CREATE INDEX
Time: 279922,518 ms

test=# explain (analyze, buffers) select * from geonames where point <@
box(point(34.4671,126.631),point(34.5023,126.667));
QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on geonames (cost=341.98..19686.88 rows=7604 width=16)
(actual time=0.905..0.948
rows=11 loops=1)
Recheck Cond: (point <@ '(34.5023,126.667),(34.4671,126.631)'::box)
Buffers: shared hit=25
-> Bitmap Index Scan on geonames_new_linear_idx (cost=0.00..340.07
rows=7604 width=0) (actual time=0.889..0.889 rows=11 loops=1)
Index Cond: (point <@ '(34.5023,126.667),(34.4671,126.631)'::box)
Buffers: shared hit=20
Total runtime: 1.029 ms
(7 rows)

test=# explain (analyze, buffers) select * from geonames where point <@
box(point(46.1384,-104.72), point(46.2088,-104.65));
QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on geonames (cost=341.98..19686.88 rows=7604 width=16)
(actual time=0.644..0.776
rows=10 loops=1)
Recheck Cond: (point <@ '(46.2088,-104.65),(46.1384,-104.72)'::box)
Buffers: shared hit=13 read=6
-> Bitmap Index Scan on geonames_new_linear_idx (cost=0.00..340.07
rows=7604 width=0) (actual time=0.595..0.595 rows=10 loops=1)
Index Cond: (point <@ '(46.2088,-104.65),(46.1384,-104.72)'::box)
Buffers: shared hit=13
Total runtime: 0.857 ms
(7 rows)

test=# create index geonames_double_sorting_idx on geonames using gist
(point);
CREATE INDEX
Time: 294580,774 ms

test=# explain (analyze, buffers) select * from geonames where point <@
box(point(34.4671,126.631),point(34.5023,126.667));
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on geonames (cost=346.01..19690.92 rows=7604 width=16)
(actual time=0.240..0.282
rows=11 loops=1)
Recheck Cond: (point <@ '(34.5023,126.667),(34.4671,126.631)'::box)
Buffers: shared hit=11
-> Bitmap Index Scan on geonames_double_sorting_idx (cost=0.00..344.11
rows=7604 width=0) (actual time=0.209..0.209 rows=11 loops=1)
Index Cond: (point <@ '(34.5023,126.667),(34.4671,126.631)'::box)
Buffers: shared hit=6
Total runtime: 0.372 ms
(7 rows)

test=# explain (analyze, buffers) select * from geonames where point <@
box(point(46.1384,-104.72), point(46.2088,-104.65));
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on geonames (cost=346.01..19690.92 rows=7604 width=16)
(actual time=0.311..0.352
rows=10 loops=1)
Recheck Cond: (point <@ '(46.2088,-104.65),(46.1384,-104.72)'::box)
Buffers: shared hit=13
-> Bitmap Index Scan on geonames_double_sorting_idx (cost=0.00..344.11
rows=7604 width=0) (actual time=0.293..0.293 rows=10 loops=1)
Index Cond: (point <@ '(46.2088,-104.65),(46.1384,-104.72)'::box)
Buffers: shared hit=7
Total runtime: 0.429 ms
(7 rows)

------
With best regards,
Alexander Korotkov.

Attachment Content-Type Size
double-sorting-split-0.1.patch.gz application/x-gzip 6.8 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Rob Wultsch 2011-09-11 19:33:28 Re: pg_dump.c
Previous Message Andrew Dunstan 2011-09-11 19:18:13 Re: pg_dump.c