Skip site navigation (1) Skip section navigation (2)

pgsql is 75 times faster with my new index scan

From: devik(at)cdi(dot)cz
To: pgsql-hackers(at)postgresql(dot)org
Subject: pgsql is 75 times faster with my new index scan
Date: 2000-09-26 09:15:28
Message-ID: 39D06930.F2F88139@cdi.cz (view raw or flat)
Thread:
Lists: pgsql-hackers
Hello,
I recently spoke about extending index scan to be able
to take data directly from index pages. I wanted to know
whether should I spend my time and implement it.
So that I hacked last pgsql a bit to use proposed scan
mode and did some measurements (see bellow). Measurements
was done on (id int,txt varchar(20)) table with 1 000 000 rows
with btree index on both attrs. Query involved was:
select id,count(txt) from big group by id;
Duplicates distribution on id column was 1:1000. I was run
query twice after linux restart to ensure proper cache 
utilization (on disk heap & index was 90MB in total).
So I think that by implementing this scan mode we can expect
to gain huge speedup in all queries which uses indices and
can found all data in their pages.

Problems:
my changes implemented only indexscan and new cost function.
it doesn't work when index pages contains tuples which doesn't
belong to our transaction. test was done after vacuum and
only one tx running.

TODO:
- add HeapTupleHeaderData into each IndexTupleData
- change code to reflect above
- when deleting-updating heap then also update tuples'
  HeapTupleHeaderData in indices

The last step could be done in two ways. First by limiting
number of indices for one table we can store coresponding 
indices' TIDs in each heap tuple. The update is then simple
taking one disk write.
Or do it in standart way - lookup appropriate index tuple
by traversing index. It will cost us more disk accesses.

Is someone interested in this ??
regards devik

With current indexscan:
! system usage stats:
!       1812.534505 elapsed 93.060547 user 149.447266 system sec
!       [93.118164 user 149.474609 sys total]
!       0/0 [0/0] filesystem blocks in/out
!       130978/32 [131603/297] page faults/reclaims, 132 [132] swaps
!       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!       0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!       Shared blocks:     555587 read,     551155 written, buffer hit
rate = 44.68%
!       Local  blocks:          0 read,          0 written, buffer hit
rate = 0.00%
!       Direct blocks:          0 read,          0 written

With improved indexscan:
! system usage stats:
!       23.686788 elapsed 22.157227 user 0.372071 system sec
!       [22.193359 user 0.385742 sys total]
!       0/0 [0/0] filesystem blocks in/out
!       1186/42 [1467/266] page faults/reclaims, 0 [0] swaps
!       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!       0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!       Shared blocks:       4385 read,          0 written, buffer hit
rate = 4.32%
!       Local  blocks:          0 read,          0 written, buffer hit
rate = 0.00%
!       Direct blocks:          0 read,          0 written



Responses

pgsql-hackers by date

Next:From: Hannu KrosingDate: 2000-09-26 09:28:07
Subject: Re: pgsql is 75 times faster with my new index scan
Previous:From: Jarmo PaavilainenDate: 2000-09-26 07:39:59
Subject: Case sensitive field names

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group