RE: pgsql is 75 times faster with my new index scan

From: "Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM>
To: "'devik(at)cdi(dot)cz'" <devik(at)cdi(dot)cz>, pgsql-hackers(at)postgresql(dot)org
Subject: RE: pgsql is 75 times faster with my new index scan
Date: 2000-09-26 19:09:29
Message-ID: 8F4C99C66D04D4118F580090272A7A23018D0E@SECTORBASE1
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Why not implement *true* CLUSTER?
With cluster, all heap tuples will be in cluster index.

Vadim

> -----Original Message-----
> From: devik(at)cdi(dot)cz [mailto:devik(at)cdi(dot)cz]
> Sent: Tuesday, September 26, 2000 2:15 AM
> To: pgsql-hackers(at)postgresql(dot)org
> Subject: [HACKERS] pgsql is 75 times faster with my new index scan
>
>
> Hello,
> I recently spoke about extending index scan to be able
> to take data directly from index pages. I wanted to know
> whether should I spend my time and implement it.
> So that I hacked last pgsql a bit to use proposed scan
> mode and did some measurements (see bellow). Measurements
> was done on (id int,txt varchar(20)) table with 1 000 000 rows
> with btree index on both attrs. Query involved was:
> select id,count(txt) from big group by id;
> Duplicates distribution on id column was 1:1000. I was run
> query twice after linux restart to ensure proper cache
> utilization (on disk heap & index was 90MB in total).
> So I think that by implementing this scan mode we can expect
> to gain huge speedup in all queries which uses indices and
> can found all data in their pages.
>
> Problems:
> my changes implemented only indexscan and new cost function.
> it doesn't work when index pages contains tuples which doesn't
> belong to our transaction. test was done after vacuum and
> only one tx running.
>
> TODO:
> - add HeapTupleHeaderData into each IndexTupleData
> - change code to reflect above
> - when deleting-updating heap then also update tuples'
> HeapTupleHeaderData in indices
>
> The last step could be done in two ways. First by limiting
> number of indices for one table we can store coresponding
> indices' TIDs in each heap tuple. The update is then simple
> taking one disk write.
> Or do it in standart way - lookup appropriate index tuple
> by traversing index. It will cost us more disk accesses.
>
> Is someone interested in this ??
> regards devik
>
> With current indexscan:
> ! system usage stats:
> ! 1812.534505 elapsed 93.060547 user 149.447266 system sec
> ! [93.118164 user 149.474609 sys total]
> ! 0/0 [0/0] filesystem blocks in/out
> ! 130978/32 [131603/297] page faults/reclaims, 132 [132] swaps
> ! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
> ! 0/0 [0/0] voluntary/involuntary context switches
> ! postgres usage stats:
> ! Shared blocks: 555587 read, 551155 written, buffer hit
> rate = 44.68%
> ! Local blocks: 0 read, 0 written, buffer hit
> rate = 0.00%
> ! Direct blocks: 0 read, 0 written
>
> With improved indexscan:
> ! system usage stats:
> ! 23.686788 elapsed 22.157227 user 0.372071 system sec
> ! [22.193359 user 0.385742 sys total]
> ! 0/0 [0/0] filesystem blocks in/out
> ! 1186/42 [1467/266] page faults/reclaims, 0 [0] swaps
> ! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
> ! 0/0 [0/0] voluntary/involuntary context switches
> ! postgres usage stats:
> ! Shared blocks: 4385 read, 0 written, buffer hit
> rate = 4.32%
> ! Local blocks: 0 read, 0 written, buffer hit
> rate = 0.00%
> ! Direct blocks: 0 read, 0 written
>
>

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mikheev, Vadim 2000-09-26 19:14:54 RE: pgsql is 75 times faster with my new index scan
Previous Message Mikheev, Vadim 2000-09-26 19:07:41 RE: pgsql is 75 times faster with my new index scan