Skip site navigation (1) Skip section navigation (2)

Re: More thoughts about planner's cost estimates

From: Hannu Krosing <hannu(at)skype(dot)net>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: josh(at)agliodbs(dot)com, David Fetter <david(at)fetter(dot)org>,Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: More thoughts about planner's cost estimates
Date: 2006-06-03 19:18:14
Message-ID: 1149362295.3889.29.camel@localhost.localdomain (view raw, whole thread or download thread mbox)
Lists: pgsql-hackers
Ühel kenal päeval, R, 2006-06-02 kell 16:23, kirjutas Greg Stark:

> And a 5% sample is a pretty big. In fact my tests earlier showed the i/o from
> 5% block sampling took just as long as reading all the blocks. Even if we
> figure out what's causing that (IMHO surprising) result and improve matters I
> would only expect it to be 3-4x faster than a full scan.

You should not be surprised by this once you visualise what happens at
the disk level with all those platters spinning and heads moving :) 

Disks can read at full rotation speed, so skipping (not reading) some
blocks will not make reading the remaining blocks from the same track
faster. And if there are more than 20 8k pages per track, you still have
a very high probablility you need to read all tracks..

You may be able to move to the next track a little earlier compared to
reading all blocks, but then you are likely to miss the block from next
track and have to wait a full rotation.

You will get some win from skipping pages only once your % falls so low
that you can also skip a significant number of tracks.


Your test program could have got a little better results, if you had
somehow managed to tell the system all the block numbers to read in one
go, not each time the next one after hetting the previous one. In
current version it is quite likely that it had to wait several disk
rotations for even the sectors from the same track, as for small steps
it may have missed the next sector. It does not apply for disks which
always read a full track in RAM cache, but even there all tracks are
actually read.

The fact that 5% was not slower than seqscan seems to indicate that
actually all track reads were cached inside the disk or controller.

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:

In response to


pgsql-hackers by date

Next:From: Greg StarkDate: 2006-06-03 21:38:35
Subject: Re: More thoughts about planner's cost estimates
Previous:From: Mike BenoitDate: 2006-06-03 19:15:31
Subject: Re: More thoughts about planner's cost estimates

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group