Re: More thoughts about planner's cost estimates

From: Hannu Krosing <hannu(at)skype(dot)net>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: josh(at)agliodbs(dot)com, David Fetter <david(at)fetter(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: More thoughts about planner's cost estimates
Date: 2006-06-03 19:18:14
Message-ID: 1149362295.3889.29.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Ühel kenal päeval, R, 2006-06-02 kell 16:23, kirjutas Greg Stark:

> And a 5% sample is a pretty big. In fact my tests earlier showed the i/o from
> 5% block sampling took just as long as reading all the blocks. Even if we
> figure out what's causing that (IMHO surprising) result and improve matters I
> would only expect it to be 3-4x faster than a full scan.

You should not be surprised by this once you visualise what happens at
the disk level with all those platters spinning and heads moving :)

Disks can read at full rotation speed, so skipping (not reading) some
blocks will not make reading the remaining blocks from the same track
faster. And if there are more than 20 8k pages per track, you still have
a very high probablility you need to read all tracks..

You may be able to move to the next track a little earlier compared to
reading all blocks, but then you are likely to miss the block from next
track and have to wait a full rotation.

You will get some win from skipping pages only once your % falls so low
that you can also skip a significant number of tracks.

> http://archives.postgresql.org/pgsql-hackers/2006-01/msg00285.php

Your test program could have got a little better results, if you had
somehow managed to tell the system all the block numbers to read in one
go, not each time the next one after hetting the previous one. In
current version it is quite likely that it had to wait several disk
rotations for even the sectors from the same track, as for small steps
it may have missed the next sector. It does not apply for disks which
always read a full track in RAM cache, but even there all tracks are
actually read.

The fact that 5% was not slower than seqscan seems to indicate that
actually all track reads were cached inside the disk or controller.

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Stark 2006-06-03 21:38:35 Re: More thoughts about planner's cost estimates
Previous Message Mike Benoit 2006-06-03 19:15:31 Re: More thoughts about planner's cost estimates