Re: Blocks read for index scans

From: Terje Elde <terje(at)elde(dot)net>
To: Jim Nasby <jnasby(at)pervasive(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Blocks read for index scans
Date: 2006-04-14 06:05:39
Message-ID: 443F3BB3.30404@elde.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Jim Nasby wrote:
> While working on determining a good stripe size for a database, I
> realized it would be handy to know what the average request size is.
> Getting this info is a simple matter of joining pg_stat_all_tables and
> pg_statio_all_tables and doing some math, but there's one issue I've
> found; it appears that there's no information on how many heap blocks
> were read in by an index scan. Is there any way to get that info?

RAID usually doesn't work the way most people think. ;)

Not sure how well you know RAID, so I'm just mentioning some points just
in case, and for the archives.

If your average request is for 16K, and you choose a 16K stripe size,
then that means half your request (assuming normal bell curve) would be
larger than a single stripe, and you've just succeeded in having half
your requests have to have two spindles seek instead of one. If that's
done sequentially, you're set for less than half the performance of a
flat disk.

Knowing what the average stripe size is can be a good place to start,
but the real question is; which stripe size will allow the majority of
your transactions to be possible to satisfy without having to go to two
spindles?

I've actually had good success with 2MB stripe sizes using software
raid. If the reads are fairly well distributed, all the drives are hit
equally, and very few small requests have to go to two spindles.

Read speeds from modern drives are fast. It's usually the seeks that
kill performance, so making sure you reduce the number of seeks should
almost always be the priority.

That said, it's the transactions against disk that typically matter. On
FreeBSD, you can get an impression of this using 'systat -vmstat', and
watch the KB/t column for your drives.

A seek will take some time, the head has to settle down, find the right
place to start reading etc, so a seek will always take time. A seek
over a longer distance takes more time though, so even if your
transactions are pretty small, using a large stripe size can be a good
thing if your have lots of small transactions that are close by. The
head will be in the area, reducing seek time.

This all depends on what types of load you have, and it's hard to
generalize too much on what makes things fast. As always, it pretty
much boils down to trying things while running as close to production
load as you can.

Terje

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jim C. Nasby 2006-04-14 06:14:14 Re: Blocks read for index scans
Previous Message Jim C. Nasby 2006-04-14 05:59:23 Re: Inserts optimization?