Re: Initial prefetch performance testing

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Initial prefetch performance testing
Date: 2008-09-22 15:46:23
Message-ID: 877i94kwts.fsf@oxford.xeocode.com
Lists: pgsql-hackers


Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:

> On Mon, 2008-09-22 at 04:57 -0400, Greg Smith wrote:
>
>> -As Greg Stark suggested, the larger the spindle count the larger the
>> speedup, and the larger the prefetch size that might make sense. His
>> suggestion to model the user GUC as "effective_spindle_count" looks like a
>> good one. The sequential scan fadvise implementation patch submitted uses
>> the earlier preread_pages name for that parameter, which I agree seems
>> less friendly.
>
> Good news about the testing.
>
> I'd prefer to set this as a tablespace level storage parameter.

Sounds like a good idea, except... what's a tablespace-level storage parameter?

> prefetch_... is a much better name since it's an existing industry term.
> I'm not in favour of introducing the concept of spindles, since I can
> almost hear the questions about ramdisks and memory-based storage. Plus
> I don't ever want to discover that the best setting for
> effective_spindles is 7 (or 5) when I have 6 disks because of some
> technology shift or postgres behaviour change in the future.

In principle I quite strongly disagree with this.

Someone might very well want to set spindle_count to 6 when he actually has 7,
but at least he has an intuitive feel for what he's doing -- he's setting it to
slightly less than what Postgres thinks is optimal.
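
To make that intuition concrete, here's a rough sketch (mine, not from the
submitted patch) of how a user-visible spindle count could be translated into
an internal prefetch distance. One plausible model is that keeping N drives
busy takes roughly N * H(N) requests in flight, where H(N) is the Nth harmonic
number -- 1 drive -> 1 page, 2 -> 3, 4 -> ~8.3:

#include <stdio.h>
#include <math.h>

/*
 * Hypothetical mapping from the user-visible spindle count to an
 * internal prefetch target: sum N/i for i = 1..N, i.e. N times the
 * Nth harmonic number.  Illustrative only.
 */
int
prefetch_target(int effective_spindle_count)
{
    double  target = 0.0;
    int     i;

    for (i = 1; i <= effective_spindle_count; i++)
        target += (double) effective_spindle_count / i;

    return (int) rint(target);
}

int
main(void)
{
    int     n;

    for (n = 1; n <= 8; n++)
        printf("%d spindles -> prefetch %d pages\n", n, prefetch_target(n));
    return 0;
}

The point is that a translation like this lives inside Postgres and can change
when the implementation changes, while the user-visible number keeps its
physical meaning.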

The number of blocks to prefetch is an internal implementation detail, and the
DBA has absolutely no way to know what the correct value is. That's how we get
the cargo-cult configuration tweaks we've seen in the past, where people follow
recommendations with no idea what the consequences are or whether they apply.
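
For reference, the fadvise-based prefetching being tested comes down to hints
like the following (a minimal sketch; the function and parameter names are
mine, not the patch's):

#define _XOPEN_SOURCE 600       /* expose posix_fadvise */
#include <fcntl.h>

/*
 * Ask the kernel to start reading nblocks blocks at blocknum in the
 * background, so that a later read() finds them already in cache.
 */
int
prefetch_blocks(int fd, long blocknum, int nblocks, int block_size)
{
    return posix_fadvise(fd,
                         (off_t) blocknum * block_size,
                         (off_t) nblocks * block_size,
                         POSIX_FADV_WILLNEED);
}

The question in this thread is how many such hints to keep outstanding at
once; that's exactly the number a DBA can't be expected to guess, whereas "how
many spindles do I have" is something he can actually answer.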

In an ideal world we would have a half-dozen parameters to tell Postgres how
much memory is available, how many disks are available, etc., and Postgres
would know how best to use those resources. I think if we expose internal
knobs like the one you propose, we'll end up with hundreds of parameters, and
to adjust them you'll have to be an expert in Postgres internals.

That said, there is a place for these internal knobs when we don't really know
how best to make use of the resources. At this point we only have results from
a few systems, and the results don't seem to jibe with the theory.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!
