From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Feature Request --- was: PostgreSQL Performance Tuning
Date: 2007-05-02 02:59:51
Message-ID: Pine.GSO.4.64.0705012154380.24215@westnet.com
Lists: pgsql-general pgsql-performance

On Tue, 1 May 2007, Josh Berkus wrote:

> there is no standard way even within Linux to describe CPUs, for
> example. Collecting available disk space information is even worse. So
> I'd like some help on this portion.

I'm not fooled--secretly you and your co-workers laugh at how easy this is
on Solaris and are perfectly happy with how difficult it is on Linux,
right?

I joke because I've been re-solving some variant of this problem every few
years for a decade now and it just won't go away. Last time I checked, the
right answer was to find someone else who's already done it, packaged it
into a library, and appears committed to keeping it up to date; then just
pull a new rev of that when you need it. For example, for the CPU/memory
part, top solves this problem and is always kept current, so on
open-source platforms there's the potential to re-use that code. Now that
I know that's one thing you're (understandably) fighting with, I'll dig up
my references on that (again).
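
To make that concrete, here's a rough sketch of what the Linux side of
the collection could look like; Python and the /proc paths are just my
assumptions for illustration, nothing that's been decided:

#!/usr/bin/env python
# Rough sketch of the Linux CPU/memory collection step, reading the same
# /proc files top does.  Linux-only; every other platform needs its own
# probe, which is exactly the portability headache being discussed.

def cpu_count():
    # /proc/cpuinfo emits one "processor" stanza per logical CPU.
    with open('/proc/cpuinfo') as f:
        return sum(1 for line in f if line.startswith('processor'))

def mem_total_kb():
    # MemTotal in /proc/meminfo is reported in kB.
    with open('/proc/meminfo') as f:
        for line in f:
            if line.startswith('MemTotal:'):
                return int(line.split()[1])
    raise RuntimeError('no MemTotal in /proc/meminfo')

if __name__ == '__main__':
    print('CPUs: %d, RAM: %d MB' % (cpu_count(), mem_total_kb() // 1024))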

> It's also hard/impossible to devise tuning algorithms that work for both
> gross tuning (increase shared_buffers by 100x) and fine tuning (decrease
> bgwriter_interval to 45ms).

I would advocate focusing on iterative improvements to an existing
configuration rather than even bothering with generating a one-off
config, for exactly this reason. It *is* hard/impossible to get it right
in a single shot, because of how many parameters interact and the way
bottlenecks shift as they clear, so why not assume from the start that
you're going to do it several times--then you've only got one piece of
software to write.

The idea I have in my head is a tool that gathers system info, connects
to the database, and then spits out recommendations in order of expected
effectiveness--with the specific caveat that changing too many things at
one time isn't recommended, and with some notion of parameter
dependencies built in. The first time you run it, you'd be told that
shared_buffers was wildly low, effective_cache_size isn't even in the
right ballpark, and your work_mem looks small relative to the size of
your tables; fix those before you bother doing anything else, because any
data collected while those are at very wrong values is bogus. On take
two, those parameters pass their sanity tests, but since you're actually
running at a reasonable speed now, the fact that your tables aren't being
vacuumed frequently enough might bubble to the top.
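
A skeleton of that rule-driven pass might look like the following. The
thresholds are made-up placeholders to show the shape of the thing, not
real tuning advice, and psycopg2 plus the pg_settings query are just one
way to get at the current values:

#!/usr/bin/env python
# Skeleton of the iterative advisor: pull current settings out of
# pg_settings, flag the ones that fail a sanity check, and only report
# the top few so each pass changes a small number of things.
# All thresholds here are illustrative placeholders, not tuning advice.
import psycopg2

RULES = [
    # (priority, guc, looks_wrong(value, ram_kb), advice)
    # shared_buffers and effective_cache_size are counted in 8 kB pages;
    # work_mem is in kB.
    (1, 'shared_buffers', lambda v, ram: v * 8 < ram // 100,
     'wildly low relative to RAM; fix before trusting other measurements'),
    (2, 'effective_cache_size', lambda v, ram: v * 8 < ram // 4,
     'not in the right ballpark; set it near the size of the OS cache'),
    (3, 'work_mem', lambda v, ram: v < 1024,
     'looks small; big sorts and hashes will spill to disk'),
]

def recommend(dsn, ram_kb, max_changes=3):
    conn = psycopg2.connect(dsn)
    cur = conn.cursor()
    findings = []
    for priority, guc, looks_wrong, advice in sorted(RULES,
                                                     key=lambda r: r[0]):
        cur.execute("SELECT setting FROM pg_settings WHERE name = %s",
                    (guc,))
        value = int(cur.fetchone()[0])
        if looks_wrong(value, ram_kb):
            findings.append('%s: %s' % (guc, advice))
    conn.close()
    # Changing too many things at once isn't recommended, so only the
    # highest-priority problems are surfaced; re-run after fixing them.
    return findings[:max_changes]

Run it, apply what it says, run it again; once the memory parameters pass
their checks, whatever rule is next in line (vacuum frequency, say) gets
its turn at the top of the list.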

It would take a few passes to nail everything down, but as long as the
tool is put together so that a single run leaves you in a position
similar to what the single-shot tool would give, it removes the
single-shot tool as something separate that needs to be built.

To argue against myself for a second, it may very well be the case that
writing the simpler tool is the only way to get a useful prototype for
building the more complicated one; it's very easy to get bogged down in
feature creep on a grand design otherwise.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
