Re: Feature Request --- was: PostgreSQL Performance Tuning

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>
Subject: Re: Feature Request --- was: PostgreSQL Performance Tuning
Date: 2007-05-01 16:23:47
Message-ID: 200705010923.47951.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

Greg,

> 1) Collect up data about their system (memory, disk layout), find out a
> bit about their apps/workload, and generate a config file based on that.

We could start with this. Where I bogged down is that collecting system
information about several different operating systems ... and in some cases
generating scripts for boosting things like shmmax ... is actually quite a
large problem from a slog perspective; there is no standard way even within
Linux to describe CPUs, for example. Collecting available disk space
information is even worse. So I'd like some help on this portion.

I actually have algorithms which are "good enough to start with" for most of
the important GUCs worked out, and others could be set through an interactive
script ("Does your application require large batch loads involving thousands
or millions of updates in the same transaction?" "How large (GB) do you
expect your database to be?")

> 2) Connect to the database and look around. Study the tables and some
> their stats, make some estimates based on what your find, produce a new
> config file.

I'm not sure that much more for (2) can be done than for (1). Tables-on-disk
don't tell us much.

> 3) Monitor the database while it's doing its thing. See which parts go
> well and which go badly by viewing database statistics like pg_statio.
> From that, figure out where the bottlenecks are likely to be and push more
> resources toward them. What I've been working on lately is exposing more
> readouts of performance-related database internals to make this more
> practical.

We really should collaborate on that.

> When first exposed to this problem, most people assume that (1) is good
> enough--ask some questions, look at the machine, and magically a
> reasonable starting configuration can be produced. It's already been
> pointed out that anyone with enough knowledge to do all that can probably
> spit out a reasonable guess for the config file without help.

But that's actually more than most people already do. Further, if you don't
start with a "reasonable" configuration, then it's difficult-impossible to
analyze where your settings are out-of-whack; behavior introduced by some
way-to-low settings will mask any other tuning that needs to be done. It's
also hard/impossible to devise tuning algorithms that work for both gross
tuning (increase shared_buffers by 100x) and fine tuning (decrease
bgwriter_interval to 45ms).

So whether or not we do (3), we need to do (1) first.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alban Hertroys 2007-05-01 16:27:47 Re: Processing a work queue
Previous Message Tom Lane 2007-05-01 16:02:55 Re: dump-restore only one table

Browse pgsql-performance by date

  From Date Subject
Next Message Andreas Haumer 2007-05-01 17:05:14 Re: sytem log audit/reporting and psql
Previous Message Gregory Stark 2007-04-30 19:19:16 Re: Query performance problems with partitioned tables