Re: Overhauling GUCS

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overhauling GUCS
Date: 2008-06-05 03:43:25
Message-ID: Pine.GSO.4.64.0806042241070.23557@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 4 Jun 2008, Aidan Van Dyk wrote:

> I'd love a tool that helped me "analyze" my current running PG database
> (and yes, that includes getting *current* settings), and "suggest"
> config changes

Fine. To demonstrate why the overhaul is needed, let's start designing a
simple tool whose sole purpose in life is to suggest new settings for
shared_buffers and work_mem. Say we want to show people their current
setting and what we'd recommend instead.

I just created a new cluster on my laptop. It has the following in the
postgresql.conf:

shared_buffers = 32MB # min 128kB or max_connections*16kB
# (change requires restart)
#work_mem = 1MB # min 64kB

Say I first start designing such a tool by trying to read the
postgresql.conf file to figure out what these values are set to. In order
to accomplish that, I need to parse the whole file correctly, doing things
like turning "32MB" into the actual numeric value so my program can make
decisions based on its value[1]. This basically requires someone writing
a tuning tool replicate the GUC parsing code, which is crazy; at this
point you've already lost most potential tool authors.

But you're smarter than that; instead you use pg_settings:

psql=# select name,setting,unit from pg_settings where name='shared_buffers' or name='work_mem';
name | setting | unit
----------------+---------+------
shared_buffers | 4096 | 8kB
work_mem | 1024 | kB

Now: what do you tell the user their current value is? The way the
postgresql.conf is parsed into memory is lossy; at this point you don't
know anymore what units where specified in the original file. If someone
sees their current setting shown as "4096" but they know they set it to
"32MB", they'll end up confused unless they understand the whole page size
concept--and if they understood that, they'd automatically be disqualified
from being the type of user our theoretical tool is targeted toward. If
you try and make things more user-friendly by always showing the most
human readable version, what about the person who ended up setting this
parameter because they copied an old config file that recommended setting
it to 4096. You show it to them as "32MB"; they'll also be confused and
blame the tool for being bad.

And even if you work through all that, to give good advice here you have
to know things like that shared_buffers is a server parameter requiring
restart, while work_mem is a per-session parameter. Right now, the only
way to know all that is for tool authors to assemble their own database
and keep it up to date with each release. And you just lost another set
of potential authors with that realization.

Next up, we manage to work through all those issues, and someone happily
follows our advice and gets their file updated with a much larger value
for work_mem. Yeah, we are heroes! Or not. You see, in the config file
we just helpful updated for them was this comment just above that setting:

# OMG don't set this too high or the Linux OOM killer will
# take down the server!

(This is not a theoretical example; I have found variations on that text
in two postgresql.conf files and I vaguely recall Josh mentioned running
into it as well).

And now you just *crashed their server* by giving bad advice that was
clearly against the invaluable comment history in the file already. Oh,
but where are those comments located at? Before the setting? After the
setting? In a group at the top? Who can say? Since there's no strong
standard, people put them all over the place.

I don't know about you, but I'm too busy to spend a bunch of time writing
a tool to work around all these issues knowing it is always going to be
fragile, incomplete, and need updating with every GUC change no matter
what. Instead, reconsider http://wiki.postgresql.org/wiki/GUCS_Overhaul ,
having gotten a taste of the motivation behind those changes, and consider
how this would play out after those improvements.

The approach where you connect to the database and query is the easy path.
There is never a reason to consider parsing postgresql.conf. Anybody who
knows how to write a simple script that connects to a database and reads a
table (basically anyone who's written the database client equivilent of
"hello, world") can feel like a potential tool author.

All the information about the postgresql.conf side of every setting is
retained in case you want to show where they came from, or to generate a
new file that's as similar as possible to the original.

Any recommendations you suggest can be trivially annotated with whether
you need to consider max_connections because it's per-session, and whether
people need to restart the server or can just send it a signal, and that
will continue to be the case in the future with minimal work on the tool
author's part.

The defaults are now available, so that it's easy to figure out what
people changed. That is sometimes handy to include as part of this sort
of analysis, and it's necessary to provide improvements like a "strip the
unnecessary junk out of this file" that many people would like from this
sort of tool.

When you show people that you recommend increasing a value to something
larger, any comments about that setting will be shown and they'll know not
to follow the tool's advice if there's a history there.

This seems like such a better place to be that I'd rather drive toward the
server-side changes necessary to support it rather than fight the
difficult tool creation problems. That's why the "focus on a new API for
'writing my config' for me"; that particular goal is just one part of a
set of revisions that streamline the tool creation process in a not
necessarily obvious way. Unless, of course, you've tried to write a
full-circle config tuning tool, in which case most of the proposed changes
in this overhaul jump right out at you.

[1] In the shared_buffers case, it may be possible to just recommend a
value without caring one bit what the current one is. But for work_mem,
you really need to actually understand the value if you want any real
intelligence that combines that information with the maximum connections,
so that you can compute how much memory is left over for things like
effective_cache_size.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2008-06-05 03:59:27 Re: Proposal: new function array_init
Previous Message Aidan Van Dyk 2008-06-05 02:37:20 Re: Overhauling GUCS