Re: location of the configuration files

From: Kevin Brown <kevin(at)sysexperts(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: location of the configuration files
Date: 2003-02-13 07:36:45
Message-ID: 20030213073645.GI1833@filer
Lists: pgsql-hackers


Before I get started, I should note that it may be a good compromise
to have the data directory be the same as the config file directory,
when neither the config file nor the command line specifies something
different. So the changes I think may make the most sense are:

1. We add a new GUC variable which specifies where the data is.
The data is assumed to reside in the same place the config files
reside unless the GUC variable is defined (either in
postgresql.conf or on the command line, as usual for a GUC
variable). Both -D and $PGDATA therefore retain their current
semantics unless overridden by the GUC variable, in which case
they fall back to the new semantics of specifying only where the
config files can be found.

2. We add a configure option that specifies what the hardcoded
fallback directory should be when neither -D nor $PGDATA is
specified. If the option isn't given to configure, the fallback
is /etc/postgresql.

3. We supply a different default startup script and a different
default configuration file (but can make the older versions
available in the distribution as well if we wish). The former
uses neither $PGDATA nor -D (or uses /etc/postgresql for them),
and the latter uses the new GUC variable to specify a data
directory location (/var/lib/postgres by default?)
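
To make the lookup order in points 1 and 2 concrete, here is a rough
sketch in Python. It is an illustration only, not proposed code (the
real logic would live in the server's C startup path). The setting name
data_dir, the constant COMPILED_IN_CONFIG_DIR and both helper functions
are made up for the example, since no name for the new GUC variable or
the configure option is given above.

```python
import os

# Hypothetical names: the proposal does not name the new GUC variable or
# the configure option, so "data_dir" and COMPILED_IN_CONFIG_DIR are
# stand-ins chosen for this sketch.
COMPILED_IN_CONFIG_DIR = "/etc/postgresql"


def parse_conf(text):
    """Parse simple `name = value` lines, ignoring blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip("'\"")
    return settings


def resolve_dirs(dash_d=None, env=None, conf_settings=None, cli_settings=None):
    """Return (config_dir, data_dir) under the proposed rules."""
    env = os.environ if env is None else env

    # Point 2: -D wins over $PGDATA; with neither, use the compiled-in
    # fallback directory.
    config_dir = dash_d or env.get("PGDATA") or COMPILED_IN_CONFIG_DIR

    # Point 1: if the GUC variable is set, it decides where the data
    # lives. A command-line setting overrides postgresql.conf, as for any
    # other GUC variable. Otherwise the data sits with the config files.
    settings = dict(conf_settings or {})
    settings.update(cli_settings or {})
    data_dir = settings.get("data_dir") or config_dir

    return config_dir, data_dir
```

With no GUC setting, resolve_dirs(dash_d="/srv/pg", env={}) gives the
same directory for both, which is today's behaviour. With nothing on
the command line or in the environment and data_dir set in the config
file, the config directory is the compiled-in fallback and the data
directory is whatever the config file says.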

This combination should work nicely for transitioning and for package
builders. It accomplishes all of the goals mentioned in this thread
and will cause minimal pain for developers, since they can use their
current methods. Sounds like it'll make Tom happy, at least. :-)

Tom Lane wrote:
> mlw <pgsql(at)mohawksoft(dot)com> writes:
> > The idea that a, more or less, arbitrary data location determines the
> > database configuration is wrong. It should be obvious to any
> > administrator that a configuration file location which controls the
> > server is the "right" way to do it.
>
> I guess I'm just dense, but I entirely fail to see why this is the One
> True Way To Do It.

But we're not saying it's the One True Way, just saying that it's a
way that has very obvious benefits over the way we're using now, if
your job is to manage a system that someone else set up.

> What you seem to be proposing (ignoring syntactic-sugar issues) is
> that we replace "postmaster -D /some/data/dir" by "postmaster
> -config /some/config/file". I am not seeing the nature of the
> improvement.

The nature of the improvement is that the configuration of a
PostgreSQL install will become obvious to anyone who looks in the
obvious places. Remember, the '-D ...' is optional! The PGDATA
environment variable can be used instead, and *is* used in what few
installations I've seen. That's not something that shows up on the
command line when looking at the process list, which forces the
administrator to hunt down the data directory through other means.

> It looks to me like the sysadmin must now grant the Postgres DBA
> write access on *two* directories, viz /some/config/ and
> /wherever/the/data/directory/is. How is that better than granting
> write access on one directory?

The difference in where you grant write access isn't a benefit to be
gained here. The fact that you no longer have to give root privileges
to the DBA so that he can change the data directory as needed is the
benefit (well, one of them, at least). A standard packaged install
can easily set the /etc/postgresql directory up with write permissions
for the postgres user by default, so the sysadmin won't even have to
touch it if he doesn't want to.

A big production database box is usually managed by one or more system
administrators and one or more DBAs. Their roles are largely
orthogonal. The sysadmins have the responsibility of keeping the
boxes up and making sure they don't fall over or crawl to a
standstill. The DBAs have the responsibility of maximizing the
performance and availability of the database and *that's all*. Giving
the DBAs root privileges means giving them the power to screw up the
system in ways that they can't recover from and might not even know
about. The ways you can take down a system by misconfiguring the
database are bad enough. No sane sysadmin is going to give the DBA
the power to run an arbitrary script as root at the point in the boot
cycle when the system is most difficult to manage unless he thinks
the DBA is *really* good at system administration tasks, too. And
that's assuming the sysadmin even *has* the authority to grant the DBA
that kind of access. Many organizations keep a tight rein on who can
do what in an effort to minimize the damage from screwups.

The point is that the DBA isn't likely to have root access to the box.
When the DBA lacks that ability, the way we currently do things places
greater demand on the sysadmin than is necessary, because root access
is required to change the startup scripts, as it should be, and the
location of the data, as it should *not* be.

> Given that we can't manage to standardize the data directory
> location across multiple Unixen, how is it that we will be more
> successful at standardizing a config file location?

A couple of ways.

Firstly, as we mentioned before, just about every other daemon that
runs on a Unix system has its configuration file somewhere in the /etc
hierarchy. By putting our config files in that same hierarchy we'll
be *adhering* to a standard. We don't have to worry about
"standardizing" that config file location because it's *already* a
standard that we're currently ignoring.

Secondly, standards arise as a result of being declared standards and
by most people using them. So simply by making /etc/postgresql the
default configuration directory, *that* will become the standard
place. Most people won't mess with the default install if they don't
have to.

Right now they almost *have to* mess with the default install, because
there is no standard place on a Unix system for high speed, highly
reliable disk access. And that means that, right now, there *is* no
standard place for our config files -- it's wherever the person who
configured the database decided the data should be, and he made that
decision based on performance and reliability considerations, not on
any standards.

> All I see here is an arbitrary break with our past practice. I do not
> see any net improvement.

That's probably because you're looking at this from the point of view
of a developer. From that standpoint there really isn't any net
improvement, because *you* still have to specify something on the
command line to get your test databases going. As a developer you
*always* install and manage your own database installations, so *of
course* you'll always know where the config files are. But that's not
how it works in the production world.

The break we'd be making is *not* arbitrary, and that's much of the
point: it's a break towards existing standards, and there are good
reasons for doing it, benefits to be had by adhering to those
standards.

The way we currently handle configuration files is fine for research
and development use -- the environment from which PostgreSQL sprang.
But now we're talking about getting it used in production
environments, and their requirements are very different.

Since it is *we* who are not currently adhering to the standard,
shouldn't the burden of proof (so to speak) be on those who wish to
keep things as they are?

--
Kevin Brown kevin(at)sysexperts(dot)com
