Re: Bug with pg_ctl -w/wait and config-only directories

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: "Mr(dot) Aaron W(dot) Swenson" <titanofold(at)gentoo(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Bug with pg_ctl -w/wait and config-only directories
Date: 2011-10-03 15:27:09
Message-ID: 201110031527.p93FR9f01296@momjian.us
Lists: pgsql-hackers

Fujii Masao wrote:
> On Sun, Oct 2, 2011 at 7:54 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > What exactly is your question?  You are not using a config-only
> > directory but the real data directory, so it should work fine.
>
> No. He is using PGDATA (= /etc/postgresql-9.0) as a config-only
> directory, and DATA_DIR (= /var/lib/postgresql/9.0/data) as a
> real data directory.

Wow, I see what you mean now! So the user already figured out it was
broken and used the workaround I recently discovered? Was this ever
reported to the community? If so, I never saw it.

So, in testing, I see it is even more broken than I thought. Not only
is pg_ctl -w broken for start/stop for config-only installs, but pg_ctl
stop (no -w) is also broken because it can't find the postmaster.pid
file, which it needs both to check for a running server and to get the
pid to signal. pg_ctl reload and restart are similarly broken. :-(
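
To make the failure concrete, here is a minimal shell sketch of the lookup
that stop/reload/restart depend on. It is not pg_ctl's actual code; the
function name find_postmaster_pid is made up for illustration. The only
facts it relies on are that postmaster.pid lives in the real data
directory and that its first line is the postmaster's pid.

```shell
# Hypothetical helper, for illustration only (not part of pg_ctl).
# Prints the postmaster pid recorded in DIR/postmaster.pid, or fails
# if the file is missing, as it is when DIR is a config-only directory.
find_postmaster_pid() {
    dir=$1
    pidfile="$dir/postmaster.pid"
    if [ ! -f "$pidfile" ]; then
        echo "PID file \"$pidfile\" does not exist" >&2
        return 1
    fi
    # The first line of postmaster.pid is the postmaster's pid.
    head -n 1 "$pidfile"
}

# Example (paths taken from the Gentoo setup quoted above):
#   find_postmaster_pid /var/lib/postgresql/9.0/data   # works
#   find_postmaster_pid /etc/postgresql-9.0            # fails: no pid file
```

Pointed at the config-only directory, the lookup fails before any signal
can be sent, which is the breakage described above.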

And it gets worse. The example supplied by the Gentoo developer shows a
use case where the data directory is not even specified in the
configuration file but rather on the command line:

    su -l postgres \
        -c "env PGPORT=\"${PGPORT}\" ${PG_EXTRA_ENV} \
            /usr/lib/postgresql-9.0/bin/pg_ctl \
            start ${WAIT_FOR_START} -t ${START_TIMEOUT} -s -D ${DATA_DIR} \
            -o '-D ${PGDATA} --data-directory=${DATA_DIR} \
            --silent-mode=true ${PGOPTS}'"

In this case, dumping the postgresql.conf file settings is not going to
help --- there is nothing in the config directory that is going to point
us to the data directory --- it exists only in the process arguments.

Frankly, I am confused how this breakage has gone unreported for so long.

Our current TODO item is:

    Allow pg_ctl to work properly with configuration files located outside
    the PGDATA directory

        pg_ctl can not read the pid file because it isn't located in the
        config directory but in the PGDATA directory. The solution is to allow
        pg_ctl to read and understand postgresql.conf to find the data_directory
        value.

        BUG #5103: "pg_ctl -w (re)start" fails with custom
        unix_socket_directory

While this is accurate, it certainly is missing much of the breakage.
Finding a non-standard socket directory is the least of our problems
with config-only directories (even standard settings don't work), and
reading the config file is not enough of a solution because of the
possible passing of parameters on the command line.

To add even more complexity, imagine someone using the same config
directory for several data/cluster directories, and just passing a
unique --data-directory for each one on start --- in that case, the
config directory alone does not identify which data directory is meant.
It seems we would need some way to pass the data directory to pg_ctl,
perhaps via -o, but parsing that is something we have tried to avoid
(there may be no other choice), and it would have to be supplied for
both start and stop.

The only conclusion I can come up with is that we need to be able to
dump postgresql.conf's data_directory, but also to read it from the
command line.
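
As a rough sketch of that lookup order (command line first, then
postgresql.conf, then the directory itself), something like the following
shell function illustrates the idea. It is an assumption-laden
illustration, not a proposed patch: resolve_data_dir is a made-up name,
the option string is split naively on whitespace, only the
--data-directory=VALUE spelling is recognized, and the config parsing
handles only a single-quoted value with no include files.

```shell
# Hypothetical sketch: guess the real data directory for a server whose
# config directory is $1 and whose postmaster options string is $2.
resolve_data_dir() {
    config_dir=$1
    server_opts=$2

    # 1. A --data-directory=... option on the command line wins.
    #    (Naive whitespace splitting; paths with spaces are not handled.)
    for opt in $server_opts; do
        case $opt in
            --data-directory=*)
                printf '%s\n' "${opt#--data-directory=}"
                return 0
                ;;
        esac
    done

    # 2. Otherwise use an uncommented data_directory line in postgresql.conf.
    conf="$config_dir/postgresql.conf"
    if [ -f "$conf" ]; then
        value=$(sed -n "s/^[[:space:]]*data_directory[[:space:]]*=\{0,1\}[[:space:]]*'\([^']*\)'.*/\1/p" "$conf" | tail -n 1)
        if [ -n "$value" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    fi

    # 3. Otherwise the config directory is itself the data directory.
    printf '%s\n' "$config_dir"
}
```

Note that step 1 only helps if the same option string is supplied on stop
as on start, which is exactly the awkwardness described above.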

I am starting to question the value of config-only directories if pg_ctl
stop doesn't work, or you have to specify a different directory for
start and stop. Writing a second postmaster.pid file into the config
directory would help, but it would break with shared-config setups and I
don't think we can assume we have write permission on the config
directory.

What are config-only directories buying us that we can't get from
telling users to use symlinks and point to the data directory directly?
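
For comparison, one reading of the symlink alternative is sketched below:
keep the files in the config directory, symlink them into the data
directory, and always point -D at the data directory. The function name
link_config_into_data_dir is invented for this example, and the list of
file names is an assumption about which files a site keeps outside the
data directory.

```shell
# Hypothetical sketch: symlink config files kept in CONFIG_DIR into
# DATA_DIR so that the data directory is the only directory ever
# passed to pg_ctl.
link_config_into_data_dir() {
    config_dir=$1
    data_dir=$2
    for f in postgresql.conf pg_hba.conf pg_ident.conf; do
        if [ -f "$config_dir/$f" ]; then
            # -s: symbolic link, -f: replace an existing file or link
            ln -sf "$config_dir/$f" "$data_dir/$f"
        fi
    done
}

# Example, using the paths from the Gentoo setup:
#   link_config_into_data_dir /etc/postgresql-9.0 /var/lib/postgresql/9.0/data
#   pg_ctl -D /var/lib/postgresql/9.0/data stop
```

With this layout postmaster.pid and postgresql.conf are both reachable
from the one directory given with -D.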

Did we not think of these things when we designed config-only
directories? I don't even see this problem mentioned in our
documentation.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
