Re: Protecting against multiple instances per cluster

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thom Brown <thom(at)linux(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Protecting against multiple instances per cluster
Date: 2011-09-08 19:02:32
Message-ID: 26483.1315508552@sss.pgh.pa.us
Lists: pgsql-hackers

Thom Brown <thom(at)linux(dot)com> writes:
> I've come across a PostgreSQL setup where there are 2 servers, each
> with the same version of PostgreSQL installed, both mounting the same
> SAN onto their respective file systems. It was intended that only 1 of
> the servers would be running an instance of PostgreSQL at a time as
> they both point to the same pgdata. This was dubbed a "high
> availability" setup, where if one server went down, they could start
> PostgreSQL on the other. (yes, I know what you're thinking)

Multiply by about ten and you might have an idea what I'm thinking.

> Now normally there is protection against 2 instances running only if
> the instances are on the same server, because the check references
> shared memory. But in this case, neither server has access to the
> other's shared memory, so it has to rely on the pid file. But the pid
> file isn't enough by itself.

The pid file is not merely "not enough", it's entirely useless, since
the two machines aren't sharing a process ID space. If somebody starts
a postmaster on machine 2, it will almost certainly see the pid in the
pidfile as not running (on machine 2). So you have no safety interlock
whatsoever in this configuration.
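
To make the failure concrete, here is a minimal sketch in Python of the kind of liveness test a pidfile interlock depends on. It is an illustration only, not PostgreSQL's actual code: the function names are hypothetical, it assumes a POSIX system, and it assumes the pid is stored on the first line of the pidfile.

```python
import os


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this pid exists on the LOCAL machine."""
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing; it only checks existence
    except ProcessLookupError:
        return False     # no such pid in this machine's process table
    except PermissionError:
        return True      # the process exists but belongs to another user
    return True


def pidfile_looks_stale(pidfile_path: str) -> bool:
    """Hypothetical helper: treat the pidfile as stale if its pid is not alive here."""
    with open(pidfile_path) as f:
        pid = int(f.readline().strip())  # assumption: first line holds the pid
    return not pid_is_running(pid)
```

Run on machine 2 against a pidfile written by machine 1, `pidfile_looks_stale` consults only machine 2's process table, so it will normally report the file as stale even while the postmaster on machine 1 is alive and writing to the same data directory.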

It is possible to build configurations of this type safely, but you need
some external dead-man-switch or STONITH arrangement that forcibly kills
machine 1 (or at least disconnects it from the SAN) before machine 2 is
allowed to start. Postgres can't do it on its own, and as you've
undoubtedly already found out, human DBAs can't be trusted to get it
right either.
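
The ordering that matters can be sketched as follows, again in Python and purely as an illustration. Both callables are hypothetical stand-ins: `fence_peer` represents whatever external mechanism powers off the other machine or cuts it off from the SAN and reports whether that succeeded, and `start_postmaster` represents whatever starts the local server. Neither corresponds to a real tool or API.

```python
from typing import Callable


def failover_start(
    fence_peer: Callable[[], bool],
    start_postmaster: Callable[[], None],
) -> bool:
    """Start the local server only after the peer is confirmed fenced.

    Returns True if the server was started, False if fencing could not
    be confirmed (in which case nothing is started).
    """
    if not fence_peer():
        # Fencing unconfirmed: the peer may still be writing to the shared
        # data directory, so refuse to start rather than guess.
        return False
    start_postmaster()
    return True
```

The point of the sketch is only that the start step is unreachable unless fencing has been positively confirmed; a timeout or an unknown result from the fencing step has to count as failure.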

regards, tom lane
