> On Thu, 28 Aug 2008, Scott Marlowe wrote:
>> scenario 1: There's a postmaster, it owns all the child processes.
>> It gets killed. The Postmaster gets restarted. Since there isn't one
> when the postmaster gets killed doesn't that kill all its children as
Of course not. The postmaster gets a SIGKILL, which is instant death.
There's no way to signal the children. If they were killed too then
this wouldn't be much of a problem.
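
To illustrate why SIGKILL leaves no opportunity to notify the children, here is a minimal sketch, assuming a Unix-like system and Python's standard `signal` module. The helper name `can_install_handler` is made up for this example and has nothing to do with the postmaster's own code. It only checks whether the operating system allows a handler to be installed for a given signal.

```python
import signal


def can_install_handler(signum):
    """Return True if the OS lets this process install a handler for signum.

    Hypothetical helper for illustration only; not part of PostgreSQL.
    """
    def handler(received_signum, frame):
        # A real server would tell its child processes to shut down here.
        pass

    try:
        previous = signal.signal(signum, handler)
    except OSError:
        # The kernel rejects handlers for signals that cannot be caught.
        return False

    # Put the earlier disposition back so the check has no side effects.
    if previous is not None:
        signal.signal(signum, previous)
    return True


sigterm_catchable = can_install_handler(signal.SIGTERM)
sigkill_catchable = can_install_handler(signal.SIGKILL)
print("SIGTERM catchable:", sigterm_catchable)
print("SIGKILL catchable:", sigkill_catchable)
```

A process that receives SIGTERM can run cleanup code first; one that receives SIGKILL never gets to run anything.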
>> running, it comes up. starts new child processes. Meanwhile, the old
>> child processes that don't belong to it are busy writing to the data
>> store. Instant corruption.
> if so then the postmaster should not only check if there is an existing
> postmaster running, it should check for the presence of the child
> processes as well.
See my other followup. There are only a limited number of things it can
check, and against sysadmin stupidity there's no silver bullet.
> well, if you aren't going through the postmaster, what process is
> receiving network messages? it can't be a group of processes, only one
> can be listening to a socket at one time.
Huh? Each backend has its own socket.
> and if the postmaster isn't needed for the child processes to write to
> the datastore, how are multiple child processes prevented from writing to
> the datastore normally? and why doesn't that mechanism continue to work?
They use locks. Those locks are implemented using shared memory. If a
new postmaster starts, it gets a new shared memory, and a new set of
locks, that do not conflict with the ones already held by the first gang
of backends. This is what causes the corruption.
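
The effect can be shown with a toy model. This is only a sketch: the `LockSegment` class, the segment names, and the "page 42" resource are invented for illustration, and in-process `threading.Lock` objects stand in for locks kept in a shared memory segment. It is not how PostgreSQL implements its lock tables.

```python
import threading


class LockSegment:
    """Toy stand-in for one shared memory segment holding a lock table."""

    def __init__(self, name):
        self.name = name
        self._locks = {}
        self._guard = threading.Lock()

    def try_lock(self, resource):
        """Try to take this segment's lock on resource without blocking."""
        with self._guard:
            lock = self._locks.setdefault(resource, threading.Lock())
        return lock.acquire(blocking=False)


# Segment created by the first postmaster, still used by its orphaned backends.
old_segment = LockSegment("old postmaster")
# Fresh segment created when a second postmaster starts on the same data.
new_segment = LockSegment("new postmaster")

orphan_got_lock = old_segment.try_lock("page 42")
second_orphan_got_lock = old_segment.try_lock("page 42")
new_backend_got_lock = new_segment.try_lock("page 42")

print("orphaned backend locks page 42:", orphan_got_lock)
print("second orphaned backend locks page 42:", second_orphan_got_lock)
print("new backend locks page 42:", new_backend_got_lock)
```

Within one segment the second attempt is refused, which is the normal protection. Across segments both attempts succeed, so two backends believe they each hold the page exclusively.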
> so are you saying that the only possible thing that can kill the
> postmaster is the OOM killer? it can't possibly exit in any other
> situation without the children being shutdown first?
> I would be surprised if that was really true.
If the sysadmin sends a SIGKILL then obviously the same thing happens.
Any other signal gives it the chance to signal the children before
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.