Hot standby fails if any backend crashes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Hot standby fails if any backend crashes
Date: 2012-02-02 23:27:44
Message-ID: 6502.1328225264@sss.pgh.pa.us
Lists: pgsql-hackers

I'm currently working with Duncan Rance's test case for bug #6425, and
I am observing a very nasty behavior in HEAD: once one of the
hot-standby query backends crashes, the standby postmaster sends SIGQUIT
to all its children and then just quits itself, with no log message and
apparently no effort to restart. Surely this is not intended? The
log shows

TRAP: FailedAssertion("!(((lpp)->lp_flags == 1))", File: "heapam.c", Line: 735)
2012-02-02 18:02:39.985 EST 29363 LOG: server process (PID 15238) was terminated by signal 6: Aborted
2012-02-02 18:02:39.985 EST 29363 DETAIL: Failed process was running: SELECT * FROM repro_02_ref;
2012-02-02 18:02:39.985 EST 29363 LOG: terminating any other active server processes
2012-02-02 18:02:39.985 EST 15214 WARNING: terminating connection because of crash of another server process
2012-02-02 18:02:39.985 EST 15214 DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2012-02-02 18:02:39.985 EST 15214 HINT: In a moment you should be able to reconnect to the database and repeat your command.
2012-02-02 18:02:39.985 EST 15213 WARNING: terminating connection because of crash of another server process
2012-02-02 18:02:39.985 EST 15213 DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2012-02-02 18:02:39.985 EST 15213 HINT: In a moment you should be able to reconnect to the database and repeat your command.
[ repeat the above for what I assume are all the child processes ]

... and then nothing. The standby postmaster is no longer running and
there are no log messages from it after the "terminating any other
active server processes" one. No core dump from it, either.
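
The observation above (nothing further from the postmaster after that message) can be checked mechanically on a saved log. The following is only a minimal sketch: it assumes the prefix layout seen in the excerpt (date, time, timezone, PID, severity), and the function name and regular expression are illustrative, not part of PostgreSQL. On a normal crash restart one would expect to see further postmaster lines here, such as "all server processes terminated; reinitializing".

```python
import re

# Assumed prefix layout, taken from the excerpt: date time timezone PID severity.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) (\S+) (\d+) (\w+): (.*)$"
)
MARKER = "terminating any other active server processes"


def postmaster_messages_after_crash(log_lines):
    """Return (postmaster_pid, messages that PID logged after MARKER).

    The process that logs MARKER is taken to be the postmaster.
    postmaster_pid is None if MARKER never appears.
    """
    postmaster_pid = None
    later = []
    for line in log_lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # e.g. the TRAP line, which carries no prefix
        pid, message = int(m.group(3)), m.group(5)
        if postmaster_pid is None:
            if MARKER in message:
                postmaster_pid = pid
        elif pid == postmaster_pid:
            later.append(message)
    return postmaster_pid, later


# Lines copied from the log excerpt (DETAIL/HINT lines omitted for brevity).
sample = [
    'TRAP: FailedAssertion("!(((lpp)->lp_flags == 1))", File: "heapam.c", Line: 735)',
    "2012-02-02 18:02:39.985 EST 29363 LOG: server process (PID 15238) was terminated by signal 6: Aborted",
    "2012-02-02 18:02:39.985 EST 29363 LOG: terminating any other active server processes",
    "2012-02-02 18:02:39.985 EST 15214 WARNING: terminating connection because of crash of another server process",
    "2012-02-02 18:02:39.985 EST 15213 WARNING: terminating connection because of crash of another server process",
]

pid, later = postmaster_messages_after_crash(sample)
```

For the excerpt, `later` comes back empty, which matches the silence described: the child processes report their termination, but the postmaster itself logs nothing more.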

regards, tom lane
