Re: Proposal to add a QNX 6.5 port to PostgreSQL

From: "Baker, Keith [OCDUS Non-J&J]" <KBaker9(at)its(dot)jnj(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal to add a QNX 6.5 port to PostgreSQL
Date: 2014-07-31 22:50:52
Message-ID: 25171C9D43848A4A9FFF65373179D8025AC0E289@ITSUSRAGMDGD05.jnj.com
Lists: pgsql-hackers

I will be on vacation until August 11; I look forward to any progress you are able to make.

Since ensuring that there are no orphaned back-end processes is vital, could we add a check for getppid() == 1?
The patch below seemed to work on QNX: after a kill -9 of the postmaster, the first client command caused its associated server process to exit.

diff -rdup postgresql-9.3.5/src/backend/tcop/postgres.c postgresql-9.3.5_qnx/src/backend/tcop/postgres.c
--- postgresql-9.3.5/src/backend/tcop/postgres.c	2014-07-21 15:10:42.000000000 -0400
+++ postgresql-9.3.5_qnx/src/backend/tcop/postgres.c	2014-07-31 18:17:40.000000000 -0400
@@ -3967,6 +3967,14 @@ PostgresMain(int argc, char *argv[],
 		 */
 		firstchar = ReadCommand(&input_message);
 
+#ifndef WIN32
+		/* Check for death of parent */
+		if (getppid() == 1)
+			ereport(FATAL,
+					(errcode(ERRCODE_CRASH_SHUTDOWN),
+					 errmsg("Parent server process has exited")));
+#endif
+
 		/*
 		 * (4) disable async signal conditions again.
 		 */
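
For illustration only, here is a minimal standalone sketch of the same idea outside PostgreSQL. It is a variant and not part of the patch above: instead of comparing getppid() against 1, it records the parent PID once at startup and later checks whether it has changed, which may matter on systems where an orphan can be reparented to a process other than PID 1. The names remember_parent() and parent_is_gone() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helpers; these names do not exist in PostgreSQL. */
static pid_t saved_parent_pid;
static bool  parent_recorded = false;

/* Call once, as early as possible after fork(), in the child. */
void
remember_parent(void)
{
    saved_parent_pid = getppid();
    parent_recorded = true;
}

/*
 * Returns true if the process has been reparented since
 * remember_parent() was called, i.e. the original parent has exited.
 */
bool
parent_is_gone(void)
{
    assert(parent_recorded);    /* remember_parent() must run first */
    return getppid() != saved_parent_pid;
}
```

A child would call remember_parent() right after fork() and then parent_is_gone() at the same point where the patch tests getppid() == 1. The sketch still has a window: if the parent dies before remember_parent() runs, the recorded PID already belongs to the new parent and the check never fires.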

Keith Baker

> -----Original Message-----
> From: Robert Haas [mailto:robertmhaas(at)gmail(dot)com]
> Sent: Thursday, July 31, 2014 12:58 PM
> To: Tom Lane
> Cc: Baker, Keith [OCDUS Non-J&J]; pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL
>
> On Wed, Jul 30, 2014 at 11:02 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > So it seems like we could possibly go this route, assuming we can
> > think of a variant of your proposal that's race-condition-free. A
> > disadvantage compared to a true file lock is that it would not protect
> > against people trying to start postmasters from two different NFS
> > client machines --- but we don't have protection against that now.
> > (Maybe we could do this *and* do a regular file lock to offer some
> > protection against that case, even if it's not bulletproof?)
>
> That's not a bad idea. By the way, it also wouldn't be too hard to test at
> runtime whether or not flock() has first-close semantics. Not that we'd want
> this exact design, but suppose you configure shmem_interlock=flock in
> postgresql.conf. On startup, we test whether flock is reliable, determine
> that it is, and proceed accordingly.
> Now, you move your database onto an NFS volume and the semantics
> change (because, hey, breaking userspace assumptions is fun) and try to
> restart your database, and it says FATAL: flock() is broken.
> Now you can either move the database back, or set shmem_interlock to
> some other value.
>
> Now maybe, as you say, it's best to use multiple locking protocols and hope
> that at least one will catch whatever the dangerous situation is.
> I'm just trying to point out that we need not blindly assume the semantics we
> want are there (or that they are not); we can check.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL
> Company
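
The runtime test of flock() mentioned in the quoted message could look roughly like the sketch below. This is only one possible reading of "first-close semantics", taken as an assumption: the lock is considered broken if closing an unrelated descriptor for the same file releases it. The function name flock_survives_close() and its return convention are hypothetical, and nothing here is proposed PostgreSQL code.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical probe. Returns 1 if an flock() lock survives the close
 * of a second descriptor for the same file, 0 if the lock is lost,
 * and -1 if the probe itself could not be run.
 */
int
flock_survives_close(const char *path)
{
    int   fd, fd2, status;
    pid_t pid;

    assert(path != NULL);

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0)
    {
        close(fd);
        return -1;
    }

    /* Open and close a second, unrelated descriptor for the file. */
    fd2 = open(path, O_RDWR);
    if (fd2 < 0)
    {
        close(fd);
        return -1;
    }
    close(fd2);

    pid = fork();
    if (pid < 0)
    {
        close(fd);
        return -1;
    }
    if (pid == 0)
    {
        /* Child: a fresh open() yields an independent descriptor. */
        int cfd = open(path, O_RDWR);

        if (cfd < 0)
            _exit(2);
        if (flock(cfd, LOCK_EX | LOCK_NB) == 0)
            _exit(1);       /* acquired it: the parent's lock was lost */
        _exit(errno == EWOULDBLOCK ? 0 : 2);
    }

    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
    {
        close(fd);
        return -1;
    }
    close(fd);

    switch (WEXITSTATUS(status))
    {
        case 0:
            return 1;       /* child was refused: lock still held */
        case 1:
            return 0;       /* child got the lock: semantics are broken */
        default:
            return -1;
    }
}
```

A startup check along these lines would run the probe on a scratch file in the data directory and refuse to start (the "FATAL: flock() is broken" case) when it returns 0.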
