Re: Detecting libpq connections improperly shared via fork()

From: Daniel Farina <daniel(at)heroku(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Detecting libpq connections improperly shared via fork()
Date: 2012-10-09 10:05:34
Message-ID: CAAZKuFbL-RA1xNOhR=u808+zPV-j_cEHd-GpCC=WjWR=bFw9yg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 9, 2012 at 2:51 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On Thursday, October 04, 2012 03:23:54 AM Tom Lane wrote:
>> Daniel Farina <daniel(at)heroku(dot)com> writes:
>> > On Wed, Oct 3, 2012 at 3:14 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> >> Hm. An easier version of this could just be storing the pid of the
>> >> process that did the PQconnectdb* in the PGconn struct. You can then
>> >> check that PGconn->pid == getpid() at relatively few places and error
>> >> out on a mismatch. That should be doable with only minor overhead.
>> >
>> > I suppose this might needlessly eliminate someone who forks and hands
>> > off the PGconn struct to exactly one child, but it's hard to argue
>> > with its simplicity and portability of mechanism.
>>
>> Yeah, the same thing had occurred to me, but I'm not sure that getpid()
>> is really cheap enough to claim that the overhead is negligible.
> I guess it's going to be OS/libc dependent. On glibc systems getpid() should
> basically just be a function call (no syscall).
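
For reference, the pid check proposed above could look roughly like the
sketch below. DemoConn and both function names are hypothetical stand-ins;
the real PGconn has no such field, and this says nothing about where libpq
would actually place the checks.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for PGconn; the real struct has no owner_pid. */
typedef struct DemoConn
{
    pid_t owner_pid;    /* pid of the process that opened the connection */
    int   sock;         /* socket descriptor, unused in this sketch */
} DemoConn;

/* Record the creating process; would be called from the connect path. */
static void
demo_conn_init(DemoConn *conn, int sock)
{
    conn->owner_pid = getpid();
    conn->sock = sock;
}

/* True if the caller is the process that opened the connection. */
static bool
demo_conn_owned_by_caller(const DemoConn *conn)
{
    return conn->owner_pid == getpid();
}
```

A caller would test demo_conn_owned_by_caller() at the few entry points that
start protocol traffic and report an error on a mismatch. Note that this
also rejects the fork-and-hand-off-to-one-child case mentioned above.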

To protect users who fork and then simply forget about the connection
in either the parent or the child, the original sketch I had in mind
(which did not use getpid) was to increment and check a monotonic
counter whenever protocol traffic is initiated (effectively a "tell"
on the socket). If that shared counter has advanced in a way this
process did not expect, then some other process has mucked with the
protocol state, and it's time to deliver a lucid error.

That requires both shared state (such as an anonymous mmap) and
non-shared state (process-local, as most things are after a fork, or
thread-local in the case of threads). Both halves have thorny
portability problems AFAIK, so I was somewhat hesitant to bring it up.

However, I would like to reiterate that this is a very common
problem, so it may be worth pausing to think about solving it.
Whenever someone writes in asking "what's up with these strange SSL
errors", the first question in response is generally "are you using
unicorn?" (for Ruby; 'gunicorn' for Python). The answer is almost
invariably yes. The remainder have renegotiation issues.

--
fdr
