Re: libpq bad async behaviour

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Daurnimator <quae(at)daurnimator(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: libpq bad async behaviour
Date: 2015-01-14 13:40:28
Message-ID: 20150114134028.GN5245@awork2.anarazel.de
Lists: pgsql-hackers

On 2015-01-14 08:32:19 -0500, Robert Haas wrote:
> On Fri, Jan 9, 2015 at 2:57 PM, Daurnimator <quae(at)daurnimator(dot)com> wrote:
> > I'm worried about libpq blocking in some circumstances; particularly
> > around SSL renegotiations.
> > This came up while writing an async Postgres library for Lua, when I
> > realised that this code was dangerous:
> > https://github.com/daurnimator/cqueues-pgsql/blob/ee9c3fc85c94669b8128386d99d730fe93d9dbad/cqueues-pgsql.lua#L121
> >
> >
> > e.g. 1:
> > When a PQ connection is in non-blocking mode, PQflush returns 1, the docs say:
> >> wait for the socket to be write-ready and call it again
> > However, if the SSL layer is waiting on data for a renegotiation,
> > write readiness is not enough:
> > Waiting for POLLOUT and calling PQflush again will (untested) just
> > return 1 again, and continue to do so until data is received.
> > This is a busy-loop, and will block the host application.
> >
> > e.g. 2:
> > An SSL renegotiation happens while trying to receive a response.
> > According to 'andres' on IRC, inside of `PQisBusy` there is a busy loop:
> >> 14:22:32 andres You'll not see that. Even though the explanation for it is absolutely horrid.
> >> 14:23:32 andres There's a busy retry loop because of exactly that reason inside libpq's ssl read function whenever it hits a WANT_WRITE.
> >> 14:23:58 daurnimator so... libpq will block my process? :(
> >> 14:24:25 andres daurnimator: That case is unlikely to be hit often luckily because of the OS buffering. But yea, it's really unsatisfying.
> >> 14:26:06 andres daurnimator: I think this'll need a new API to be properly fixed.
> >
> >
> > One idea that came to mind, if we want to keep the same API, is to hide
> > the socket behind an epoll file descriptor:
> > an epoll fd always polls read-ready when an fd in its set becomes ready.
> > I think this is also possible for kqueue on bsd, ports on solaris and
> > IOCP on windows.

I think that kind of solution isn't likely to be satisfying. The amount
of porting work is just not going to be worth the cost. And it won't be
easily hideable in the API at all as the callers will expect a normal
fd.
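
For reference, the epoll idea quoted above can be sketched roughly as
follows. This is a Linux-only illustration, not anything libpq does; the
helper names wrap_in_epoll() and outer_is_readable() are made up for the
example. It shows that an inner fd that is merely write-ready makes the
outer epoll fd poll as read-ready, which is also why the result no longer
behaves like a normal socket fd.

```c
#include <poll.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an epoll instance that watches `fd` for
 * `events` (EPOLLIN, EPOLLOUT, ...). Returns the epoll fd, or -1 on error.
 */
static int wrap_in_epoll(int fd, unsigned int events)
{
    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    struct epoll_event ev = { .events = events, .data = { .fd = fd } };
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(epfd);
        return -1;
    }
    return epfd;
}

/*
 * Hypothetical helper: poll the epoll fd itself for readability.
 * Returns 1 if it is read-ready within timeout_ms, 0 if not, -1 on error.
 */
static int outer_is_readable(int epfd, int timeout_ms)
{
    struct pollfd p = { .fd = epfd, .events = POLLIN, .revents = 0 };
    int rc = poll(&p, 1, timeout_ms);

    if (rc < 0)
        return -1;
    return (rc > 0 && (p.revents & POLLIN)) ? 1 : 0;
}
```

A caller that only ever waits for read-readiness on the outer fd would be
woken for either direction of the inner socket, but it could not use the
outer fd for anything else that expects a real socket.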

> Yeah, this is a problem. Fixing it isn't easy, though, I think.

I think
extern PostgresPollingStatusType PQconnectPoll(PGconn *conn);
has the right interface. It returns what upper layers need to wait
for. I think we should extend pretty much that to more interfaces. This
likely means that we'll need extended versions of PQflush() and
PQconsumeInput() - afaics it shouldn't be much more?
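
To make the shape of that concrete, here is a rough sketch of how a
caller might drive such an interface. Nothing in it exists in libpq:
IoPollStatus, io_step_fn, events_for(), drive() and fake_flush() are all
hypothetical names, and the enum only mirrors the idea of
PostgresPollingStatusType so that the example compiles without
libpq-fe.h.

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical status type, modelled on PGRES_POLLING_FAILED/READING/WRITING/OK. */
typedef enum {
    IO_FAILED,
    IO_WANT_READ,
    IO_WANT_WRITE,
    IO_DONE
} IoPollStatus;

/* Hypothetical "extended PQflush/PQconsumeInput": one step, reports what to wait for. */
typedef IoPollStatus (*io_step_fn)(void *conn);

/* Translate the requested wait condition into poll() events. */
static short events_for(IoPollStatus st)
{
    switch (st) {
    case IO_WANT_READ:  return POLLIN;
    case IO_WANT_WRITE: return POLLOUT;
    default:            return 0;
    }
}

/*
 * Run `step` until it reports completion or failure, sleeping in poll()
 * only for the direction the step asked for. Returns 0 on success, -1 on
 * failure. After EINTR the step is simply retried.
 */
static int drive(int sock, io_step_fn step, void *conn)
{
    for (;;) {
        IoPollStatus st = step(conn);

        if (st == IO_DONE)
            return 0;
        if (st == IO_FAILED)
            return -1;

        struct pollfd p = { .fd = sock, .events = events_for(st), .revents = 0 };
        if (poll(&p, 1, -1) < 0 && errno != EINTR)
            return -1;
    }
}

/* Stand-in step for illustration: wants write-readiness *conn times, then finishes. */
static IoPollStatus fake_flush(void *conn)
{
    int *remaining = conn;

    if (*remaining == 0)
        return IO_DONE;
    --*remaining;
    return IO_WANT_WRITE;
}
```

An async library would not block in poll() as drive() does; it would hand
events_for(st) to its own event loop and call the step again on wakeup.
The point is only that the step, not the caller, decides whether to wait
for read or write.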

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
