Re: postgres_fdw uninterruptible during connection establishment / ProcSignalBarrier

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fujii Masao <fujii(at)postgresql(dot)org>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Subject: Re: postgres_fdw uninterruptible during connection establishment / ProcSignalBarrier
Date: 2023-01-03 20:05:20
Message-ID: 20230103200520.di5hjebqvi72coql@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-12-30 14:14:55 +1300, Thomas Munro wrote:
> Oh, I was imagining something slightly different. Not something under
> src/include/libpq, but conceptually a separate header-only library
> that is above both the backend and libpq. Maybe something like
> src/include/febe_util/libpq_connect_interruptible.h. In other words,
> I thought your idea b was a header-only version of your idea a. I
> think that might be a bit nicer than putting it under libpq?
> Superficial difference, perhaps...

It doesn't seem entirely right to introduce a top-level "module" for
libpq-in-extensions to me - we don't do that for other APIs used for
extensions. But a header only library also doesn't quite seem right. So ...

Looking at turning my patch upthread into something slightly less prototype-y,
I noticed that libpqwalreceiver doesn't do AcquireExternalFD(), added to other
backend uses of libpq in 3d475515a15. It's unlikely to matter for
walreceiver.c itelf, but it seems problematic for the logical replication
cases?

It's annoying to introduce PG_TRY/CATCH blocks, just to deal with the
potential for errors inside WaitLatchOrSocket(), which should never happen. I
wonder if we should consider making latch.c error out fatally, instead of
elog(ERROR). If latches are broken, things are bad.

The PG_CATCH() logic in postgres_fdw's GetConnection() looks quite suspicious
to me. It looks like 32a9c0bdf493 took entirely the wrong path. Instead of
optionally not throwing or directly re-establishing connections in
begin_remote_xact(), the PG_CATCH() hammer was applied.

The attached patch adds libpq-be-fe-helpers.h and uses it in libpqwalreceiver,
dblink, postgres_fdw.

As I made libpq-be-fe-helpers.h handle reserving external fds,
libpqwalreceiver now does so. I briefly looked through its users without
seeing cases of leaking in case of errors - which would already have been bad,
since we'd already have leaked a libpq connection/socket.

Given the lack of field complaints and the size of the required changes, I
don't think we should backpatch this, even though it's pretty clearly buggy
as-is.

Some time back Thomas had a patch to introduce a wrapper around
libpq-in-extensions that fixed issues cause by some events being
edge-triggered on windows. It's possible that combining these two efforts
would yield something better. I resisted the urge to create a wrapper around
each connection in this patch, as it'd have ended up being a whole lot more
invasive. But... Thomas, do you have a reference to that code handy?

Greetings,

Andres Freund

Attachment Content-Type Size
v2-0001-wip-don-t-block-inside-PQconnectdb-et-al.patch text/x-diff 16.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2023-01-03 20:11:47 Re: fixing CREATEROLE
Previous Message Gilles Darold 2023-01-03 19:46:47 Re: [PATCH] pg_dump: lock tables in batches