Re: postgres_fdw uninterruptible during connection establishment / ProcSignalBarrier

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: postgres_fdw uninterruptible during connection establishment / ProcSignalBarrier
Date: 2022-12-09 02:08:15
Message-ID: 20221209020815.wsoixct72c6hst5l@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I just re-discovered this issue, via
https://www.postgresql.org/message-id/20221209003607.bz2zdznvfnkq4zz3%40awork3.anarazel.de

On 2022-09-25 16:22:37 -0700, Andres Freund wrote:
> Maybe I am missing something, but I don't think it's OK for
> connect_pg_server() to connect in a blocking manner, without accepting
> interrupts?

It's definitely not. This basically means network issues or such can lead to
connections being unkillable...

We know how to do better, c.f. libpqrcv_connect(). I hacked that up for
postgres_fdw, and got through quite a few runs without related issues ([1]).

The same problem is present in two places in dblink.c. Obviously we can copy
and paste the code to dblink.c as well. But that means we have the same not
quite trivial code in three different c files. There's already a fair bit of
duplicated code around AcquireExternalFD().

It seems we should find a place to put backend specific libpq wrapper code
somewhere. Unless we want to relax the rule about not linking libpq into the
backend we can't just put it in the backend directly, though.

The only alternative way to provide a wrapper that I can think of are to
a) introduce a new static library that can be linked to by libpqwalreceiver,
postgres_fdw, dblink
b) add a header with static inline functions implementing interrupt-processing
connection establishment for libpq

Neither really has precedent.

The attached quick patch just adds and uses libpq_connect_interruptible() in
postgres_fdw. If we wanted to move this somewhere generic, at least part of
the external FD handling should also be moved into it.

dblink.c uses a lot of other blocking libpq functions, which obviously also
isn't ok.

Perhaps we could somehow make this easier from within libpq? My first thought
was a connection parameter that'd provide a different implementation of
pqSocketCheck() or such - but I don't think that'd work, because we might
throw errors when processing interrupts, and that'd not be ok from deep within
libpq.

Greetings,

Andres Freund

[1] I eventually encountered a deadlock in REINDEX, but it didn't involve
postgres_fw / ProcSignalBarrier

Attachment Content-Type Size
postgres_fdw_libpq_connect.diff text/x-diff 2.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2022-12-09 02:15:38 Re: Perform streaming logical transactions by background workers and parallel apply
Previous Message Masahiko Sawada 2022-12-09 01:19:29 Re: [PoC] Improve dead tuple storage for lazy vacuum