Re: Race conditions with checkpointer and shutdown

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Race conditions with checkpointer and shutdown
Date: 2019-04-29 04:52:37
Message-ID: CA+hUKG+BXTnbRp5zUZJAKqnEi5ZD2WkUjb1sYxTjkSi79Pqe6A@mail.gmail.com
Lists: pgsql-hackers

On Sun, Apr 28, 2019 at 12:56 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Even if that isn't the proximate cause of the current reports, it's
> clearly trouble waiting to happen, and we should get rid of it.
> Accordingly, see attached proposed patch. This just flushes the
> "immediate interrupt" stuff in favor of making sure that
> libpqwalreceiver.c will take care of any signals received while
> waiting for input.

+1

I see that we removed the code that this was modelled on back in 2015,
and in fact your patch even removes a dangling reference in a comment:

- * This is very much like what regular backends do with ImmediateInterruptOK,

> The existing code does not use PQsetnonblocking, which means that it's
> theoretically at risk of blocking while pushing out data to the remote
> server. In practice I think that risk is negligible because (IIUC) we
> don't send very large amounts of data at one time. So I didn't bother to
> change that. Note that for the most part, if that happened, the existing
> code was at risk of slow response to SIGTERM anyway since it didn't have
> Enable/DisableWalRcvImmediateExit around the places that send data.

Right.
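
For reference, the send-side risk described above looks roughly like this when written against a bare file descriptor. This is only a sketch under stated assumptions: it uses write() and poll() directly rather than libpq, and stop_requested and send_all_nonblocking are hypothetical names. With the descriptor in non-blocking mode, a full kernel buffer yields EAGAIN instead of a stall, so the loop gets a chance to recheck the flag.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical flag, assumed to be set elsewhere by a SIGTERM handler. */
static volatile sig_atomic_t stop_requested = 0;

/*
 * Write len bytes to a non-blocking fd.
 * Returns 0 once everything was sent, -1 on error or if stop_requested
 * became set before the data was fully written.
 */
static int
send_all_nonblocking(int fd, const char *buf, size_t len)
{
    size_t sent = 0;

    while (sent < len)
    {
        ssize_t n;

        if (stop_requested)
            return -1;

        n = write(fd, buf + sent, len - sent);
        if (n > 0)
        {
            sent += (size_t) n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        {
            struct pollfd pfd;

            pfd.fd = fd;
            pfd.events = POLLOUT;
            pfd.revents = 0;

            /* Bounded wait (100 ms) so the flag is rechecked regularly. */
            if (poll(&pfd, 1, 100) < 0 && errno != EINTR)
                return -1;
            continue;
        }
        return -1;              /* hard error, or write() returned 0 */
    }
    return 0;
}
```

With a blocking descriptor the same write() could sit in the kernel for as long as the peer refuses to read, which is the slow-response-to-SIGTERM case mentioned in the quoted text.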

> My thought is to apply this only to HEAD for now; it's kind of a large
> change to shove into the back branches to handle a failure mode that's
> not been reported from the field. Maybe we could back-patch after we
> have more confidence in it.

+1

That reminds me, we should probably also clean up at least the
ereport-from-signal-handler hazard identified over in this thread:

https://www.postgresql.org/message-id/CAEepm%3D10MtmKeDc1WxBM0PQM9OgtNy%2BRCeWqz40pZRRS3PNo5Q%40mail.gmail.com

--
Thomas Munro
https://enterprisedb.com
