Re: Race conditions with checkpointer and shutdown

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Race conditions with checkpointer and shutdown
Date: 2019-04-29 16:35:11
Message-ID: 20190429163511.7rbdb7gmlz634dc4@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2019-04-27 20:56:51 -0400, Tom Lane wrote:
> Even if that isn't the proximate cause of the current reports, it's
> clearly trouble waiting to happen, and we should get rid of it.
> Accordingly, see attached proposed patch. This just flushes the
> "immediate interrupt" stuff in favor of making sure that
> libpqwalreceiver.c will take care of any signals received while
> waiting for input.

Good plan.

> The existing code does not use PQsetnonblocking, which means that it's
> theoretically at risk of blocking while pushing out data to the remote
> server. In practice I think that risk is negligible because (IIUC) we
> don't send very large amounts of data at one time. So I didn't bother to
> change that. Note that for the most part, if that happened, the existing
> code was at risk of slow response to SIGTERM anyway since it didn't have
> Enable/DisableWalRcvImmediateExit around the places that send data.

Hm, I'm not convinced that's OK. What if there's a network hiccup? We'll
wait until there's an OS TCP timeout, no? It's bad enough that there
were cases of this before. Increasing the number of cases where we
might want to shut down walreceiver, e.g. because we would rather switch
to recovery_command, or just shut down the server, but instead get stuck
waiting for an hour for a TCP timeout, doesn't seem OK.
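
To illustrate the concern, here is a minimal sketch of a send loop that
cannot get stuck in a blocking write. It uses plain POSIX sockets rather
than libpq, so it is only an analogy for what PQsetnonblocking() plus
PQflush() would allow in libpqwalreceiver.c, not the patch under
discussion. The names shutdown_requested, handle_sigterm,
set_nonblocking and send_interruptible are hypothetical and exist only
for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical flag, set from a signal handler (e.g. for SIGTERM). */
static volatile sig_atomic_t shutdown_requested = 0;

static void
handle_sigterm(int signo)
{
	(void) signo;
	shutdown_requested = 1;
}

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
static int
set_nonblocking(int fd)
{
	int			flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) ? -1 : 0;
}

/*
 * Send len bytes on a non-blocking socket.
 *
 * Returns 0 once everything was handed to the kernel, 1 if a shutdown
 * request arrived first, and -1 on a hard error.  When the kernel send
 * buffer is full we sleep in poll() for at most poll_ms milliseconds,
 * so the shutdown flag is rechecked regularly instead of waiting for
 * an OS-level TCP timeout.
 */
static int
send_interruptible(int fd, const char *buf, size_t len, int poll_ms)
{
	size_t		sent = 0;

	assert(buf != NULL);

	while (sent < len)
	{
		ssize_t		n;
		struct pollfd pfd;

		if (shutdown_requested)
			return 1;

		n = send(fd, buf + sent, len - sent, 0);
		if (n > 0)
		{
			sent += (size_t) n;
			continue;
		}
		if (n < 0 && errno == EINTR)
			continue;			/* interrupted: recheck the flag */
		if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
			return -1;

		/* Send buffer is full: wait, bounded, until writable again. */
		pfd.fd = fd;
		pfd.events = POLLOUT;
		pfd.revents = 0;
		if (poll(&pfd, 1, poll_ms) < 0 && errno != EINTR)
			return -1;
	}
	return 0;
}
```

With a blocking socket, the equivalent send() call could sit in the
kernel until the peer drains data or the connection times out. Here the
worst-case delay before noticing a shutdown request is roughly poll_ms.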

Greetings,

Andres Freund
