Re: Disallow cancellation of waiting for synchronous replication

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Marco Slot <marco(at)citusdata(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, dim(at)tapoueh(dot)org
Subject: Re: Disallow cancellation of waiting for synchronous replication
Date: 2019-12-21 15:11:17
Message-ID: 8EE57880-2329-4F03-85D9-107FF419687D@yandex-team.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> 21 дек. 2019 г., в 2:19, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> написал(а):
>
> Andrey Borodin <x4mmm(at)yandex-team(dot)ru> writes:
>> I think proper solution here would be to add GUC to disallow cancellation of synchronous replication.
>
> This sounds entirely insane to me. There is no possibility that you
> can prevent a failure from occurring at this step.
Yes, we cannot prevent failure. If we wait here for a long time and someone cancels query, probably, node is failed. Database already lives in some other Availability Zone.
All we should do - refuse to commit anything here. Any committed data will be lost.

>> Three is still a problem when backend is not canceled, but terminated [2].
>
> Exactly. If you don't have a fix that handles that case, you don't have
> anything. In fact, you've arguably made things worse, by increasing the
> temptation to terminate or "kill -9" the nonresponsive session.
Currently, any Postgres HA solution can loose data when application issues INSERT ... ON CONFLICT DO NOTHING with retry. There is no need for any DBA mistake. Just a driver capable of issuing cancel on timeout.

Administrator issuing kill -9 is OK, database must shutdown to prevent splitbrain. Preferably, database should refuse to start after shutdown.
I'm not proposing this behavior as default. If administrator (or HA tool) configured DB in this mode - probably they know what they are doing.

> 21 дек. 2019 г., в 15:34, Marco Slot <marco(at)citusdata(dot)com> написал(а):
>
> On Fri, Dec 20, 2019 at 11:07 AM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>> I think changing synchronous_standby_names to some available standbys will resume all backends waiting for synchronous replication.
>> Do we need to check necessity of synchronous replication in any other case?
>
> The GUCs are not re-checked in the main loop in SyncRepWaitForLSN, so
> backends will remain stuck there even if synchronous replication has
> been (temporarily) disabled while they were waiting.
SyncRepInitConfig() will be called on SIGHUP. Backends waiting for synchronous replication will wake up on WAIT_EVENT_SYNC_REP and happily succeed.

> I do agree with the general sentiment that terminating the connection
> is preferable over sending a response to the client (except when
> synchronous replication was already disabled). Synchronous replication
> does not guarantee that a committed write is actually on any replica,
> but it does in general guarantee that a commit has been replicated
> before sending a response to the client. That's arguably more
> important because the rest of what the application might depend on the
> transaction completing and replicating successfully. I don't know of
> cases other than cancellation in which a response is sent to the
> client without replication when synchronous replication is enabled.
>
> The error level should be FATAL instead of PANIC, since PANIC restarts
> the database and I don't think there is a reason to do that.

Terminating connection is unacceptable, actually. Client will retry idempotent query. This query now do not need to write anything and will be committed.
We need to shutdown database and prevent it from starting. We should not acknowledge any data before synchronous replication configuration allows us.

When client tries to cancel his query - we refuse to do so and hold his write locks. If anyone terminate connection - locks will be released. It is better to shut down whole DB, then release these locks and continue to receive queries.

All this does not apply to simple cases when user accidentally enabled synchronous replication. This is a setup for quite sophisticated HA tool, which will rewind local database, when transient network partition will be over and old timeline is archived, and attach it to new primary.

Best regards, Andrey Borodin.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2019-12-21 15:13:43 Re: Optimizing TransactionIdIsCurrentTransactionId()
Previous Message Marco Slot 2019-12-21 10:34:05 Re: Disallow cancellation of waiting for synchronous replication