Re: Disallow cancellation of waiting for synchronous replication

From: Maksim Milyutin <milyutinma(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Disallow cancellation of waiting for synchronous replication
Date: 2019-12-28 23:19:28
Message-ID: 4e3cac62-1f04-13fa-90c2-3bc071ac9ccb@gmail.com
Lists: pgsql-hackers

On 29.12.2019 00:55, Robert Haas wrote:

> On Fri, Dec 20, 2019 at 12:04 AM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>> Currently, we can have split brain with the combination of the following steps:
>> 0. Set up a cluster with synchronous replication. Isolate the primary from the standbys.
>> 1. Issue an upsert query INSERT .. ON CONFLICT DO NOTHING
>> 2. CANCEL 1 during the wait for synchronous replication
>> 3. Retry 1. The idempotent query will succeed and the client has confirmation of written data, while it is not present on any standby.
> All that being said, like Tom and Michael, I don't think teaching the
> backend to ignore cancels is the right approach. We have had
> innumerable problems over the years that were caused by the backend
> failing to respond to cancels when we would really have liked it to do
> so, and users REALLY hate it when you tell them that they have to shut
> down and restart (with crash recovery) the entire database because of
> a single stuck backend.
>

The backend is stuck here, but it is not deadlocked. To release the
waiting backend gracefully, it is enough for the client to turn off
synchronous replication (change synchronous_standby_names and reload the
server) or to switch to another, live synchronous replica (again by
changing synchronous_standby_names). In the first case the client
explicitly accepts the existence of local (not replicated) commits on
the master.
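
As a rough sketch of the two options, assuming a separate superuser
session on the primary and that the setting is managed with ALTER SYSTEM
rather than by editing postgresql.conf by hand (the standby name
'standby2' is only a placeholder for illustration):

```sql
-- Option 1: turn off synchronous replication entirely.
-- This accepts that commits already waiting exist only locally.
ALTER SYSTEM SET synchronous_standby_names = '';
SELECT pg_reload_conf();

-- Option 2: point synchronous replication at another, live standby.
-- 'standby2' is a hypothetical application_name of that standby.
ALTER SYSTEM SET synchronous_standby_names = 'standby2';
SELECT pg_reload_conf();
```

Either way, once the reload takes effect the backends waiting for
synchronous replication should be released without any cancel or restart.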

--
Best regards,
Maksim Milyutin
