Re: Disallow cancellation of waiting for synchronous replication

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Maksim Milyutin <milyutinma(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Disallow cancellation of waiting for synchronous replication
Date: 2020-01-02 17:26:16
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> 2 янв. 2020 г., в 19:13, Robert Haas <robertmhaas(at)gmail(dot)com> написал(а):
> On Sun, Dec 29, 2019 at 4:13 AM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>> Not loosing data - is a nice property of the database either.
> Sure, but there's more than one way to fix that problem, as I pointed
> out in my first response.
Sorry, it took some more reading iterations of your message for me to understand the problem you are writing about.

You proposed two solutions:
1. Client analyze warning an understand that data is not actually committed. This, as you pointed out, does not solve the problem: data is lost for another client, who never saw the warning.
Actually, "client" is a stateless number of connections unable to communicate with each other by any means beside database. They cannot share information about not committed transactions (they would need a database, thus chicken and the egg problem).

2. Add another message "CANCEL --force" to stop synchronous replication for specific backend.
We already have a way to stop synchronous replication "alter system set synchronous_standby_names to ''; select pg_reload_conf();". This will stop it for every backend, but "CANCEL --force" will be more picky.
User still can loose data when they issue idempotent query based on data, committed by "CANCEL --force". Moreover, user can loose data if his upsert is based on data committed by someone else with "set synchronous_commit to off".
We could fix upserts: make them wait for replication even if nothing was changed, but this will not cover the case when user is doing SELECT and decides not to insert anything.
We can fix SELECT: if user asks for synchronous_commit=remote_write - give him snapshot no newer than synchronously committed data. ISTM this would solve all above problems, but I do not see implications of this approach. We should add all XIDs to XIP if their commit LSN > sync rep LSN. But I'm not sure all other transactional mechanics will be OK with this.

From practical point of view - when all writing queries use same synchronous_commit level - easiest solution is to just disallow cancel of sync replication. In psql we can just reset connection on second CTRL+C. That's more generic than "CANCEL --force".

When all queries runs with same synchronous_commit there is no point in protocol message for canceling sync rep for single connection. Just drop that connection. Ignoring cancel is the only way to satisfy synchronous_commit level, which is constant for transaction.
When queries run in various synchronous_commit - things are much more complicated. Adding protocol message to change synchronous_commit for running queries does not seems to be a viable option.

> I continue to think that the root cause of this issue is that we can't
> distinguish between cancelling the query and cancelling the sync rep
> wait.
Yes, it is. But canceling sync rep wait exists already. Just change synchronous_stanby_names. Canceling sync rep for one client - is, effectively, changing synchronous commit level for running transaction. It opens a way for way more difficult complications.

> The client in this case is asking for both when it really only
> wants the former, and then ignoring the warning that the latter is
> what actually occurred.
Client is not ignoring warnings. Data is lost for the client which never received warning. If we could just fix our code, I would not be making so much noise. There are workarounds, but they are very pleasant to explain.

Best regards, Andrey Borodin.

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2020-01-02 17:37:13 Re: Building infrastructure for B-Tree deduplication that recognizes when opclass equality is also equivalence
Previous Message rmrodriguez 2020-01-02 17:18:54 Re: avoid some calls to memset with array initializer