Re: Avoiding data loss with synchronous replication

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: "Bossart, Nathan" <bossartn(at)amazon(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Avoiding data loss with synchronous replication
Date: 2021-07-23 11:22:42
Message-ID: eba1bc66065a684f77035c348a7bec7db7df8ca6.camel@cybertec.at
Lists: pgsql-hackers

On Thu, 2021-07-22 at 21:17 +0000, Bossart, Nathan wrote:
> As previously discussed [0], canceling synchronous replication waits
> can have the unfortunate side effect of making transactions visible on
> a primary server before they are replicated. A failover at this time
> would cause such transactions to be lost.
>
> AFAICT there are a variety of ways that the aforementioned problem may
> occur:
> 4. Query cancellations and backend terminations: This appears to be
> the only gap where there is no way to avoid potential data loss,
> and it is the main target of my proposal.
>
> Instead of blocking query cancellations and backend terminations, I
> think we should allow them to proceed, but we should keep the
> transactions marked in-progress so they do not yet become visible to
> sessions on the primary. Once replication has caught up to the
> necessary point, the transactions can be marked completed, and
> they would finally become visible.
>
> The main advantages of this approach are 1) it still allows for
> canceling waits for synchronous replication and 2) it provides an
> opportunity to view and manage waits for synchronous replication
> outside of the standard cancellation/termination functionality. The
> tooling for 2 could even allow a session to begin waiting for
> synchronous replication again if it "inadvertently interrupted a
> replication wait..." [4]. I think the main disadvantage of this
> approach is that transactions committed by a session may not be
> immediately visible to the session when the command returns after
> canceling the wait for synchronous replication. Instead, the
> transactions would become visible in the future once the change is
> replicated. This may cause problems for an application if it doesn't
> handle this scenario carefully.
>
> What are folks' opinions on this idea? Is this something that is
> worth prototyping?

But that would mean that changes ostensibly rolled back (because the
cancel request succeeded) will later turn out to be committed after all,
just as happens now, only later. Where is the advantage?
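
To make the objection concrete, here is a toy timeline in Python. It is
only a sketch of the two behaviors as described in this thread, not
PostgreSQL code: ToyTransaction, visible_today and visible_proposed are
hypothetical names, and comparing a single replicated LSN against the
commit LSN is a deliberate simplification.

```python
from dataclasses import dataclass


@dataclass
class ToyTransaction:
    """Hypothetical stand-in for a transaction waiting on a synchronous standby."""
    commit_lsn: int               # WAL position of the commit record (toy integer)
    wait_cancelled: bool = False  # True once the client cancelled the wait


def visible_today(txn: ToyTransaction, replicated_lsn: int) -> bool:
    """Behavior described in the thread: cancelling the wait makes the
    transaction visible on the primary at once, replicated or not."""
    return txn.wait_cancelled or replicated_lsn >= txn.commit_lsn


def visible_proposed(txn: ToyTransaction, replicated_lsn: int) -> bool:
    """Proposed behavior: the transaction stays marked in-progress until
    the standby has caught up to its commit record, cancelled or not."""
    return replicated_lsn >= txn.commit_lsn


def timeline(txn: ToyTransaction, replicated_lsns):
    """Return (replicated_lsn, visible_today, visible_proposed) per step."""
    return [
        (lsn, visible_today(txn, lsn), visible_proposed(txn, lsn))
        for lsn in replicated_lsns
    ]
```

With ToyTransaction(commit_lsn=100, wait_cancelled=True), the transaction
is visible under today's rule while the standby is still at 90, and under
the proposed rule only once the standby reaches 100. In both cases the
client that cancelled ends up with a committed transaction; the proposal
changes when it shows up, not whether it does.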

Besides, there is no room for another transaction status in the
commit log.
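
For readers less familiar with the commit log, a minimal Python model of
why there is no room: each transaction gets two bits, and all four
values are already assigned. The constant names and values mirror those
in PostgreSQL's clog.h and clog.c; the set_status, get_status and
free_status_codes helpers are hypothetical, and the flat bytearray
ignores the real paging through SLRU buffers.

```python
# Illustrative model only; not PostgreSQL source code.

# Status values as defined in PostgreSQL's clog.h.
TRANSACTION_STATUS_IN_PROGRESS = 0x00
TRANSACTION_STATUS_COMMITTED = 0x01
TRANSACTION_STATUS_ABORTED = 0x02
TRANSACTION_STATUS_SUB_COMMITTED = 0x03

# Layout constants as defined in clog.c.
CLOG_BITS_PER_XACT = 2
CLOG_XACTS_PER_BYTE = 8 // CLOG_BITS_PER_XACT       # 4 transactions per byte
CLOG_XACT_BITMASK = (1 << CLOG_BITS_PER_XACT) - 1   # 0b11


def set_status(page: bytearray, xid: int, status: int) -> None:
    """Store a two-bit status for xid in the toy page (hypothetical helper)."""
    byte_no = xid // CLOG_XACTS_PER_BYTE
    shift = (xid % CLOG_XACTS_PER_BYTE) * CLOG_BITS_PER_XACT
    cleared = page[byte_no] & ~(CLOG_XACT_BITMASK << shift) & 0xFF
    page[byte_no] = cleared | ((status & CLOG_XACT_BITMASK) << shift)


def get_status(page: bytearray, xid: int) -> int:
    """Read back the two-bit status for xid (hypothetical helper)."""
    byte_no = xid // CLOG_XACTS_PER_BYTE
    shift = (xid % CLOG_XACTS_PER_BYTE) * CLOG_BITS_PER_XACT
    return (page[byte_no] >> shift) & CLOG_XACT_BITMASK


def free_status_codes() -> list:
    """List two-bit values not yet assigned to a transaction status."""
    used = {
        TRANSACTION_STATUS_IN_PROGRESS,
        TRANSACTION_STATUS_COMMITTED,
        TRANSACTION_STATUS_ABORTED,
        TRANSACTION_STATUS_SUB_COMMITTED,
    }
    return [v for v in range(1 << CLOG_BITS_PER_XACT) if v not in used]
```

free_status_codes() returns an empty list, so a new state such as
"committed but not yet replicated" would need either a wider per-transaction
field or a place outside the commit log to live.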

Yours,
Laurenz Albe
