Re: Time delayed LR (WAS Re: logical replication restrictions)

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Yu Shi (Fujitsu)" <shiy(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "vignesh21(at)gmail(dot)com" <vignesh21(at)gmail(dot)com>, "shveta(dot)malik(at)gmail(dot)com" <shveta(dot)malik(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, "euler(at)eulerto(dot)com" <euler(at)eulerto(dot)com>, "m(dot)melihmutlu(at)gmail(dot)com" <m(dot)melihmutlu(at)gmail(dot)com>, "marcos(at)f10(dot)com(dot)br" <marcos(at)f10(dot)com(dot)br>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: Time delayed LR (WAS Re: logical replication restrictions)
Date: 2023-05-10 12:05:25
Message-ID: CALj2ACXePMrQF894xZH3zy4i-3VK-ufxvEdUAMRGg=iUcJ348w@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 28, 2023 at 2:35 PM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Dear hackers,
>
> I rebased and refined my PoC. Followings are the changes:

Thanks.

Apologies for being late here. Please bear with me if I'm repeating
any of the discussed points.

I'm mainly trying to understand the production-level use case behind
this feature, and for that matter, recovery_min_apply_delay. AFAIK,
people try to keep replication lag as low as possible, i.e. near
zero, to avoid the extreme problems on production servers: WAL file
growth, blocked vacuum, crash and downtime.

The proposed feature's commit message and the existing docs about
recovery_min_apply_delay justify the feature as 'offering opportunities
to correct data loss errors'. If someone wants to enable
recovery_min_apply_delay/min_apply_delay on production servers, I'm
guessing the values will be in hours, not minutes, for the simple
reason that when data loss occurs, the people/infrastructure monitoring
postgres need to detect it first and then need time to respond with
corrective actions to recover from it. When these parameters are
set, the primary server mustn't generate too much WAL, or it risks an
eventual crash/downtime. Who would really want to be so defensive
against somebody who may or may not accidentally cause data loss that
they enable these features on production servers (especially when
these can take down the primary server) and live happily with the
induced replication lag?

AFAIK, PITR is what people use for recovering from data loss errors in
production.

IMO, before we even go implement the apply delay feature for logical
replication, it's worth understanding whether induced replication lags
have any production-level significance. We can also debate whether
providing apply delay hooks, with simple out-of-the-box extensions
built on them, is any better than the core providing these features.

Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
