RE: Conflict detection for update_deleted in logical replication

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: RE: Conflict detection for update_deleted in logical replication
Date: 2025-07-07 10:21:35
Message-ID: OS0PR01MB5716EDE22A462A8CD8CDD01A944FA@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Mon, Jul 7, 2025 at 11:03 AM Zhijie Hou (Fujitsu) wrote:
>
> On Sun, Jul 6, 2025 at 10:51 PM Masahiko Sawada wrote:
> > > ========================================================================
> > > The workload is mostly the same as [4].
> > >
> > > Workload:
> > > - Initially ran pgbench with 40 clients on *both sides*.
> > > - Set max_conflict_retention_duration = {60, 120}
> > > - When the slot is invalidated on the subscriber side, stop the benchmark
> > >   and wait until the subscriber has caught up. Then halve the number of
> > >   clients on the publisher.
> > >   In this test the conflict slot could be invalidated as expected when the
> > >   workload on the publisher was high, and it would not get invalidated
> > >   anymore after reducing the workload. This shows that even if the slot has
> > >   been invalidated once, users can continue to detect the update_deleted
> > >   conflict by reducing the workload on the publisher.
> > > - Total period of the test was 900s for each case.
> > >
> > > (max_conflixt.tar.gz can run the same workload)
> > >
> > > Observation:
> > > - Parallelism on the publisher side is reduced through 15->7->3, at which
> > >   point the conflict slot is no longer invalidated.
> > > - TPS on the subscriber side improves when the concurrency is reduced.
> > >   This is because dead tuple accumulation on the subscriber is reduced
> > >   due to the reduced workload on the publisher.
> > > - When the publisher has Nclients=3, there is no regression in the
> > >   subscriber's TPS.
> >
> > I think that users typically cannot control the amount of workload in
> > production, meaning that once the performance regression starts to
> > happen the subscriber could enter a loop of invalidating the slot,
> > recovering the performance, creating the slot, and having the
> > performance problem again.
>
> Yes, you are right. The test is designed to demonstrate that the slot can be
> invalidated under high workload conditions as expected, while remaining valid
> if the workload is reduced. In production systems where workload reduction
> may not be possible, it’s recommended to disable `retain_conflict_info` to
> enhance performance. This decision involves balancing the need for reliable
> conflict detection with optimal system performance.
>
> I think hot standby feedback has a similar impact on the performance of the
> primary: it prevents the early removal of data needed by the standby,
> ensuring that the data remains accessible when needed.

For reference, we conducted a test [1] to evaluate the impact of enabling hot
standby feedback in a physical replication setup and observed approximately
a 50% regression in TPS on the primary as well.
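
As a rough illustration of the quoted test procedure (this is not the actual
test script), the "halve the publisher's clients each time the slot is
invalidated" loop could be sketched as below. The function name
`run_halving_test` and the callback `slot_invalidated_at` are hypothetical
stand-ins; the callback replaces "run the benchmark with n publisher clients
and report whether the conflict slot was invalidated", and integer halving is
assumed because it matches the reported 15->7->3 sequence.

```python
from typing import Callable, List


def run_halving_test(
    initial_clients: int,
    slot_invalidated_at: Callable[[int], bool],
) -> List[int]:
    """Return the publisher client counts tried, in order.

    slot_invalidated_at(n) is a hypothetical stand-in for running the
    benchmark with n publisher clients and checking whether the conflict
    slot was invalidated during that run.
    """
    tried: List[int] = []
    clients = initial_clients
    while clients >= 1:
        tried.append(clients)
        if not slot_invalidated_at(clients):
            # The slot stayed valid at this load, so stop reducing.
            break
        # In the real test: stop the benchmark and wait for the subscriber
        # to catch up (not modeled here), then halve the publisher clients.
        clients //= 2
    return tried


if __name__ == "__main__":
    # Placeholder predicate: pretend the slot is invalidated whenever the
    # publisher runs more than 3 clients.
    print(run_halving_test(15, lambda n: n > 3))
```

The `n > 3` predicate is only a placeholder chosen to mirror the observation
above; in a real run the answer would come from checking the slot state on
the subscriber.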

[1] https://www.postgresql.org/message-id/CABdArM4OEwmh_31dQ8_F__VmHwk2ag_M%3DYDD4H%2ByYQBG%2BbHGzg%40mail.gmail.com

Best Regards,
Hou zj
