RE: Synchronizing slots from primary to standby

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: RE: Synchronizing slots from primary to standby
Date: 2024-04-02 08:41:21
Message-ID: OS0PR01MB57168CDFE67D9ABDA24FF8DB943E2@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tuesday, April 2, 2024 3:21 PM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> On Tuesday, April 2, 2024 8:35 AM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Monday, April 1, 2024 7:30 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> > wrote:
> > >
> > > On Mon, Apr 1, 2024 at 6:26 AM Zhijie Hou (Fujitsu)
> > > <houzj(dot)fnst(at)fujitsu(dot)com>
> > > wrote:
> > > >
> > > > On Friday, March 29, 2024 2:50 PM Amit Kapila
> > > > <amit(dot)kapila16(at)gmail(dot)com>
> > > wrote:
> > > > >
> > > >
> > > > >
> > > > >
> > > > > 2.
> > > > > +extern XLogRecPtr
> > > > > +pg_logical_replication_slot_advance(XLogRecPtr
> > > moveto,
> > > > > + bool *found_consistent_point);
> > > > > +
> > > > >
> > > > > This API looks a bit awkward as the functionality doesn't match
> > > > > the name. How about having a function with name
> > > > > LogicalSlotAdvanceAndCheckReadynessForDecoding(moveto,
> > > > > ready_for_decoding) with the same functionality as your patch
> > > > > has for
> > > > > pg_logical_replication_slot_advance() and then invoke it both
> > > > > from pg_logical_replication_slot_advance and slotsync.c. The
> > > > > function name is too big, we can think of a shorter name. Any ideas?
> > > >
> > > > How about LogicalSlotAdvanceAndCheckDecodingState() Or just
> > > > LogicalSlotAdvanceAndCheckDecoding()?
> > > >
> > >
> > > It is about snapbuild state, so how about naming the function as
> > > LogicalSlotAdvanceAndCheckSnapState()?
> >
> > It looks better to me, so changed.
> >
> > >
> > > I have made quite a few cosmetic changes in comments and code. See
> > > attached. This is atop your latest patch. Can you please review and
> > > include these changes in the next version?
> >
> > Thanks, I have reviewed and merged them.
> > Attach the V5 patch set which addressed above comments and ran pgindent.
>
> I added one test in 040_standby_failover_slots_sync.pl in 0002 patch, which can
> reproduce the data loss issue consistently on my machine. It may not
> reproduce in some rare cases if concurrent xl_running_xacts are written by
> bgwriter, but I think it's still valuable if it can verify the fix in most cases. The test
> will fail if directly applied on HEAD, and will pass after applying atop of 0001.

CFbot[1] complained about one query result's order in the tap-test, so I am
attaching a V7 patch set which fixed this. There are no changes in 0001.

[1] https://cirrus-ci.com/task/6375962162495488

Best Regards,
Hou zj

Attachment Content-Type Size
v7-0002-test-the-data-loss-case.patch application/octet-stream 4.9 KB
v7-0001-advance-the-restart_lsn-of-synced-slots-using-log.patch application/octet-stream 22.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-04-02 08:49:30 Re: Synchronizing slots from primary to standby
Previous Message Thomas Munro 2024-04-02 08:39:49 Re: Streaming I/O, vectored I/O (WIP)