RE: Race condition in FetchTableStates() breaks synchronization of subscription tables

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Race condition in FetchTableStates() breaks synchronization of subscription tables
Date: 2024-03-13 04:42:42
Message-ID: OS0PR01MB5716E39F0722A1152640AE99942A2@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wednesday, March 13, 2024 11:49 AMvignesh C <vignesh21(at)gmail(dot)com> wrote:
> On Tue, 12 Mar 2024 at 09:34, Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> >
> >
> >
> > On Tue, Mar 12, 2024 at 2:59 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >>
> >>
> >> Thanks, I have created the following Commitfest entry for this:
> >> https://commitfest.postgresql.org/47/4816/
> >>
> >> Regards,
> >> Vignesh
> >
> >
> > Thanks for the patch, I have verified that the fix works well by following the
> steps mentioned to reproduce the problem.
> > Reviewing the patch, it seems good and is well documented. Just one minor
> comment I had was probably to change the name of the variable
> table_states_valid to table_states_validity. The current name made sense when
> it was a bool, but now that it is a tri-state enum, it doesn't fit well.
>
> Thanks for reviewing the patch, the attached v6 patch has the changes for the
> same.

Thanks for the patches.

I saw a recent similar BF error[1] which seems related to the issue that 0001
patch is trying to solve. i.e. The table sync worker is somehow not started
after refreshing the publication on the subscriber. I didn't see other related ERRORs in
the log, so I think the reason is the same as the one being discussed in this
thread, which is the table state invalidation got lost. And the 0001 patch
looks good to me.

For 0002, instead of avoid resetting the latch, is it possible to let the
logical rep worker wake up the launcher once after attaching ?

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=francolin&dt=2024-03-11%2000%3A52%3A42

Best Regards,
Hou zj

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-03-13 05:07:44 Re: A failure in t/038_save_logical_slots_shutdown.pl
Previous Message Craig Ringer 2024-03-13 04:28:55 Re: POC: Extension for adding distributed tracing - pg_tracing