From: | vignesh C <vignesh21(at)gmail(dot)com> |
---|---|
To: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
Cc: | Ajin Cherian <itsajin(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Race condition in FetchTableStates() breaks synchronization of subscription tables |
Date: | 2024-03-13 06:28:38 |
Message-ID: | CALDaNm1XeB3bF+VEJZi=BT31PZAL_UVys-26+YSv_AxCq0G2eg@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, 13 Mar 2024 at 10:12, Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Wednesday, March 13, 2024 11:49 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > On Tue, 12 Mar 2024 at 09:34, Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> > >
> > >
> > >
> > > On Tue, Mar 12, 2024 at 2:59 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > >>
> > >>
> > >> Thanks, I have created the following Commitfest entry for this:
> > >> https://commitfest.postgresql.org/47/4816/
> > >>
> > >> Regards,
> > >> Vignesh
> > >
> > >
> > > Thanks for the patch, I have verified that the fix works well by following the
> > > steps mentioned to reproduce the problem.
> > > Reviewing the patch, it seems good and is well documented. Just one minor
> > > comment I had was probably to change the name of the variable
> > > table_states_valid to table_states_validity. The current name made sense when
> > > it was a bool, but now that it is a tri-state enum, it doesn't fit well.
> >
> > Thanks for reviewing the patch, the attached v6 patch has the changes for the
> > same.
>
> Thanks for the patches.
>
> I saw a recent similar BF error[1] which seems related to the issue that 0001
> patch is trying to solve. i.e. The table sync worker is somehow not started
> after refreshing the publication on the subscriber. I didn't see other related ERRORs in
> the log, so I think the reason is the same as the one being discussed in this
> thread, which is the table state invalidation got lost. And the 0001 patch
> looks good to me.
>
> For 0002, instead of avoid resetting the latch, is it possible to let the
> logical rep worker wake up the launcher once after attaching ?
Waking up the launcher process uses the same latch that is used for
subscription creation/modification and for apply worker process exit.
Since the handling of this latch for those events can be done only by
ApplyLauncherMain, we will not be able to reset the latch in
WaitForReplicationWorkerAttach. I feel that having the worker wake up
the launcher process after attaching might not help in this case,
because the launcher currently cannot differentiate between a worker
attaching, a subscription being created/modified, and an apply worker
process exiting.
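
To illustrate the point, here is a toy sketch in C. It is not PostgreSQL
code and does not use the real latch API: ToyLatch, WakeReason and the
toy_* functions are hypothetical names made up for this example. It only
models the idea that a latch is a single flag shared by all wake sources,
so a nested wait that resets it would also discard a wakeup meant for the
main loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a process latch: one flag, no payload. */
typedef struct ToyLatch
{
    bool is_set;
} ToyLatch;

/* Hypothetical wake sources that all share the same latch. */
typedef enum WakeReason
{
    WAKE_SUBSCRIPTION_CHANGED,
    WAKE_WORKER_EXITED,
    WAKE_WORKER_ATTACHED
} WakeReason;

/* Setting the latch only raises the flag; the reason is not recorded. */
static void
toy_set_latch(ToyLatch *latch, WakeReason reason)
{
    (void) reason;              /* intentionally dropped */
    latch->is_set = true;
}

/* Resetting clears the flag for every wake source at once. */
static void
toy_reset_latch(ToyLatch *latch)
{
    latch->is_set = false;
}

/*
 * Simulate a nested wait for worker attach, during which a subscription
 * change also arrives.  Returns whether the main loop would still see a
 * pending wakeup once the nested wait has finished.
 */
static bool
main_loop_sees_wakeup(bool nested_wait_resets)
{
    ToyLatch latch = {false};

    toy_set_latch(&latch, WAKE_SUBSCRIPTION_CHANGED);
    toy_set_latch(&latch, WAKE_WORKER_ATTACHED);

    /* The nested wait cannot tell which of the two events set the flag. */
    if (nested_wait_resets)
        toy_reset_latch(&latch);

    return latch.is_set;
}
```

In this model, main_loop_sees_wakeup(false) keeps the wakeup pending for
the main loop, while main_loop_sees_wakeup(true) loses the subscription
change together with the attach notification.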
Regards,
Vignesh