Re: Race condition in FetchTableStates() breaks synchronization of subscription tables

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Race condition in FetchTableStates() breaks synchronization of subscription tables
Date: 2024-02-06 13:00:00
Message-ID: a50d1da6-e3fb-1f06-0ae4-8e038d1c4d85@gmail.com
Lists: pgsql-hackers

05.02.2024 13:13, vignesh C wrote:
> Thanks for the steps for the issue, I was able to reproduce this issue
> in my environment with the steps provided. The attached patch has a
> proposed fix where the latch will not be set in case of the apply
> worker exiting immediately after starting.

It looks like the proposed fix doesn't help when ApplyLauncherWakeup() is
called by a backend executing a CREATE SUBSCRIPTION command.
That is, with the v4-0002 patch applied and pg_usleep(300000L); added
just below
            if (!worker_in_use)
                return worker_in_use;
I still observe the test 027_nosuperuser running for 3+ minutes:
t/027_nosuperuser.pl .. ok
All tests successful.
Files=1, Tests=19, 187 wallclock secs ( 0.01 usr  0.00 sys +  4.82 cusr  4.47 csys =  9.30 CPU)

IIUC, this happens because the launcher wakeup call sent by "CREATE SUBSCRIPTION
regression_sub ..." is missed while the launcher is waiting for another
worker to start (the logical replication worker for subscription "admin_sub"),
which was launched just before that command.
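
The suspected sequence can be sketched with a toy model. This is not the
launcher code: the one-bit latch, every toy_* name, and the TOY_NAP_MS value
are hypothetical, chosen only to show how a wakeup that arrives during the
inner "wait for worker start" step could be consumed there, leaving the main
loop to sleep for its full nap.

```c
#include <stdbool.h>

/* Toy one-bit latch: repeated sets collapse into one; a reset clears it. */
typedef struct { bool is_set; } ToyLatch;

static void toy_set_latch(ToyLatch *l)   { l->is_set = true; }
static void toy_reset_latch(ToyLatch *l) { l->is_set = false; }

/* Hypothetical full nap of the main loop, in milliseconds (an assumption). */
#define TOY_NAP_MS 180000L

/*
 * Inner wait: the launcher waits for a just-launched worker to start.
 * If wakeup_arrives is true, another backend sets the latch during this
 * wait. The inner wait resets the latch before returning, so that wakeup
 * is consumed here and never seen by the main loop.
 */
static void toy_wait_for_worker_start(ToyLatch *l, bool wakeup_arrives)
{
    if (wakeup_arrives)
        toy_set_latch(l);
    if (l->is_set)
        toy_reset_latch(l);
}

/* Outer wait: returns how long the main loop sleeps before rescanning. */
static long toy_main_loop_wait(ToyLatch *l)
{
    if (l->is_set)
    {
        toy_reset_latch(l);
        return 0;               /* latch already set: rescan immediately */
    }
    return TOY_NAP_MS;          /* nothing pending: sleep the full nap */
}

/* Delay before the launcher notices the new subscription in this model. */
static long toy_delay_until_rescan(bool wakeup_during_worker_start)
{
    ToyLatch latch = { false };

    toy_wait_for_worker_start(&latch, wakeup_during_worker_start);
    if (!wakeup_during_worker_start)
        toy_set_latch(&latch);  /* wakeup arrives after the inner wait */
    return toy_main_loop_wait(&latch);
}
```

In this model toy_delay_until_rescan(true) yields the full nap, while
toy_delay_until_rescan(false) yields 0, which is consistent with the 3+
minute run above if the wakeup lands inside the inner wait.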

Best regards,
Alexander
