Re: BUG #18988: DROP SUBSCRIPTION locks not-yet-accessed database

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Álvaro Herrera <alvherre(at)kurilemu(dot)de>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "exclusion(at)gmail(dot)com" <exclusion(at)gmail(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #18988: DROP SUBSCRIPTION locks not-yet-accessed database
Date: 2025-08-05 06:21:39
Message-ID: CAFiTN-uWQgNa+4F8D=TEpeib-ouqRiMZTybwGd3wjRNLik5fdQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Mon, Aug 4, 2025 at 7:08 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Mon, Aug 4, 2025 at 6:20 PM Álvaro Herrera <alvherre(at)kurilemu(dot)de> wrote:
> >
> > On 2025-Aug-04, Dilip Kumar wrote:
> >
> > > I have worked on this and produced a first version of patch, let's see
> > > what others think about this idea. It would have been better if we
> > > could use SysCache for rechecking the subscription, but since we are
> > > not connected to the database in the launcher we can not use the
> > > SysCache, at least that's what I think.
> >
> > I think it's reasonable to recheck after locking. There's a comment in
> > DropSubscription that says we get AEL, which is no longer true.
>
> Right, will remove that.
>
> In
> > is_subscription_exists() you should use the index on OID instead of
> > seqscanning the catalog without a scankey;
>
> I thought since launcher is not connected to the database we will not
> be able to open the index relation. Otherwise we may just call
> SearchSysCache1(SUBSCRIPTIONOID, ObjectIdGetDatum(subid)); Maybe this
> is not connected because it was not required so far and we can just
> connect it to template1 ?
>
>
> also I think the name ought
> > to be "does" rather than "is".
>
> Okay
>
> I think it's really odd that that
> > function opens and closes a transaction; sounds to me that something
> > like that really belongs in the caller (frankly the same is true with
> > the other function that your comment references). Why isn't
> > systable_beginscan being used to scan the catalog?
>
> You mean for this function or for get_subscription_list() as well,
> yeah logically systable_beginscan() sounds better.
>
> > I think with this coding, the resource owner for this new lock is NULL.
> > Is this really a good approach? Maybe there should be a resowner here.
>
> As you suggested we should move the transaction to the caller and
> start it before LockSharedObject() so that we will acquire the lock
> under the TopTransactionResourceOwner ?

Here is revised version based on what I proposed here

- I have removed the comment in DropSubscription where we acquire the
lock, as mentioning the ASL is not interesting anymore, instead I am
explaining in launcher why we are acquiring shared object lock.
- Connected launcher to "postgres" database so that we can do syscache lookup
- got rid of "is_subscription_exists" as we are directly validating
using syscache lookup
- Didn't do anything with existing function i.e. get_subscription_list

--
Regards,
Dilip Kumar
Google

Attachment Content-Type Size
v1-0001-Fix-drop-subcription-deadlock-with-create-databas.patch application/octet-stream 4.0 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Devrim Gündüz 2025-08-05 12:56:36 Re: BUG #19011: repomd.xml.asc on postgresql 16 empty
Previous Message Sajith Prabhakar Shetty 2025-08-05 05:31:36 Re: Postgres: Queries are too slow after upgrading to PG17 from PG15