Re: Assert while autovacuum was executing

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Assert while autovacuum was executing
Date: 2023-06-23 08:34:15
Message-ID: CAA4eK1JQavtFgHtrS1UzuL4=OZdDZVBemjFCHEiShhVPaaA03w@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 22, 2023 at 9:16 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Jun 21, 2023 at 10:57 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> > As far as I can tell 72e78d831a as-is is just bogus. Unfortunately that likely
> > also means 3ba59ccc89 is not right.
> >
>
> Indeed. I was thinking of a fix but couldn't find one yet. One idea I
> am considering is to allow catalog table locks after page lock but I
> think apart from hacky that also won't work because we still need to
> remove the check added for page locks in the deadlock code path in
> commit 3ba59ccc89 and may need to do something for group locking.
>

I have thought further about this part. Even if we remove the changes
made by commit 72e78d831a (that is, remove the assertion for page locks
in LockAcquireExtended()) and remove the check added for page locks in
FindLockCycleRecurseMember() by commit 3ba59ccc89, I think it is still
okay to keep the change related to "allow page lock to conflict among
parallel group members" in LockCheckConflicts(). This is because locks
on catalog tables don't conflict among group members, so we shouldn't
see a deadlock among parallel group members. Let me try to explain
this with an example:

BEGIN;
LOCK TABLE pg_enum IN ACCESS EXCLUSIVE MODE;
gin_clean_pending_list() -- assume this function is executed by both
the leader and a parallel worker; it also requires a lock on pg_enum,
as shown by Andres in email [1]

Say the parallel worker acquires the page lock first. It will also get
the lock on pg_enum because of group locking, so the leader backend
will wait for the parallel worker to release the page lock. Eventually,
the parallel worker will release the page lock and the leader backend
can acquire it. So, we should still be okay with parallelism.
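
To make the sequence above concrete, here is a rough toy model of the
conflict rule being discussed. It is only a sketch under simplifying
assumptions and is not PostgreSQL code: every lock is treated as
exclusive, and Backend, ToyLockManager, run_scenario and the object
name "gin_metapage" are hypothetical names invented for illustration.
The single rule it encodes is that relation locks do not conflict
within a lock group, while page locks do.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Backend:
    name: str
    group_leader: str  # hypothetical: name of this backend's lock group leader


@dataclass
class ToyLockManager:
    # (kind, object) -> backends currently holding that lock.
    # Assumption: every lock is exclusive, which is far simpler than
    # the real lock mode conflict table.
    holders: dict = field(default_factory=dict)

    def can_acquire(self, backend, kind, obj):
        for holder in self.holders.get((kind, obj), []):
            if holder == backend:
                continue  # a backend never conflicts with itself
            same_group = holder.group_leader == backend.group_leader
            if same_group and kind != "page":
                continue  # group locking: no conflict inside the group
            return False  # page locks conflict even inside the group
        return True

    def acquire(self, backend, kind, obj):
        # Returns False where a real backend would have to wait.
        if not self.can_acquire(backend, kind, obj):
            return False
        self.holders.setdefault((kind, obj), []).append(backend)
        return True

    def release(self, backend, kind, obj):
        self.holders[(kind, obj)].remove(backend)


def run_scenario():
    lm = ToyLockManager()
    leader = Backend("leader", group_leader="leader")
    worker = Backend("worker", group_leader="leader")
    results = []
    # Leader locks pg_enum (the explicit LOCK in the example).
    results.append(lm.acquire(leader, "relation", "pg_enum"))
    # Worker takes the page lock first.
    results.append(lm.acquire(worker, "page", "gin_metapage"))
    # Worker also gets pg_enum because it is in the leader's lock group.
    results.append(lm.acquire(worker, "relation", "pg_enum"))
    # Leader must wait for the page lock held by the worker.
    results.append(lm.acquire(leader, "page", "gin_metapage"))
    # Worker finishes and releases; the leader can now proceed.
    lm.release(worker, "page", "gin_metapage")
    results.append(lm.acquire(leader, "page", "gin_metapage"))
    return results
```

In this model the worker never waits on anything held by the leader,
so the leader's wait for the page lock cannot become part of a cycle
within the group, which is the point of the argument above.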

OTOH, if the above theory is wrong or people are not convinced, I am
okay with removing all the changes in commits 72e78d831a and
3ba59ccc89.

[1] - https://www.postgresql.org/message-id/20230621052713.wc5377dyslxpckfj%40awork3.anarazel.de

--
With Regards,
Amit Kapila.
