Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
Date: 2020-03-05 08:24:12
Message-ID: CAFiTN-sYhNOC1Bk7y3YZOG72QgJTcu-+VqMfs=1mwKVpAn=yDw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 5, 2020 at 12:15 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Mar 4, 2020 at 11:45 AM Mahendra Singh Thalor
> <mahi6run(at)gmail(dot)com> wrote:
> >
> > On Mon, 24 Feb 2020 at 15:39, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Feb 20, 2020 at 8:06 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > >
> > > > What I'm advocating is that extension locks should continue to go
> > > > through lock.c. And yes, that requires some changes to group locking,
> > > > but I still don't see why they'd be complicated.
> > > >
> > >
> > > Fair position, as per initial analysis, I think if we do below three
> > > things, it should work out without changing to a new way of locking
> > > for relation extension or page type locks.
> > > a. As per the discussion above, ensure in code we will never try to
> > > acquire another heavy-weight lock after acquiring relation extension
> > > or page type locks (probably by having Asserts in code or maybe some
> > > other way).
> > > b. Change lock.c so that group locking is not considered for these two
> > > lock types. For ex. in LockCheckConflicts, along with the check (if
> > > (proclock->groupLeader == MyProc && MyProc->lockGroupLeader == NULL)),
> > > we also check lock->tag and call it a conflict for these two locks.
> > > c. The deadlock detector can ignore checking these two types of locks
> > > because point (a) ensures that those won't lead to deadlock. One idea
> > > could be that FindLockCycleRecurseMember just ignores these two types
> > > of locks by checking the lock tag.
> >
> > Thanks Amit for summary.
> >
> > Based on above 3 points, here attaching 2 patches for review.
> >
> > 1. v01_0001-Conflict-EXTENTION-lock-in-group-member.patch (Patch by Dilip Kumar)
> > Basically this patch is for point b and c.
> >
> > 2. v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any.patch
> > (Patch by me)
> > This patch is for point a.
> >
> > After applying both the patches, make check-world is passing.
> >
> > We are testing both the patches and will post results.
> >
>
> I think we need to do detailed code review in the places where we are
> taking Relation Extension Lock and see whether we are acquiring
> another heavy-weight lock after that. It seems to me that in
> brin_getinsertbuffer, after acquiring Relation Extension Lock, we
> might again try to acquire the same lock. See
> brin_initialize_empty_new_buffer which is called after acquiring
> Relation Extension Lock, in that function, we call
> RecordPageWithFreeSpace and that can again try to acquire the same
> lock if it needs to perform fsm_extend. I think there will be similar
> instances in the code. I think it is fine if we again try to acquire
> it, but the current assertion in your patch needs to be adjusted for
> that.
>
> Few other minor comments on
> v01_0002-Added-assert-to-verify-that-we-never-try-to-take-any:
> 1. Ideally, this should be the first patch as we first need to ensure
> that we don't take any heavy-weight locks after acquiring a relation
> extension lock.
>
> 2. I think it is better to add an Assert after initial error checks
> (after RecoveryInProgress().. check)
>
> 3.
> + Assert (locallock->tag.lock.locktag_type != LOCKTAG_RELATION_EXTEND ||
> + locallock->nLocks == 0);
>
> I think it is not possible that we have an entry in
> LockMethodLocalHash and its value is zero. Do you see any such
> possibility, if not, then we might want to remove it?
>
> 4. We already have a macro for LOCALLOCK_LOCKMETHOD, can we write
> another one tag type? This will make the check look a bit cleaner and
> probably if we need to extend it in future for Page type locks, then
> also it will be good.
>
> 5. I have also tried to think of another way to check if we already
> hold lock type LOCKTAG_RELATION_EXTEND, but couldn't come up with a
> cheaper way than this. Basically, I think if we traverse the
> MyProc->myProcLocks queue, we will get this information, but that
> doesn't seem much cheaper than this.

I think we can maintain a flag (rel_extlock_held). And, we can set
that true in LockRelationForExtension,
ConditionalLockRelationForExtension functions and we can reset it in
UnlockRelationForExtension or in the error path e.g. LockReleaseAll.
I think, this way we will be able to elog and this will be much
cheaper.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2020-03-05 08:40:07 Re: PG_COLOR not mentioned in docs of vacuumlo, oid2name and pgbench
Previous Message Masahiko Sawada 2020-03-05 07:58:47 Re: Some problems of recovery conflict wait events