Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
Date: 2020-03-11 10:40:53
Message-ID: CAFiTN-t1sSz+_bO5XkeosG97JXa2ABgUeZ4JrkL03ZPoDjzS9g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 10, 2020 at 4:11 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Feb 24, 2020 at 3:38 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Thu, Feb 20, 2020 at 8:06 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > What I'm advocating is that extension locks should continue to go
> > > through lock.c. And yes, that requires some changes to group locking,
> > > but I still don't see why they'd be complicated.
> > >
> >
> > Fair position, as per initial analysis, I think if we do below three
> > things, it should work out without changing to a new way of locking
> > for relation extension or page type locks.
> > a. As per the discussion above, ensure in code we will never try to
> > acquire another heavy-weight lock after acquiring relation extension
> > or page type locks (probably by having Asserts in code or maybe some
> > other way).
>
> I have done an analysis of the relation extension lock (which can be
> acquired via LockRelationForExtension or
> ConditionalLockRelationForExtension) and found that we don't acquire
> any other heavyweight lock after acquiring it. However, we do
> sometimes try to acquire it again in the places where we update FSM
> after extension, see points (e) and (f) described below. The usage of
> this lock can be broadly divided into six categories and each one is
> explained as follows:
>
> a. Where after taking the relation extension lock we call ReadBuffer
> (or its variant) and then LockBuffer. The LockBuffer internally calls
> either LWLock to acquire or release neither of which acquire another
> heavy-weight lock. It is quite obvious as well that while taking some
> lightweight lock, there is no reason to acquire another heavyweight
> lock on any object. The specs/comments of ReadBufferExtended (which
> gets called from variants of ReadBuffer) API says that if the blknum
> requested is P_NEW, only one backend can call it at-a-time which
> indicates that we don't need to acquire any heavy-weight lock inside
> this API. Otherwise, also, this API won't need a heavy-weight lock to
> read the existing block into shared buffer as two different backends
> are allowed to read the same block. I have also gone through all the
> functions called/used in this path to ensure that we don't use
> heavy-weight locks inside it.
>
> The usage by APIs BloomNewBuffer, GinNewBuffer, gistNewBuffer,
> _bt_getbuf, and SpGistNewBuffer falls in this category. Another API
> that falls under this category is revmap_physical_extend which uses
> ReadBuffer, LocakBuffer and ReleaseBuffer. The ReleaseBuffer API
> unpins aka decrement the reference count for buffer and disassociates
> a buffer from the resource owner. None of that requires heavy-weight
> lock. T
>
> b. After taking relation extension lock, we call
> RelationGetNumberOfBlocks which primarily calls file-level functions
> to determine the size of the file. This doesn't acquire any other
> heavy-weight lock after relation extension lock.
>
> The usage by APIs ginvacuumcleanup, gistvacuumscan, btvacuumscan, and
> spgvacuumscan falls in this category.
>
> c. There is a usage in API brin_page_cleanup() where we just acquire
> and release the relation extension lock to avoid reinitializing the
> page. As there is no call in-between acquire and release, so there is
> no chance of another heavy-weight lock acquire after having relation
> extension lock.
>
> d. In fsm_extend() and vm_extend(), after acquiring relation extension
> lock, we perform various file-level operations like RelationOpenSmgr,
> smgrexists, smgrcreate, smgrnblocks, smgrextend. First, from theory,
> we don't have any heavy-weight lock other than relation extension lock
> which can cover such operations and then I have verified it by going
> through these APIs that these don't acquire any other heavy-weight
> lock. Then these APIs also call PageSetChecksumInplace computes a
> checksum of the page and sets the same in page header which is quite
> straight-forward and doesn't acquire any heavy-weight lock.
>
> In vm_extend, we additionally call CacheInvalidateSmgr to send a
> shared-inval message to force other backends to close any smgr
> references they may have for the relation for which we extending
> visibility map which has no reason to acquire any heavy-weight lock.
> I have checked the code path as well and I didn't find any
> heavy-weight lock call in that.
>
> e. In brin_getinsertbuffer, we call ReadBuffer() and LockBuffer(), the
> usage of which is the same as what is mentioned in (a). In addition
> to that it calls brin_initialize_empty_new_buffer() which further
> calls RecordPageWithFreeSpace which can again acquire relation
> extension lock for same relation. This usage is safe because we have
> a mechanism in heavy-weight lock manager that if we already hold a
> lock and a request came for the same lock and in same mode, the lock
> will be granted.
>
> f. In RelationGetBufferForTuple(), there are multiple APIs that get
> called and like (e), it can try to reacquire the relation extension
> lock in one of those APIs. The main APIs it calls after acquiring
> relation extension lock are described as follows:
> - GetPageWithFreeSpace: This tries to find a page in the given
> relation with at least the specified amount of free space. This
> mainly checks the FSM pages and in one of the paths might call
> fsm_extend which can again try to acquire the relation extension lock
> on the same relation.
> - RelationAddExtraBlocks: This adds multiple pages in a relation if
> there is contention around relation extension lock. This calls
> RelationExtensionLockWaiterCount which is mainly to check how many
> lockers are waiting for the same lock, then call ReadBufferBI which as
> explained above won't require heavy-weight locks and FSM APIs which
> can acquire Relation extension lock on the same relation, but that is
> safe as discussed previously.
>
> The Page locks can be acquired via LockPage and ConditionalLockPage.
> This is acquired from one place in the code during Gin index cleanup
> (ginInsertCleanup). The basic idea is that it will scan the pending
> list and move entries into the main index. While moving entries to
> the main page, it might need to add a new page that will require us to
> take a relation extension lock. Now, unlike relation extension lock,
> after acquiring page lock, we do acquire another heavy-weight lock
> (relation extension lock), but as we never acquire it in reverse
> order, this is safe.
>
> So, as per this analysis, we can add Asserts for relation extension
> and page locks which will indicate that they won't participate in
> deadlocks. It would be good if someone else can also do independent
> analysis and verify my findings.

I have also analyzed the usage for the RelationExtensioLock and the
PageLock. And, my findings are on the same lines.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andy Fan 2020-03-11 10:56:09 Re: Index Skip Scan
Previous Message Hugh McMaster 2020-03-11 10:37:01 Re: [PATCH] Use PKG_CHECK_MODULES to detect the libxml2 library