Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
Date: 2017-12-01 18:28:44
Message-ID: CA+TgmoYF-cdQneKUR5Oj=8vwD8GGv6WoCEf=TiPk6nvcKu7V6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 1, 2017 at 10:14 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> The patched is 10 times faster than current HEAD.

Nifty.

The first hunk in monitoring.sgml looks unnecessary.

The second hunk breaks the formatting of the documentation; you need
to adjust the "morerows" value from 9 to 8 here:

<entry morerows="9"><literal>Lock</literal></entry>

And similarly make this one 18:

<entry morerows="17"><literal>IPC</literal></entry>

+* Relation extension locks. The relation extension lock manager is
+specialized in relation extensions. In PostgreSQL 11 relation extension
+lock has been moved out of regular lock. It's similar to regular locks
+but doesn't have full dead lock detection, group locking and multiple
+lock modes. When conflicts occur, lock waits are implemented using
+condition variables.

Higher up, it says that "Postgres uses four types of interprocess
locks", but because you added this, it's now a list of five items.

I suggest moving the section on relation locks to the end and
rewriting the text here as follows: Only one process can extend a
relation at a time; we use a specialized lock manager for this
purpose, which is much simpler than the regular lock manager. It is
similar to the lightweight lock mechanism, but is ever simpler because
there is only one lock mode and only one lock can be taken at a time.
A process holding a relation extension lock is interruptible, unlike a
process holding an LWLock.

+#define RelExtLockTargetTagToIndex(relextlock_tag) \
+ (tag_hash((const void *) relextlock_tag, sizeof(RelExtLockTag)) \
+ % N_RELEXTLOCK_ENTS)

How about using a static inline function for this?

+#define SET_RELEXTLOCK_TAG(locktag, d, r) \
+ ((locktag).dbid = (d), \
+ (locktag).relid = (r))

How about getting rid of this and just doing the assignments instead?

+#define RELEXT_VAL_LOCK ((uint32) ((1 << 25)))
+#define RELEXT_LOCK_MASK ((uint32) ((1 << 25)))

It seems confusing to have two macros for the same value and an
almost-interchangeable purpose. Maybe just call it RELEXT_LOCK_BIT?

+RelationExtensionLockWaiterCount(Relation relation)

Hmm. This is sort of problematic, because with then new design we
have no guarantee that the return value is actually accurate. I don't
think that's a functional problem, but the optics aren't great.

+ if (held_relextlock.nLocks > 0)
+ {
+ RelExtLockRelease(held_relextlock.relid, true);
+ }

Excess braces.

+int
+RelExtLockHoldingLockCount(void)
+{
+ return held_relextlock.nLocks;
+}

Maybe IsAnyRelationExtensionLockHeld(), returning bool?

+ /* If the lock is held by me, no need to wait */

If we already hold the lock, no need to wait.

+ * Luckily if we're trying to acquire the same lock as what we
+ * had held just before, we don't need to get the entry from the
+ * array by hashing.

We're not trying to acquire a lock here. "If the last relation
extension lock we touched is the same one for which we now need to
wait, we can use our cached pointer to the lock instead of recomputing
it."

+ registered_wait_list = true;

Isn't it really registered_wait_count? The only list here is
encapsulated in the CV.

+ /* Before retuning, decrement the wait count if we had been waiting */

returning -> returning, but I'd rewrite this as "Release any wait
count we hold."

+ * Acquire the relation extension lock. If we're trying to acquire the same
+ * lock as what already held, we just increment nLock locally and return
+ * without touching the RelExtLock array.

"Acquire a relation extension lock." I think you can forget the rest
of this; it duplicates comments in the function body.

+ * Since we don't support dead lock detection for relation extension
+ * lock and don't control the order of lock acquisition, it cannot not
+ * happen that trying to take a new lock while holding an another lock.

Since we don't do deadlock detection, caller must not try to take a
new relation extension lock while already holding them.

+ if (relid == held_relextlock.relid)
+ {
+ held_relextlock.nLocks++;
+ return true;
+ }
+ else
+ elog(ERROR,
+ "cannot acquire relation extension locks for
multiple relations at the same");

I'd prefer if (relid != held_relextlock.relid) elog(ERROR, ...) to
save a level of indentation for the rest.

+ * If we're trying to acquire the same lock as what we just released
+ * we don't need to get the entry from the array by hashing. we expect
+ * to happen this case because it's a common case in acquisition of
+ * relation extension locks.

"If the last relation extension lock we touched is the same one for we
now need to acquire, we can use our cached pointer to the lock instead
of recomputing it. This is likely to be a common case in practice."

+ /* Could not got the lock, return iff in conditional locking */

"locking conditionally"

+ ConditionVariableSleep(&(relextlock->cv),
WAIT_EVENT_RELATION_EXTENSION_LOCK);

Break line at comma

+ /* Decrement wait count if we had been waiting */

"Release any wait count we hold."

+ /* Always return true if not conditional lock */

"We got the lock!"

+ /* If force releasing, release all locks we're holding */
+ if (force)
+ held_relextlock.nLocks = 0;
+ else
+ held_relextlock.nLocks--;
+
+ Assert(held_relextlock.nLocks >= 0);
+
+ /* Return if we're still holding the lock even after computation */
+ if (held_relextlock.nLocks > 0)
+ return;

I thought you were going to have the caller adjust nLocks?

+ /* Get RelExtLock entry from the array */
+ SET_RELEXTLOCK_TAG(tag, MyDatabaseId, relid);
+ relextlock = &RelExtLockArray[RelExtLockTargetTagToIndex(&tag)];

This seems to make no sense in RelExtLockRelease -- isn't the cache
guaranteed valid?

+ /* Wake up waiters if there is someone looking at this lock */

"If there may be waiters, wake them up."

+ * We allow to take a relation extension lock after took a
+ * heavy-weight lock. However, since we don't have dead lock
+ * detection mechanism between heavy-weight lock and relation
+ * extension lock it's not allowed taking an another heavy-weight
+ * lock while holding a relation extension lock.

"Relation extension locks don't participate in deadlock detection, so
make sure we don't try to acquire a heavyweight lock while holding
one."

+ /* Release relation extension locks */

"If we hold a relation extension lock, release it."

+/* Number of partitions the shared relation extension lock tables are
divided into */
+#define LOG2_NUM_RELEXTLOCK_PARTITIONS 4
+#define NUM_RELEXTLOCK_PARTITIONS (1 << LOG2_NUM_RELEXTLOCK_PARTITIONS)

Dead code.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2017-12-01 18:34:27 Re: Bitmap scan is undercosted?
Previous Message Tom Lane 2017-12-01 18:15:51 Re: Transform for pl/perl