Re: [PATCH] LockAcquireExtended improvement

From: "Jingxian Li" <aqktjcm(at)qq(dot)com>
To: robertmhaas <robertmhaas(at)gmail(dot)com>
Cc: andres <andres(at)anarazel(dot)de>, "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] LockAcquireExtended improvement
Date: 2024-02-01 07:16:35
Message-ID: tencent_1BB753C5F9D03D240E63EE8F4C639829E009@qq.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Robert,

Thank you for your advice. It is very helpful to me.

On 2024/1/16 3:07, Robert Haas wrote:
> Hello Jingxian Li!
>
> I agree with you that this behavior seems surprising. I don't think
> it's quite a bug, more of a limitation. However, I think it would be
> nice to fix it if we can find a good way to do that.
>
> On Wed, Nov 29, 2023 at 10:43 PM Jingxian Li <aqktjcm(at)qq(dot)com> wrote:
>> Transaction A already holds an n-mode lock on table test,
>> that is, there is no locks held conflicting with the n-mode lock on table test,
>> If then transaction A requests an m-mode lock on table test,
>> as n's confilctTab covers m, it can be concluded that
>> there are no locks conflicting with the requested m-mode lock.
> This algorithm seems correct to me, but I think Andres is right to be
> concerned about overhead. You're proposing to inject a call to
> CheckLocalLockConflictTabCover() into the main code path of
> LockAcquireExtended(), so practically every lock acquisition will pay
> the cost of that function. And that function is not particularly
> cheap: every call to LockHeldByMe is a hash table lookup. That sounds
> pretty painful. If we could incur the overhead of this only when we
> knew for certain that we would otherwise have to fail, that would be
> more palatable, but doing it on every non-fastpath heavyweight lock
> acquisition seems way too expensive.
>
> Even aside from overhead, the approach the patch takes doesn't seem
> quite right to me. As you noted, ProcSleep() has logic to jump the
> queue if adding ourselves at the end would inevitably result in
> deadlock, which is why your test case doesn't need to wait until
> deadlock_timeout for the lock acquisition to succeed. But because that
> logic happens in ProcSleep(), it's not reached in the NOWAIT case,
> which means that it doesn't help any more once NOWAIT is specified. I
> think that the right way to fix the problem would be to reach that
> check even in the NOWAIT case, which could be done either by hoisting
> some of the logic in ProcSleep() up into LockAcquireExtended(), or by
> pushing the nowait flag down into ProcSleep() so that we can fail only
> if we're definitely going to sleep. The former seems more elegant in
> theory, but the latter looks easier to implement, at least at first
> glance.
According to what you said, I resubmitted a patch which splits the ProcSleep
logic into two parts, the former is responsible for inserting self to
WaitQueue,
the latter is responsible for deadlock detection and processing, and the
former part is directly called by LockAcquireExtended before nowait fails.
In this way the nowait case can also benefit from adjusting the insertion
order of WaitQueue.
>
> But the patch as proposed instead invents a new way of making the test
> case work, not leveraging the existing logic and, I suspect, not
> matching the behavior in all cases.
>
> I also agree with Vignesh that a test case would be a good idea. It
> would need to be an isolation test, since the regular regression
> tester isn't powerful enough for this (at least, I don't see how to
> make it work).
>
A test case was also added in the dir src/test/isolation.

Jingxian Li

Attachment Content-Type Size
v2-0001-LockAcquireExtended-improvement.patch application/octet-stream 15.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amul Sul 2024-02-01 07:17:27 Re: Add system identifier to backup manifest
Previous Message Nikolay Shaplov 2024-02-01 06:58:32 Re: [PATCH] New [relation] option engine