Re: The order of queues in row lock is changed (not FIFO)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Ryo Yamaji (Fujitsu)" <yamaji(dot)ryo(at)fujitsu(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: The order of queues in row lock is changed (not FIFO)
Date: 2023-04-22 10:29:38
Message-ID: CAA4eK1L5oeBsCkeUe--XREBpt0ZhLH31ak-5b=M53QFLfirGXw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 7, 2023 at 4:49 PM Ryo Yamaji (Fujitsu)
<yamaji(dot)ryo(at)fujitsu(dot)com> wrote:
>
> From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> > I don't see a bug here, or at least I'm not willing to move the goalposts to where you want them to be.
> > I believe that we do guarantee arrival-order locking of individual tuple versions. However, in the
> > example you show, a single row is being updated over and over. So, initially we have a single "winner"
> > transaction that got the tuple lock first and updated the row. When it commits, each other transaction
> > serially comes off the wait queue for that tuple lock and discovers that it now needs a lock on a
> > different tuple version than it has got.
> > So it tries to get lock on whichever is the latest tuple version.
> > That might still appear serial as far as the original 100 sessions go, because they were all queued on the
> > same tuple lock to start with.
> > But when the new sessions come in, they effectively line-jump because they will initially try to lock
> > whichever tuple version is committed live at that instant, and thus they get ahead of whichever remain of
> > the original 100 sessions for the lock on that tuple version (since those are all still blocked on some older
> > tuple version, whose lock is held by whichever session is performing the next-to-commit update).
>
> > I don't see any way to make that more stable that doesn't involve requiring sessions to take locks on
> > already-dead-to-them tuples; which sure seems like a nonstarter, not least because we don't even have a
> > way to find such tuples. The update chains only link forward not back.
>
> Thank you for your reply.
> When I was doing this test, I confirmed the following two actions.
> (1) The first 100 sessions are overtaken by the last 10.
> (2) The order of the first 100 sessions changes.
>
> (1) I was concerned from the user's point of view that the lock order for the same tuple was not preserved.
> However, as you pointed out, in many cases the order of arrival is guaranteed from the perspective of the tuple.
> I understand that users need to understand the PostgreSQL architecture and use it accordingly.
>
> (2) This behavior is rare. Typically, the first session gets AccessExclusiveLock on the tuple and ShareLock on the
> transaction ID. Subsequent sessions wait for AccessExclusiveLock on the tuple. However, in the log we saw
> sessions skip the AccessExclusiveLock on the tuple, and observed multiple sessions waiting for ShareLock on the
> transaction ID. The log shows that this is what changed the order of the original 100 sessions.
>

I think for (2), the test is hitting the case of walking the update
chain via heap_lock_updated_tuple() where we don't acquire the lock on
the tuple. See comments atop heap_lock_updated_tuple(). You can verify
if that is the case by adding some DEBUG logs in that function.

> At first, I thought both (1) and (2) were problems. However, I understood from your explanation that (1) is not a bug.
> I would be grateful if you could also give me your opinion on (2).
>

If my observation above is correct, then it is not a bug: it is
behaving as per the current design.
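
To make the line-jumping that Tom describes for (1) easier to picture,
here is a toy model in Python. It is only a sketch and not how the lock
manager is implemented: the function name simulate() and its parameters
are made up for illustration, and it assumes that a session already
queued on the newest tuple version is always served before a session
that must first come off the wait queue of an older version.

```python
from collections import deque


def simulate(original, late, arrive_during):
    """Return the order in which sessions get to update the row.

    Toy model only (hypothetical, not PostgreSQL code):
    original      -- sessions that all queued on the first tuple version
    late          -- sessions that arrive later and queue on whichever
                     version is committed live at that moment
    arrive_during -- 1-based index of the update that is in progress
                     when the late sessions arrive
    """
    if not 1 <= arrive_during <= len(original):
        raise ValueError("arrive_during must be between 1 and len(original)")

    old_queue = deque(original)  # FIFO queue on the old tuple version
    new_queue = deque()          # FIFO queue on the newest committed version
    order = []

    while old_queue or new_queue:
        # Assumption: waiters on the newest version go first, because
        # waiters on the old version still have to wake up and re-lock.
        holder = new_queue.popleft() if new_queue else old_queue.popleft()
        order.append(holder)

        # The late sessions show up while this holder is still updating.
        if len(order) == arrive_during:
            new_queue.extend(late)

    return order


if __name__ == "__main__":
    print(simulate(["A1", "A2", "A3", "A4"], ["B1", "B2"], 2))
    # prints ['A1', 'A2', 'B1', 'B2', 'A3', 'A4']
```

Each queue is FIFO on its own, yet B1 and B2 end up ahead of A3 and A4,
which is the overtaking reported in (1).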

--
With Regards,
Amit Kapila.
