RE: The order of queues in row lock is changed (not FIFO)

From: "Ryo Yamaji (Fujitsu)" <yamaji(dot)ryo(at)fujitsu(dot)com>
To: 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: The order of queues in row lock is changed (not FIFO)
Date: 2023-03-07 01:48:21
Message-ID: TYAPR01MB60739D02E0BD37B54FE85A1D8AB79@TYAPR01MB6073.jpnprd01.prod.outlook.com
Lists: pgsql-hackers


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> I don't see a bug here, or at least I'm not willing to move the goalposts to where you want them to be.
> I believe that we do guarantee arrival-order locking of individual tuple versions. However, in the
> example you show, a single row is being updated over and over. So, initially we have a single "winner"
> transaction that got the tuple lock first and updated the row. When it commits, each other transaction
> serially comes off the wait queue for that tuple lock and discovers that it now needs a lock on a
> different tuple version than it has got.
> So it tries to get lock on whichever is the latest tuple version.
> That might still appear serial as far as the original 100 sessions go, because they were all queued on the
> same tuple lock to start with.
> But when the new sessions come in, they effectively line-jump because they will initially try to lock
> whichever tuple version is committed live at that instant, and thus they get ahead of whichever remain of
> the original 100 sessions for the lock on that tuple version (since those are all still blocked on some older
> tuple version, whose lock is held by whichever session is performing the next-to-commit update).

> I don't see any way to make that more stable that doesn't involve requiring sessions to take locks on
> already-dead-to-them tuples; which sure seems like a nonstarter, not least because we don't even have a
> way to find such tuples. The update chains only link forward not back.

Thank you for your reply.
When I ran this test, I observed the following two behaviors:
(1) The first 100 sessions are overtaken by the last 10.
(2) The order of the preceding 100 sessions changes.

(1) From the user's point of view, I was concerned that the lock order for the same row was not preserved.
However, as you pointed out, in most cases arrival order is guaranteed from the perspective of an individual tuple version.
I now understand the PostgreSQL architecture and that it needs to be used with this behavior in mind.
(2) This behavior is rare. Typically, the first session acquires AccessExclusiveLock on the tuple and then ShareLock on
the updater's transaction ID, while subsequent sessions wait for AccessExclusiveLock on the tuple. However, in the log
we observed multiple sessions that, bypassing the AccessExclusiveLock on the tuple, were waiting directly for ShareLock
on the transaction ID, and the order of the original 100 sessions changed as a result.

At first, I thought both (1) and (2) were bugs, but from your explanation I understand that (1) is not.
I would be grateful for your opinion on (2) as well.

I am sharing the following logs:
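For anyone who wants to reproduce the observation: the wait-queue lines in the logs are produced by log_lock_waits, which logs each waiter's queue once it has waited longer than deadlock_timeout (here lowered to roughly 10 ms, matching the "after 10.x ms" messages). The current waits can also be inspected directly with a generic pg_locks query (not part of the original test):

```sql
-- Log wait queues shortly after a lock wait begins.
SET log_lock_waits = on;
SET deadlock_timeout = '10ms';

-- Inspect tuple locks vs. transaction-ID locks while sessions are blocked.
SELECT locktype, relation::regclass, page, tuple, transactionid,
       pid, mode, granted
FROM pg_locks
WHERE locktype IN ('tuple', 'transactionid')
ORDER BY granted DESC, pid;
```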

[Log]
1. One session waits on ShareLock; the rest wait on AccessExclusiveLock

1-1. Only 1369555 waits on ShareLock for the transaction ID held by 1369547; the rest wait on
  AccessExclusiveLock for the tuple held by 1369555.
  This matches the pattern where no update has yet occurred on the tuple.
--------------------------------------------------------------
2022-10-26 01:20:08.881 EDT [1369555:19:0] LOG: process 1369555 still waiting for ShareLock on transaction 2501 after 10.072 ms
2022-10-26 01:20:08.881 EDT [1369555:20:0] DETAIL: Process holding the lock: 1369547. Wait queue: 1369555.

2022-10-26 01:21:58.918 EDT [1369898:17:0] LOG: process 1369898 acquired AccessExclusiveLock on tuple (1, 0) of relation 16546 of database 13779 after 10.321 ms
2022-10-26 01:21:58.918 EDT [1369898:18:0] DETAIL: Process holding the lock: 1369555. Wait queue: 1369558, 1369561, 1369564, 1369567, 1369570, 1369573, 1369576, ...
--------------------------------------------------------------

2. All processes wait on ShareLock

2-1. With 1369558 holding the lock on t1 (0, 4), the head of the wait queue is 1369561.
--------------------------------------------------------------
2022-10-26 01:22:27.230 EDT [1369623:46:2525] LOG: process 1369623 still waiting for ShareLock on transaction 2504 after 10.133 ms
2022-10-26 01:22:27.242 EDT [1369877:47:2604] DETAIL: Process holding the lock: 1369558. Wait queue: 1369561, 1369623, 1369626, ...
--------------------------------------------------------------

2-2. When 1369558 released its locks, 1369561, at the head of the wait queue, was expected to acquire
  the lock, but it was actually acquired by 1369787.
--------------------------------------------------------------
2022-10-26 01:22:28.237 EDT [1369623:63:2525] LOG: process 1369623 still waiting for ShareLock on transaction 2577 after 10.028 ms
2022-10-26 01:22:28.237 EDT [1369623:64:2525] DETAIL: Process holding the lock: 1369787. Wait queue: 1369623, 1369610, 1369614, 1369617, 1369620.
--------------------------------------------------------------

2-3. Confirming that 1369561 has been moved back in the queue.
--------------------------------------------------------------
2022-10-26 01:22:28.237 EDT [1369629:64:2527] DETAIL: Process holding the lock: 1369623. Wait queue: 1369629, 1369821, 1369644, ... 1369561, ...
--------------------------------------------------------------

Regards, ryo
