From: | Achilleas Mantzios <a(dot)mantzios(at)cloud(dot)gatewaynet(dot)com> |
---|---|
To: | pgsql-general(at)lists(dot)postgresql(dot)org |
Cc: | Achilleas Mantzios <itdev(at)gatewaynet(dot)com> |
Subject: | Re: Strange deadlock with object/target of lock : transaction |
Date: | 2025-08-26 07:38:30 |
Message-ID: | aee6efad-aded-48cb-91e3-b6b502910c6e@cloud.gatewaynet.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 8/25/25 17:11, Achilleas Mantzios wrote:
>
> On 25/8/25 17:58, Adrian Klaver wrote:
>> On 8/25/25 07:40, Achilleas Mantzios wrote:
>>> On 8/20/25 14:59, Achilleas Mantzios wrote:
>>>
>>>> On 8/14/25 16:01, Achilleas Mantzios wrote:
>>>>
>>>>> Hi Adrian
>>>>>
>>>>> On 8/14/25 15:39, Adrian Klaver wrote:
>>>>>
>>>>>> On 8/14/25 00:07, Achilleas Mantzios wrote:
>>>>>>> Hi All
>>>>>>>
>>>>>>> We've been hit by a weird deadlock which it took me some days to
>>>>>>> isolate and replicate. It does not have to do with order of
>>>>>>> updates or any explicit TABLE-level locking, the objects/targets
>>>>>>> of the deadlock in question are transactions.
>>>>>>
>>>>> First off, I maybe wrong with the above conclusion, I noticed that
>>
>>> Hi I reproduced without the triggers, I understood the problem, I
>>> believe the system's behavior is the intended, I am sorry for the
>>> false alarm. The thing is that it takes >=3 transactions to happen .
>>> That was the tricky part, up to now in all cases of deadlocks we had
>>> two transactions involved, this one needed three or more.
>>
>> For folks that run across this thread what was the issue?
> Inconsistent order of updates. The two pieces of code , the update
> piece and the insert piece, used inconsistent order of updates.
> However this could not be manifested with one xaction of the
> update-type and one of the insert-type, there had to be more than one
> transactions of the update-type doing the same update (usually caused
> by users hitting the reload button after 1 or 2 seconds). I can easily
> prepare a test case, schema, data, commands for anyone interested.
Hi, trying to create a test case, I think I am up to something. I attach
the test table and data.
How to reproduce. We have the "insert" code and the "update" code. The
"update" code runs :
with tempqry as (select year,doc_no from test_deadlock where
delivered_at::date = current_date
and not sign_list order by
bank_name,management_company_name,beneficiary_name,year,doc_no )
UPDATE test_deadlock pf set sign_list = true FROM tempqry
WHERE pf.year = tempqry.year and pf.doc_no = tempqry.doc_no ;
The insert code runs :
select COALESCE(max(id),0)+1 FROM test_deadlock;
INSERT INTO test_deadlock (id,year, doc_no, inserted_at, bank_name,
management_company_name,beneficiary_name, currency, delivered_at,
sign_list) VALUES (333024,2025, 395303, '2025-08-26 01:00:00+03',
'CRRRRRRRRRRR', '0006-D', 'AUTAUTAUTAUTAUTAUTAUTAUTAUTAUT', 'EUR',
'2025-08-26 11:55:28.359485+03', false);
DO $$
DECLARE REC record;
BEGIN
FOR REC IN SELECT year,doc_no FROM test_deadlock WHERE NOT
sign_list ORDER BY
bank_name,management_company_name,beneficiary_name,year,doc_no LOOP
UPDATE test_deadlock SET currency=currency WHERE year=REC.year
AND doc_no=REC.doc_no;
END LOOP;
END$$
;
As you can notice, the order of the updates in both the "update" code
and the "insert" code is consistent (same order by
bank_name,management_company_name,beneficiary_name,year,doc_no) and one
would expect there would no deadlock .
Now I open 3 sessions , 1, 2 and 3 :
Session 1, running the "update" code :
amantzio(at)[local]/dynacom=# begin; select txid_current(),
pg_backend_pid() ; with tempqry as (select year,doc_no from
test_deadlock where delivered_at::date = current_date and not sign_lis
t order by
bank_name,management_company_name,beneficiary_name,year,doc_no ) UPDATE
test_deadlock pf set sign_list = true FROM tempqry WHERE pf.year =
tempqry.year and pf.doc_no = tempqry.d
oc_no ;
BEGIN
txid_current | pg_backend_pid
--------------+----------------
117290416 | 2336
(1 row)
UPDATE 6
amantzio(at)[local]/dynacom=*#
Session 2, like session 1, also "update" :
amantzio(at)[local]/dynacom=# begin; select txid_current(),
pg_backend_pid() ; with tempqry as (select year,doc_no from
test_deadlock where delivered_at::date = current_date and not sign_lis
t order by
bank_name,management_company_name,beneficiary_name,year,doc_no ) UPDATE
test_deadlock pf set sign_list = true FROM tempqry WHERE pf.year =
tempqry.year and pf.doc_no = tempqry.d
oc_no ;
BEGIN
txid_current | pg_backend_pid
--------------+----------------
117290418 | 2056
(1 row)
(waits for session 1's xaction to finish , as expected )
Session 3, running the "insert" code :
begin ; select txid_current() , pg_backend_pid() ; select
COALESCE(max(id),0)+1 FROM test_deadlock; INSERT INTO test_deadlock
(id,year, doc_no, inserted_at, ba
nk_name, management_company_name,beneficiary_name, currency,
delivered_at, sign_list) VALUES (333024,2025, 395303, '2025-08-26
01:00:00+03', 'CRRRRRRRRRRR', '0006-D', 'AUTAUTAUTAUTAUTAUTAU
TAUTAUTAUT', 'EUR', '2025-08-26 11:55:28.359485+03', false); DO
$$DECLARE REC record; BEGIN
FOR REC IN SELECT year,doc_no FROM test_deadlock WHERE NOT sign_list
ORDER BY bank_name,management_company_name,beneficiary_name,year,doc_no
LOOP
UPDATE test_deadlock SET currency=currency WHERE year=REC.year AND
doc_no=REC.doc_no;
END LOOP;
END$$
;
BEGIN
txid_current | pg_backend_pid
--------------+----------------
117290419 | 2039
(1 row)
?column?
----------
333023
(1 row)
INSERT 0 1
(waits ....)
Now I return to session 1 and give :
amantzio(at)[local]/dynacom=*# rollback ;
ROLLBACK
amantzio(at)[local]/dynacom=#
Immediately after , session 3 executes the anonymous block, but session 2 :
ERROR: deadlock detected
DETAIL: Process 2056 waits for ShareLock on transaction 117290419;
blocked by process 2039.
Process 2039 waits for ShareLock on transaction 117290418; blocked by
process 2056.
HINT: See server log for query details.
CONTEXT: while updating tuple (8,33) in relation "test_deadlock"
amantzio(at)[local]/dynacom=!#
If I update the "insert" code to use a CTE instead of the anonymous
block in session 3, while Session 1 and 2 commands ("update" code)
remain unchanged ,
session 1 (re-run as above)
session 2 (re-run as above)
session 3 :
postgres(at)[local]/dynacom=# begin ; select txid_current() ,
pg_backend_pid() ; select COALESCE(max(id),0)+1 FROM test_deadlock;
INSERT INTO test_deadlock (id,year, doc_no, inserted_at, ba
nk_name, management_company_name,beneficiary_name, currency,
delivered_at, sign_list) VALUES (333024,2025, 395303, '2025-08-26
01:00:00+03', 'CREDIT SUISSE AG', '0006-DYNACOM TANKERS MANAG
EMENT LTD', 'AUTUMN SHIPPING SERVICES LIMITED', 'EUR', '2025-08-26
11:55:28.359485+03', false); with tempqrybig as (select year,doc_no from
test_deadlock where not sign_list order by bank_
name,management_company_name,beneficiary_name,year,doc_no) UPDATE
test_deadlock pf set currency=currency FROM tempqrybig WHERE pf.year =
tempqrybig.year and pf.doc_no = tempqrybig.doc_no ;
BEGIN
txid_current | pg_backend_pid
--------------+----------------
117290454 | 2039
(1 row)
?column?
----------
333023
(1 row)
INSERT 0 1
(waits)
then I go to Session 1, give rollback, session 2 completes, then I
rollback session 2, session 3 completes and finally I rollback session
3, without any deadlock.
Another finding , if I keep the anonymous block , but add a primary key
on (year, doc_no)
alter table test_deadlock ADD CONSTRAINT test_deadlock_pk PRIMARY KEY
(year, doc_no);
and repeat the initial version of the 3 sessions, again the deadlock
goes away.
All the above with 18beta3 .
So, I am not so sure this should happen, since there is no obvious
inconsistent order of updates. You may reproduce or ask me to run again
the tests while observing pg_locks or anything else.
Thank you.
>
>
Attachment | Content-Type | Size |
---|---|---|
test_deadlock_dump.sql | application/sql | 25.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | xx Z | 2025-08-26 11:48:44 | How to configure client-side TLS ciphers for streaming replication? |
Previous Message | xx Z | 2025-08-26 07:27:40 | Feature request: A method to configure client-side TLS ciphers for streaming replication |