Re: upgrades in row-level locks can deadlock

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Oleksii Kliukin <alexk(at)hintbits(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Chris Travers <chris(dot)travers(at)adjust(dot)com>
Subject: Re: upgrades in row-level locks can deadlock
Date: 2019-06-11 20:40:25
Message-ID: 20190611204025.GA10702@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello

On 2019-May-22, Oleksii Kliukin wrote:

> I have recently observed a deadlock on one of our production servers related
> to locking only a single row in a job table. There were two functions involved
> in the deadlock, the first one acquires a “for key share” lock on the row that
> represents the job it works on and subsequently updates it with the job’s end
> time (we need multiple jobs to be operating on a single row concurrently,
> that’s why there is a "for key share" lock). The other function starts by
> acquiring the “for update” lock on the job row and then performs actions that
> should not be run in parallel with other jobs.
>
> The deadlock can be easily reproduced with the following statements. The
> queries run against a table job (id integer primary key, name text) with a
> single row of (1,'a'))

Hmm, great find.

> X1: select id from job where name = 'a' for key share;
> Y: select id from job where name = 'a' for update; -- starts waiting for X1
> X2: select id from job where name = 'a' for key share;
> X1: update job set name = 'b' where id = 1;
> X2: update job set name = 'c' where id = 1; -- starts waiting for X1
> X1: rollback;
>
> At this point, Y is terminated by the deadlock detector:

Yeah, this seems undesirable in general terms. Here's a quick
isolationtester spec that reproduces the problem:

setup {
drop table if exists tlu_job;
create table tlu_job (id integer primary key, name text);
insert into tlu_job values (1, 'a');
}

teardown {
drop table tlu_job;
}

session "s1"
setup { begin; }
step "s1_keyshare" { select id from tlu_job where name = 'a' for key share; }
step "s1_update" { update tlu_job set name = 'b' where id = 1; }
step "s1_rollback" { rollback; }

session "s2"
setup { begin; }
step "s2_keyshare" { select id from tlu_job where name = 'a' for key share; }
step "s2_update" { update tlu_job set name = 'c' where id = 1; }
step "s2_commit" { commit; }

session "s3"
setup { begin; }
step "s3_forupd" { select id from tlu_job where name = 'a' for update; }
teardown { commit; }

# Alexey's permutation
permutation "s1_keyshare" "s3_forupd" "s2_keyshare" "s1_update" "s2_update" "s1_rollback" "s2_commit"

(X1 is s1, X2 is s2, Y is s3).

Permutations such as that one report a deadlock with the original code,
and does not report a deadlock after your proposed patch.

permutation "s1_keyshare" "s1_update" "s2_keyshare" "s3_forupd" "s2_update" "s1_rollback" "s2_commit"

But semantically, I wonder if your transactions are correct. If you
intend to modify the row in s1 and s2, shouldn't you be acquiring FOR NO
KEY UPDATE lock instead? I don't see how can s1 and s2 coexist
peacefully. Also, can your Y transaction use FOR NO KEY UPDATE instead
.. unless you intend to delete the tuple in that transaction?

I'm mulling over your proposed fix. I don't much like the idea that
DoesMultiXactIdConflict() returns that new boolean -- seems pretty
ad-hoc -- but I don't see any way to do better than that ... (If we get
down to details, DoesMultiXactIdConflict needn't initialize that
boolean: better let the callers do.)

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-06-11 20:41:39 Re: UCT (Re: pgsql: Update time zone data files to tzdata release 2019a.)
Previous Message David Rowley 2019-06-11 20:34:57 Re: using index or check in ALTER TABLE SET NOT NULL