| From: | Oleksii Kliukin <alexk(at)hintbits(dot)com> | 
|---|---|
| To: | Álvaro Herrera <alvherre(at)2ndquadrant(dot)com> | 
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> | 
| Subject: | Re: upgrades in row-level locks can deadlock | 
| Date: | 2019-06-12 15:14:13 | 
| Message-ID: | E621393D-A213-41B0-8C9D-584F8EC9287F@hintbits.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Oleksii Kliukin <alexk(at)hintbits(dot)com> wrote:
> Hi,
> 
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> 
>> On 2019-May-22, Oleksii Kliukin wrote:
>> 
>>> X1: select id from job where name = 'a' for key share;
>>> Y: select id from job where name = 'a' for update; -- starts waiting for X1
>>> X2: select id from job where name = 'a' for key share;
>>> X1: update job set name = 'b' where id = 1;
>>> X2: update job set name = 'c' where id = 1; -- starts waiting for X1
>>> X1: rollback;
>>> 
>>> At this point, Y is terminated by the deadlock detector:
>> 
>> Yeah, this seems undesirable in general terms.  Here's a quick
>> isolationtester spec that reproduces the problem:
>> 
>> setup {
>> 	drop table if exists tlu_job;
>> 	create table tlu_job (id integer primary key, name text);
>> 	insert into tlu_job values (1, 'a');
>> }
>> 
>> teardown {
>> 	drop table tlu_job;
>> }
>> 
>> session "s1"
>> setup				{ begin; }
>> step "s1_keyshare"	{ select id from tlu_job where name = 'a' for key share; }
>> step "s1_update"	{ update tlu_job set name = 'b' where id = 1; }
>> step "s1_rollback"	{ rollback; }
>> 
>> session "s2"
>> setup				{ begin; }
>> step "s2_keyshare"	{ select id from tlu_job where name = 'a' for key share; }
>> step "s2_update"	{ update tlu_job set name = 'c' where id = 1; }
>> step "s2_commit"	{ commit; }
>> 
>> session "s3"
>> setup				{ begin; }
>> step "s3_forupd"	{ select id from tlu_job where name = 'a' for update; }
>> teardown			{ commit; }
> 
> Thank you! I can make it even simpler;  s1 just acquires for share lock, s3
> gets for update one and s2 takes for share lock first, and then tries to
> acquire for update one; once s1 finishes, s3 deadlocks.
> 
>> But semantically, I wonder if your transactions are correct.  If you
>> intend to modify the row in s1 and s2, shouldn't you be acquiring FOR NO
>> KEY UPDATE lock instead?  I don't see how can s1 and s2 coexist
>> peacefully.  Also, can your Y transaction use FOR NO KEY UPDATE instead
>> .. unless you intend to delete the tuple in that transaction?
> 
> It is correct. I wanted to make sure jobs that acquire for key share lock
> can run concurrently most of the time; they only execute one update at the
> end of the job, and it is just to update the last run timestamp.
> 
>> I'm mulling over your proposed fix.  I don't much like the idea that
>> DoesMultiXactIdConflict() returns that new boolean -- seems pretty
>> ad-hoc -- but I don't see any way to do better than that ...  (If we get
>> down to details, DoesMultiXactIdConflict needn't initialize that
>> boolean: better let the callers do.)
> 
> I am also not happy about the new parameter to DoesMultiXactIdConflict, but
> calling a separate function to fetch the presence of the current transaction
> in the multixact would mean doing the job of DoesMultiXactIdConflict twice.
> I am open to suggestions on how to make it nicer.
> 
> Attached is a slightly modified patch that avoids initializing
> has_current_xid inside DoesMultiXactIdConflict and should apply cleanly to
> the current master.
And here is the v3 that also includes the isolation test I described above
(quoting my previous message in full as I accidentally sent it off-list,
sorry about that).
Cheers,
Oleksii
| Attachment | Content-Type | Size | 
|---|---|---|
| deadlock_on_row_lock_upgrades_v3.diff | application/octet-stream | 8.4 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andrey Borodin | 2019-06-12 15:14:34 | Re: GiST limits on contrib/cube with dimension > 100? | 
| Previous Message | Tom Lane | 2019-06-12 15:06:47 | Re: Are there still non-ELF BSD systems? |