Re: Race condition in TransactionIdIsInProgress

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>
Cc: Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Race condition in TransactionIdIsInProgress
Date: 2022-02-12 03:42:31
Message-ID: 20220212034231.6a564cwzwsbapu46@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-02-11 16:41:24 -0800, Andres Freund wrote:
> FWIW, I've indeed reproduced this fairly easily with such a setup. A pgbench
> r/w workload that's been modified to start 70 savepoints at the start shows
>
> pgbench: error: client 22 script 0 aborted in command 12 query 0: ERROR: t_xmin 3853739 is uncommitted in tuple (2,159) to be updated in table "pgbench_branches"
> pgbench: error: client 13 script 0 aborted in command 12 query 0: ERROR: t_xmin 3954305 is uncommitted in tuple (2,58) to be updated in table "pgbench_branches"
> pgbench: error: client 7 script 0 aborted in command 12 query 0: ERROR: t_xmin 4017908 is uncommitted in tuple (3,44) to be updated in table "pgbench_branches"
>
> after a few minutes of running with a local, not slowed down, syncrep. Without
> any other artificial slowdowns or such.

And this can easily be triggered even without subtransactions, in a completely
reliable way.

The only reason I'm so far not succeeding in turning it into an
isolationtester spec is that a transaction waiting for SyncRep doesn't count
as waiting for isolationtester.

Basically

S1: BEGIN; $xid = txid_current(); UPDATE; COMMIT; <commit wait for syncrep>
S2: SELECT pg_xact_status($xid);
S2: UPDATE;

suffices, because pg_xact_status() causes a clog fetch, priming the xid
cache, which then causes TransactionIdIsInProgress() to take the early
return path despite the transaction still being in progress. That in turn
allows the update to proceed, even though S1 has not "properly committed"
yet.
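
To make the sequence above easier to follow, here is a toy Python model of
the interaction. It is only a sketch and not PostgreSQL code: ToyServer,
ToyBackend and run_scenario are hypothetical names, and the model assumes
that S1's commit status is already recorded in the commit log while S1 is
still advertised as running during the syncrep wait.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

COMMITTED = "committed"
IN_PROGRESS = "in progress"


@dataclass
class ToyServer:
    """Hypothetical stand-in for shared state (not PostgreSQL structures)."""
    clog: Dict[int, str] = field(default_factory=dict)  # xid -> commit status
    running: Set[int] = field(default_factory=set)      # xids still advertised as running


@dataclass
class ToyBackend:
    """Hypothetical session with a single-entry, backend-local xid cache."""
    server: ToyServer
    cached_xid: Optional[int] = None

    def clog_fetch(self, xid: int) -> str:
        status = self.server.clog.get(xid, IN_PROGRESS)
        if status != IN_PROGRESS:
            # Only final states are cached; this is the "priming" step.
            self.cached_xid = xid
        return status

    def xact_status(self, xid: int) -> str:
        # Models the effect of SELECT pg_xact_status($xid) in S2.
        return self.clog_fetch(xid)

    def is_in_progress(self, xid: int) -> bool:
        # Early return: a cached xid is treated as known completed,
        # without consulting the list of running transactions.
        if self.cached_xid == xid:
            return False
        return xid in self.server.running


def run_scenario(prime_cache: bool) -> bool:
    """Return what S2 believes about S1's xid while S1 waits for syncrep."""
    server = ToyServer()
    xid = 1000  # arbitrary illustrative xid

    # S1: BEGIN; UPDATE; COMMIT; <commit wait for syncrep>
    server.running.add(xid)       # S1 is running
    server.clog[xid] = COMMITTED  # assumption: status recorded before the wait ends
    # S1 stays in server.running because the wait has not finished.

    s2 = ToyBackend(server)
    if prime_cache:
        s2.xact_status(xid)       # S2: SELECT pg_xact_status($xid);
    return s2.is_in_progress(xid)  # what S2's UPDATE would rely on


if __name__ == "__main__":
    print("without priming:", run_scenario(prime_cache=False))
    print("with priming:   ", run_scenario(prime_cache=True))
```

In this model the only difference between the two runs is the status lookup
in S2; once the cache holds the xid, the in-progress check never reaches the
list of running transactions.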

Greetings,

Andres Freund
