Quick Links

Is txid_status() actually safe? / What is 011_crash_recovery.pl testing?

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	pgsql-hackers(at)postgresql(dot)org, Craig Ringer <craig(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Is txid_status() actually safe? / What is 011_crash_recovery.pl testing?
Date:	2021-02-08 21:52:06
Message-ID:	20210208215206.mqmrnpkaqrdtm7fj@alap3.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi Craig, Robert,

The 011_crash_recovery.pl test test starts a transaction, creates a
table, fetches the transaction's xid. Then shuts down the server in
immediate mode. It then asserts that after crash recovery the previously
assigned xid is shown as aborted, and that new xids are assigned after
the xid.

But as far as I can tell there's no guarantee that that is the case.

It only happens to work because the test - for undocumented reasons -
creates the install with $node->init(allows_streaming => 1), which in
turn restricts shared_buffers to 1MB. Which forces the test to flush WAL
to disk during the CREATE TABLE.

I see failures in the test both when I increase the 1MB or when I change
the buffer replacement logic sufficiently.

E.g.
not ok 2 - new xid after restart is greater

# Failed test 'new xid after restart is greater'
# at t/011_crash_recovery.pl line 61.
# '511'
# >
# '511'
not ok 3 - xid is aborted after crash

Craig, it kind of looks to me like you assumed it'd be guaranteed that
the xid at this point would show in-progress?

I don't think the use of txid_status() described in the docs added in
the commit is actually ever safe?

commit 857ee8e391ff6654ef9dcc5dd8b658d7709d0a3c
Author: Robert Haas <rhaas(at)postgresql(dot)org>
Date: 2017-03-24 12:00:53 -0400

Add a txid_status function.

If your connection to the database server is lost while a COMMIT is
in progress, it may be difficult to figure out whether the COMMIT was
successful or not. This function will tell you, provided that you
don't wait too long to ask. It may be useful in other situations,
too.

Craig Ringer, reviewed by Simon Riggs and by me

Discussion: http://postgr.es/m/CAMsr+YHQiWNEi0daCTboS40T+V5s_+dst3PYv_8v2wNVH+Xx4g@mail.gmail.com

+ <para>
+ <function>txid_status(bigint)</> reports the commit status of a recent
+ transaction. Applications may use it to determine whether a transaction
+ committed or aborted when the application and database server become
+ disconnected while a <literal>COMMIT</literal> is in progress.
+ The status of a transaction will be reported as either
+ <literal>in progress</>,
+ <literal>committed</>, or <literal>aborted</>, provided that the
+ transaction is recent enough that the system retains the commit status
+ of that transaction. If is old enough that no references to that
+ transaction survive in the system and the commit status information has
+ been discarded, this function will return NULL. Note that prepared
+ transactions are reported as <literal>in progress</>; applications must
+ check <link
+ linkend="view-pg-prepared-xacts"><literal>pg_prepared_xacts</></> if they
+ need to determine whether the txid is a prepared transaction.
+ </para>

Until the commit *has completed*, nothing guarantees that anything
bearing the transaction's xid has made it to disk. And we surely don't
want to force a WAL flush when assigning a transaction id, right?

Greetings,

Andres Freund

Responses

Re: Is txid_status() actually safe? / What is 011_crash_recovery.pl testing? at 2021-02-09 20:28:26 from Robert Haas
Re: Is txid_status() actually safe? / What is 011_crash_recovery.pl testing? at 2021-05-05 15:15:53 from Craig Ringer

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	CK Tan	2021-02-08 22:09:51	Clean up code
Previous Message	Mark Dilger	2021-02-08 21:46:50	Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.