Re: [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId
Date: 2016-11-24 01:04:39
Message-ID: CAMsr+YEDXVMeD7vxtcs-HUTUB29eq=BAX+6ZHJTO7_tpvkjUuQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 24 November 2016 at 02:32, Andres Freund <andres(at)anarazel(dot)de> wrote:
> Hi,
>
> On 2016-11-23 20:58:22 +0800, Craig Ringer wrote:
>> Today I ran into an issue where commit timestamp lookups were failing with
>>
>> ERROR: cannot retrieve commit timestamp for transaction 2
>>
>> which is of course FrozenTransactionId.
>>
>> TransactionIdGetCommitTsData(...) ERRORs on !TransactionIdIsNormal(),
>> which I think is wrong. Attached is a patch to make it return 0 for
>> FrozenTransactionId and BootstrapTransactionId, like it does for xids
>> that are too old.
>
> Why? It seems quite correct to not allow lookups for special case
> values, as it seems sensible to give them special treatmeant at the call
> site?

It's surprising behaviour that doesn't make sense. Look at it this way:

- We do some work, generating rows that have commit timestamps
- TransactionIdGetCommitTsData() on those rows returns their cts fine
- The commit timestamp data ages out
- TransactionIdGetCommitTsData() returns 0 on these rows
- vacuum comes alone and freezes the rows, even though nothing's changed
- TransactionIdGetCommitTsData() suddenly ERRORs

Nothing has meaningfully changed on these rows. They have gone from
"old, committed, past the commit timestamp threshold" to "old,
commited, past the commit timestamp threshold, frozen".

It makes no sense to ERROR when vacuum gets around to freezing the
tuples, when we don't also ERROR when we pass the cts threshold.

ERRORing on BootstrapTransactionId is slightly more reasonable since
those rows can never have had a cts in the first place, but it's also
unnecessary since they're effectively "oldest always-committed xids".

Making it ERROR on FrozenTransactionId was a mistake and should be corrected.

>> IMO this should be back-patched to 9.6 and, without the TAP test part,
>> to 9.5.
>
> Why would we want to backpatch a behaviour change, where arguments for
> the current and proposed behaviour exists?

I don't think it's crucial since callers can just work around it, but
IMO the current behaviour is a design oversight that should be
corrected and can be safely and sensibly corrected. Nobody's going to
rely on FrozenTransactionId ERRORing.

I don't think a backpatch is crucial though; as you note, C-level
callers can work around the problem pretty simply, and that's just
what I've done in pglogical for existing versions. I just think it's
ugly, should be fixed, and is safe to fix.

It's slightly harder for SQL-level callers to work around since they
must hardcode a CASE that tests for xmin = XID '1' OR xmin = XID '2',
and it's much less reasonable to expect SQL level callers to deal with
this sort of mess with low level state.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2016-11-24 01:26:16 Re: Physical append-only tables
Previous Message Andrew Dunstan 2016-11-23 23:44:16 Re: patch: function xmltable