Re: pgsql: Avoid duplicate XIDs at recovery when building initial snapshot

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Avoid duplicate XIDs at recovery when building initial snapshot
Date: 2018-10-22 17:41:55
Message-ID: 20181022174155.wxxu2qt3bcrbrqy5@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

Hi,

On 2018-10-22 12:36:25 -0300, Alvaro Herrera wrote:
> On 2018-Oct-14, Andres Freund wrote:
>
> > On 2018-10-14 13:26:24 +0000, Michael Paquier wrote:
> > > Avoid duplicate XIDs at recovery when building initial snapshot
>
> > I'm unhappy this approach was taken over objections. Without a real
> > warning. Even leaving the crummyness aside, did you check other users
> > of XLOG_RUNNING_XACTS, e.g. logical decoding?
>
> Mumble. Is there a real problem here -- I mean, did this break logical
> decoding? Maybe we need some more tests to ensure this stuff works
> sanely for logical decoding.

Hm? My point is that this fix just puts a band-aid onto *one* of the
places that read a XLOG_RUNNING_XACTS. Which still leaves the contents
of WAL record corrupted. There's not even a note at the WAL-record's
definition or its logging denoting that the contents are not what you'd
expect. I don't mean that the fix would break logical decoding, but
that it's possible that an equivalent of the problem affecting hot
standby also affects logical decoding. And even leaving those two users
aside, it's possible that there will be further vulernable internal
users or extensions parsing the WAL.

> FWIW and not directly related, I recently became aware (because of a
> customer question) that txid_current_snapshot() is oblivious of
> XLOG_RUNNING_XACTS in a standby. AFAICS it's not a really serious
> concern, but it was surprising.

That's more fundamental than just XLOG_RUNNING_XACTS though, no? The
whole way running transactions (i.e. including those that are just
detected by looking at their xid) are handled in the known xids struct
and in snapshots seems incompatible with that, no?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Andres Freund 2018-10-22 18:04:35 Re: pgsql: Avoid duplicate XIDs at recovery when building initial snapshot
Previous Message Alvaro Herrera 2018-10-22 15:36:25 Re: pgsql: Avoid duplicate XIDs at recovery when building initial snapshot

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey Lepikhov 2018-10-22 17:54:19 Re: [PATCH] XLogReadRecord returns pointer to currently read page
Previous Message legrand legrand 2018-10-22 17:28:14 Multiple Wait Events for extensions