Re: pgsql: Avoid duplicate XIDs at recovery when building initial snapshot

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Avoid duplicate XIDs at recovery when building initial snapshot
Date: 2018-10-22 22:15:38
Message-ID: 20181022221538.d73yxdypbmmzmi2l@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On 2018-Oct-22, Andres Freund wrote:

> Hi,
>
> On 2018-10-22 12:36:25 -0300, Alvaro Herrera wrote:
> > On 2018-Oct-14, Andres Freund wrote:
> >
> > > On 2018-10-14 13:26:24 +0000, Michael Paquier wrote:
> > > > Avoid duplicate XIDs at recovery when building initial snapshot
> >
> > > I'm unhappy this approach was taken over objections. Without a real
> > > warning. Even leaving the crummyness aside, did you check other users
> > > of XLOG_RUNNING_XACTS, e.g. logical decoding?
> >
> > Mumble. Is there a real problem here -- I mean, did this break logical
> > decoding? Maybe we need some more tests to ensure this stuff works
> > sanely for logical decoding.
>
> Hm? My point is that this fix just puts a band-aid onto *one* of the
> places that read a XLOG_RUNNING_XACTS. Which still leaves the contents
> of WAL record corrupted. There's not even a note at the WAL-record's
> definition or its logging denoting that the contents are not what you'd
> expect. I don't mean that the fix would break logical decoding, but
> that it's possible that an equivalent of the problem affecting hot
> standby also affects logical decoding. And even leaving those two users
> aside, it's possible that there will be further vulernable internal
> users or extensions parsing the WAL.

Ah! I misinterpreted what you were saying. I agree we shouldn't let
the WAL message have wrong data. Of course we shouldn't just add a
code comment stating that the data is wrong :-)

> > FWIW and not directly related, I recently became aware (because of a
> > customer question) that txid_current_snapshot() is oblivious of
> > XLOG_RUNNING_XACTS in a standby. AFAICS it's not a really serious
> > concern, but it was surprising.
>
> That's more fundamental than just XLOG_RUNNING_XACTS though, no? The
> whole way running transactions (i.e. including those that are just
> detected by looking at their xid) are handled in the known xids struct
> and in snapshots seems incompatible with that, no?

hmm ... as I recall, txid_current_snapshot also only considers xacts by
xid, so read-only xacts are not considered -- seems to me that if you
think of snapshots in a general way, you're right, but for whatever you
want txid_current_snapshot for (and I don't quite remember what that is)
then it's not really important.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Michael Paquier 2018-10-23 01:43:38 Re: pgsql: Avoid duplicate XIDs at recovery when building initial snapshot
Previous Message Andres Freund 2018-10-22 18:04:35 Re: pgsql: Avoid duplicate XIDs at recovery when building initial snapshot

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2018-10-22 22:36:19 Re: Speeding up INSERTs and UPDATEs to partitioned tables
Previous Message David Rowley 2018-10-22 20:31:19 Re: Small performance tweak to run-time partition pruning