Re: Logical decoding timeline following fails to handle records split across segments

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Álvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: Logical decoding timeline following fails to handle records split across segments
Date: 2016-05-04 10:00:06
Message-ID: CAMsr+YGnax8LeXwDOyxppjqe962hfJ=D3fGpTe7p=phx34GCiw@mail.gmail.com
Lists: pgsql-hackers

On 3 May 2016 at 22:03, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:

> Hi all
>
> There's a bug (mine) in logical decoding timeline following where reading
> the first page from the segment containing a timeline switch fails to read
> from the most recent timeline in that segment. This is harmless if the old
> timeline's copy of the segment is present - but if it's been renamed to
> .partial, deleted or never copied over to a replica then decoding will
> complain that the required segment has already been removed. Just like
> without timeline following.
>
> The underlying problem is that timeline calculations used the record's
> start pointer and didn't properly consider continuations; they were
> record-based, not page-based like they should be.
>
> A corrected and handily much, much simpler patch is attached. The logic
> for finding the last timeline on a segment was massively more complex than
> it needed to be, and that wasn't the only thing.
>
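
To illustrate the record-based vs. page-based distinction in the quoted
text, here is a minimal standalone sketch. It is not the attached patch and
does not use the real xlogreader API: the HistoryEntry type, the fixed 16MB
segment size and every function name below are hypothetical, chosen only
for illustration. It assumes the timeline to read a page from is the
timeline of the last LSN on the segment containing that page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;
typedef uint32_t TimelineId;

/* Hypothetical history entry: timeline tli covers [begin, end).
 * end == 0 means the timeline is still open (the current one). */
typedef struct
{
    TimelineId tli;
    Lsn        begin;
    Lsn        end;
} HistoryEntry;

/* Assumed segment size for the sketch (16MB). */
#define SEG_SIZE ((Lsn) 16 * 1024 * 1024)

/* Return the timeline that covers ptr, or 0 if none does. */
static TimelineId
tli_of_point(const HistoryEntry *h, size_t n, Lsn ptr)
{
    assert(h != NULL && n > 0);
    for (size_t i = 0; i < n; i++)
    {
        if (ptr >= h[i].begin && (h[i].end == 0 || ptr < h[i].end))
            return h[i].tli;
    }
    return 0;
}

/* Record-based choice: use the timeline at the record's start pointer.
 * This goes wrong for a record that continues onto a segment that
 * contains a timeline switch. */
static TimelineId
tli_record_based(const HistoryEntry *h, size_t n, Lsn record_start)
{
    return tli_of_point(h, n, record_start);
}

/* Page-based choice: use the timeline of the last LSN on the segment
 * that contains the page we are about to read. */
static TimelineId
tli_page_based(const HistoryEntry *h, size_t n, Lsn page_start)
{
    Lsn seg_end = (page_start / SEG_SIZE + 1) * SEG_SIZE - 1;

    return tli_of_point(h, n, seg_end);
}
```

With a switch from timeline 1 to timeline 2 a little way into the third
segment, a record that starts near the end of the second segment and
continues onto the first page of the third gets timeline 1 from the
record-based function. The page-based function picks timeline 2 for that
continuation page, so the old timeline's copy of the segment is not needed.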

For the record, the patch this fixes was reverted, as agreed in
http://www.postgresql.org/message-id/20160503165812.GA29604@alvherre.pgsql .

I will submit this patch for 9.7, along with the improvements to
pg_recvlogical and the expanded test suite.

I then expect to follow up with work to clean up the use of globals to pass
timeline info through xlogreader to the read-page callbacks and, hopefully,
with the hs protocol changes etc. needed for the improved slot failover
support mechanism that Petr, Andres and I discussed to work.

The patch as attached no longer applies on its own, but it's trivial to
apply on top of a cherry-picked copy of the reverted feature patch for
testing or further development.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
