Timeline following for logical slots

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Álvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Timeline following for logical slots
Date: 2016-03-01 13:00:45
Message-ID: CAMsr+YH-C1-X_+s=2nzAPnR0wwqJa-rUmVHSYyZaNSn93MUBMQ@mail.gmail.com
Lists: pgsql-hackers

Hi all

Per discussion on the failover slots thread
(https://commitfest.postgresql.org/9/488/), I'm splitting timeline following
for logical slots into its own separate patch.

The attached patch fixes an issue I found while testing the prior revision.
That revision read WAL from segments on the old timeline up to the timeline
switch boundary, but this doesn't work if the last WAL segment on the old
timeline has been renamed to append the .partial suffix.

Instead, it's necessary to eagerly switch to reading the segment file that
belongs to the newest timeline present in that segment. We still read WAL
records from the correct timeline, since the partial WAL segment from the
old timeline gets copied to a new name on promotion; we just read them from
the newest copy of that segment, which is either complete and archived or
still being written to by the current timeline.
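
The selection rule described above can be sketched roughly as follows. This
is an illustrative Python sketch, not code from the patch: the function
names, the (tli, end_lsn) history representation, and the 16 MB segment size
are all assumptions made for the example. The idea is to look up the
timeline that owns the last byte of the segment containing the wanted LSN,
and to open that timeline's copy of the segment.

```python
# Illustrative sketch only; all names here are hypothetical, not from the patch.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumes the default 16 MB segment size


def timeline_of_lsn(history, lsn):
    """Return the timeline that owns `lsn`.

    `history` is a list of (tli, end_lsn) pairs ordered oldest first, where
    end_lsn is the LSN at which that timeline was switched away from, or
    None for the current (newest) timeline.
    """
    for tli, end_lsn in history:
        if end_lsn is None or lsn < end_lsn:
            return tli
    raise ValueError("LSN is beyond the end of the timeline history")


def timeline_to_read_from(history, lsn):
    """Pick the timeline whose copy of the segment holding `lsn` to open.

    The newest timeline present in a segment is the one that owns the
    segment's last byte, so look that up instead of the timeline of `lsn`.
    """
    segment_start = lsn - (lsn % WAL_SEGMENT_SIZE)
    last_byte = segment_start + WAL_SEGMENT_SIZE - 1
    return timeline_of_lsn(history, last_byte)
```

With a history of [(1, 0x3000060), (2, None)], a record at 0x3000000 belongs
to timeline 1 but would be read from timeline 2's copy of the segment, while
a record at 0x2000000 would still be read from timeline 1.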

For example, suppose the old master was on timeline 1 and writing to
000000010000000000000003 when it died, and we promote a streaming replica.
The replica copies 000000010000000000000003 to 000000020000000000000003 and
appends its recovery checkpoint to the copy. It also renames
000000010000000000000003 to 000000010000000000000003.partial, which means
the xlogreader won't find it. If we're reading the record at 0/3000000 then,
even though 0/3000000 is on timeline 1, we have to read it from the segment
on timeline 2.
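
The file names in this example follow from the usual WAL segment naming
scheme: the timeline ID, then the segment number split across two 8-digit
hex fields. Below is a small Python sketch of that mapping, assuming the
default 16 MB segment size. wal_file_name and parse_lsn are hypothetical
helper names used for illustration, not PostgreSQL functions.

```python
# Illustrative sketch only; helper names are hypothetical.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024                    # assumes 16 MB segments
SEGMENTS_PER_XLOGID = 0x100000000 // WAL_SEGMENT_SIZE  # 256 with 16 MB segments


def parse_lsn(text):
    """Convert an LSN written as 'hi/lo' in hex (e.g. '0/3000000') to an int."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_file_name(tli, lsn):
    """Return the name of the WAL segment file holding `lsn` on timeline `tli`."""
    segno = lsn // WAL_SEGMENT_SIZE
    return "%08X%08X%08X" % (
        tli,
        segno // SEGMENTS_PER_XLOGID,
        segno % SEGMENTS_PER_XLOGID,
    )


lsn = parse_lsn("0/3000000")
old_name = wal_file_name(1, lsn)  # "000000010000000000000003"
new_name = wal_file_name(2, lsn)  # "000000020000000000000003"
```

Only the leading timeline field differs between the two names, which is why
the same record can be found by opening the timeline 2 file even though it
was written on timeline 1.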

Fun, eh?

(I'm going to write a README.timelines to document some of this stuff soon,
since it has some pretty hairy corners and some of the code paths are a bit
special.)

I've written some initial TAP tests for timeline following that exploit the
fact that replication slots are preserved on a replica if the replica is
created with a filesystem level copy that includes pg_replslot, rather than
using pg_basebackup. They are not included here because they rely on TAP
support improvements (filesystem backup support, psql enhancements, etc)
that I'll submit separately, but they're how I found the .partial issue.

A subsequent patch can add testing of slot creation and advance on replicas
using a C test extension to prove that this approach can be used to achieve
practical logical failover for extensions.

I think this is ready to go as-is.

I don't want to hold it up waiting for test framework enhancements, unless
those can be committed fairly easily, because I think we need this in 9.6
and the tests demonstrate that it works when run separately.

See the link below for a git tree containing the timeline following patch,
the TAP enhancements, and the tests for timeline following.

https://github.com/2ndQuadrant/postgres/tree/dev/logical-decoding-timeline-following

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment: 0001-Allow-logical-slots-to-follow-timeline-switches.patch (text/x-patch, 20.3 KB)
