[PATCH] Logical decoding timeline following take II

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: [PATCH] Logical decoding timeline following take II
Date: 2016-09-01 04:08:49
Message-ID: CAMsr+YEQB3DbxmCUTTTX4RZy8J2uGrmb5+_ar_joFZNXa81Fug@mail.gmail.com
Lists: pgsql-hackers

Hi all

Attached is a rebased and updated logical decoding timeline following
patch for 10.0.

This is a pre-requisite for the pending work on logical decoding on
standby servers and simplified failover of logical decoding.

Restating the commit message:
__________

Follow timeline switches in logical decoding

When decoding from a logical slot, it's necessary for xlog reading to
be able to read xlog from historical (i.e. not current) timelines.
Otherwise decoding fails after failover to a physical replica because
the oldest still-needed archives are in the historical timeline.

Supporting logical decoding timeline following is a pre-requisite for
logical decoding on physical standby servers. It also makes it
possible to promote a replica with logical slots to a master and
replay from those slots, allowing logical decoding applications to
follow physical failover.

Logical slots cannot actually be created on a replica without use of
the low-level C slot management APIs so this is mostly foundation work
for subsequent changes to enable logical decoding on standbys.

This commit includes a module in src/test/modules with functions to
manipulate the slots (which is not otherwise possible in SQL code) in
order to enable testing, and a new test in src/test/recovery to ensure
that the behavior is as expected.

Note that an earlier version of logical decoding timeline following
was committed to 9.6 as 24c5f1a103ce, 3a3b309041b0, 82c83b337202, and
f07d18b6e94d. It was then reverted by c1543a81a7a8 just after 9.6
feature freeze when issues were discovered too late to safely fix them
in the 9.6 release cycle.

The prior approach failed to consider that a record could be split
across pages that are on different segments, where the new segment
contains the start of a new timeline. In that case the old segment
might be missing or renamed with a .partial suffix.

This patch reworks the logic to be page-based and in the process
simplifies how the last timeline for a segment is looked up.

Slot timeline following only works in a backend. Frontend support can
be added separately, where it could be useful for pg_xlogdump etc. once
support for timeline.c, List, etc. is added for frontend code.
__________
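
To illustrate the page-based lookup described in the commit message,
here is a minimal Python sketch. It is not code from the patch: the
names TimelineEntry, timeline_at and timeline_for_page are hypothetical,
and it assumes the default 16 MB segment and 8 kB WAL page sizes. The
assumption behind it is that the first segment of a new timeline carries
a copy of the WAL preceding the switch point within that segment, so a
page can be read from whichever timeline is in effect at the last byte
of its segment, without needing the old (possibly missing or .partial)
segment.

```python
from typing import List, NamedTuple, Optional

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed: default 16 MB segments
WAL_PAGE_SIZE = 8192                 # assumed: default 8 kB WAL pages


class TimelineEntry(NamedTuple):
    """One entry of a timeline history (hypothetical, simplified)."""
    tli: int
    begin: int           # first LSN on this timeline (inclusive)
    end: Optional[int]   # switch point (exclusive); None for the newest timeline


def timeline_at(history: List[TimelineEntry], lsn: int) -> int:
    """Return the timeline that was current at the given LSN."""
    for entry in history:
        if entry.begin <= lsn and (entry.end is None or lsn < entry.end):
            return entry.tli
    raise ValueError(f"LSN {lsn} is not covered by the timeline history")


def timeline_for_page(history: List[TimelineEntry], page_lsn: int) -> int:
    """Pick the timeline whose segment file should be used to read a page.

    Rather than asking which timeline the page itself was written on,
    look up the timeline in effect at the last byte of the segment that
    contains the page.
    """
    segment_start = page_lsn - page_lsn % WAL_SEGMENT_SIZE
    last_lsn_in_segment = segment_start + WAL_SEGMENT_SIZE - 1
    return timeline_at(history, last_lsn_in_segment)


if __name__ == "__main__":
    # Timeline 1 switches to timeline 2 partway through segment 3.
    switch = 3 * WAL_SEGMENT_SIZE + 5 * WAL_PAGE_SIZE + 100
    history = [TimelineEntry(1, 0, switch), TimelineEntry(2, switch, None)]

    # A page in segment 2 lies wholly on timeline 1.
    print(timeline_for_page(history, 2 * WAL_SEGMENT_SIZE))
    # A page in segment 3 before the switch point is read from timeline 2.
    print(timeline_for_page(history, 3 * WAL_SEGMENT_SIZE + 2 * WAL_PAGE_SIZE))
```

With this rule, a record that starts near the end of a segment on the
old timeline and continues onto the segment containing the switch point
has its continuation pages resolved to the new timeline.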

I'm hoping to find time to refactor timeline following so that we
avoid passing timeline information around the xlogreader using
globals, but that'd be a separate change that can be made after this.

I've omitted the --endpos changes for pg_recvlogical, which again can
be added separately.

The test harness code will become unnecessary when proper support for
logical failover or logical decoding on standby is added, so I'm not
really sure it should be committed.

Prior threads:

* https://www.postgresql.org/message-id/CAMsr+YG_1FU_-L8QWSk6oKFT4Jt8dpORy2RHXDyMy0B5ZfkpGA@mail.gmail.com

* https://www.postgresql.org/message-id/CAMsr+YH-C1-X_+s=2nzAPnR0wwqJa-rUmVHSYyZaNSn93MUBMQ@mail.gmail.com

* http://www.postgresql.org/message-id/20160503165812.GA29604@alvherre.pgsql

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Follow-timeline-switches-in-logical-decoding.patch text/x-patch 38.3 KB
