Re: Add timeline to partial WAL segments

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: David Steele <david(at)pgmasters(dot)net>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add timeline to partial WAL segments
Date: 2018-12-11 01:43:17
Message-ID: 20181211014317.GC1473@paquier.xyz
Lists: pgsql-hackers

On Mon, Dec 10, 2018 at 10:21:23AM -0500, David Steele wrote:
> We recommend that archive commands not overwrite an existing segment.
> Some backup tools will compare the contents and succeed if they are
> equal, but in this case that will still often fail because recycled WAL
> segments will have different bytes at the end on the primary and
> standby. The files may not even be logically the same because B may not
> have received all WAL from A.

This is not a new problem: the last, partial segment generated
post-promotion of a timeline needs to be archived. Since the
introduction of .partial within the segment name in 9.5, we also assume
that the OP would be smart enough to rename the segment to replay up to
the end of the past timeline if need be for PITR.

> However, there is still a race condition here. Since the
> 000000010000000100000001.partial is archived first the 00000002.history
> file might not make it to the archive before B crashes. In that case A
> will pick timeline 2 and still be stuck. However, I'm thinking it would
> be easy to teach pgarch_readyXlog() to return any .history files it
> finds first (in order, of course).

Still, the .ready file of the partial segment would be generated before
the history file, right? How does that help?

> Another option would be to immediately archive the first WAL segment on
> timeline 2 and forgo the .partial file entirely. In this case the
> archiver will archive the 00000002.history file before
> 000000020000000100000001 and we avoid the race condition above. That
> also means we could recover A and promote without a conflict on the
> .partial. Or we could recover A along timeline 2.

This breaks the definition of IsPartialXLogFileName() in
xlog_internal.h, and the current naming convention of using only dots as
field separators. Another, trickier problem is that this is
inconsistent with the way pg_receivewal.c behaves for non-completed
segments, which is a reason behind using .partial for the last partial
segment on the backend side as well. So this proposal makes things more
inconsistent.
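
To make the naming constraint concrete, here is a minimal standalone
sketch of the kind of check IsPartialXLogFileName() performs. It assumes
a segment name of 24 uppercase hexadecimal characters (8 each for
timeline, log and segment) followed directly by ".partial". The names
SEG_FNAME_LEN and is_partial_segment_name are hypothetical, and this is
not the code from xlog_internal.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed layout: 8 hex digits each for timeline, log and segment. */
#define SEG_FNAME_LEN 24

/*
 * Hypothetical helper: returns true only if fname is exactly
 * "<24 hex characters>.partial", with nothing in between.
 */
static bool
is_partial_segment_name(const char *fname)
{
    assert(fname != NULL);

    return strlen(fname) == SEG_FNAME_LEN + strlen(".partial") &&
           strspn(fname, "0123456789ABCDEF") == SEG_FNAME_LEN &&
           strcmp(fname + SEG_FNAME_LEN, ".partial") == 0;
}
```

With a check of this shape, 000000010000000100000001.partial is
accepted, while any name carrying an extra field between the segment
name and the suffix fails the length test.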

> I have attached a patch that adds the timeline to the .partial file.
> This passes check-world.
>
> I think we should consider back-patching some set of these changes since
> this causes real pain in current production HA configurations.
>
> Thoughts?

So you basically append the new timeline ID to the segment name which
still uses the old timeline ID in the first 8 characters of its name.
Logically I find this proposal weird, as the segment refers to contents
which are part of the past, and the backend is not going to use the
contents of this segment when jumping to a new timeline, but the
contents of the segment which has the same contents up to the point WAL
forked, with the name of the new timeline.

It seems to me that this is quite a change for a low-probability
problem, as this assumes that the promotion of two different servers
happens on exactly the same segment and that both would finish by
archiving the same last partial segment.

Putting aside this proposal, it would actually be nice to put in
xlog_internal.h a macro which is able to write a partial file name,
close to the same place where we check if the segment name refers to a
partial segment.
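
As a rough illustration of what such a macro could look like, here is a
standalone sketch. The names PartialSegmentFileName and
PARTIAL_FNAME_BUFLEN are hypothetical, and passing the log and segment
numbers directly as two 32-bit values is a simplification made for this
example, not the interface the backend uses.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed buffer size for this sketch; callers must provide at least this. */
#define PARTIAL_FNAME_BUFLEN 64

/*
 * Hypothetical macro: write "<tli><log><seg>.partial" into fname, each
 * field as 8 uppercase hexadecimal digits.  Timeline IDs start at 1, so
 * a zero timeline is treated as a caller error.
 */
#define PartialSegmentFileName(fname, tli, log, seg) \
    do { \
        assert((tli) > 0); \
        snprintf((fname), PARTIAL_FNAME_BUFLEN, "%08X%08X%08X.partial", \
                 (unsigned int) (tli), (unsigned int) (log), \
                 (unsigned int) (seg)); \
    } while (0)
```

Called with a timeline of 1, a log of 1 and a segment of 1, this writes
000000010000000100000001.partial, which is the form the partial-name
check expects.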

My 2c.
--
Michael
