Re: pg_rewind docs correction

From: James Coleman <jtc331(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_rewind docs correction
Date: 2019-09-17 12:38:18
Message-ID: CAAaqYe9jUiw5ojLn1XDepNUJ1BcN60-nsXipN6FCgYK6w1K-0Q@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 17, 2019 at 3:51 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Sun, Sep 15, 2019 at 10:36:04AM -0400, James Coleman wrote:
> > On Sun, Sep 15, 2019 at 10:25 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> >> + The rewind operation is not expected to result in a consistent data
> >> + directory state either internally to the node or with respect to the rest
> >> + of the cluster. Instead the resulting data directory will only be consistent
> >> + after WAL replay has completed to at least the LSN at which changed blocks
> >> + copied from the source were originally written on the source.
> >>
> >> That's not necessarily true. pg_rewind enforces in the control file
> >> of the target the minimum consistency LSN to be
> >> pg_current_wal_insert_lsn() when using a live source or the last
> >> checkpoint LSN for a stopped source, so while that sounds true from
> >> the point of view of all the blocks copied, the control file may still
> >> cause a complaint that the recovering target has not reached its
> >> consistent point even if all the blocks are already at a position
> >> not-so-far from what has been registered in the control file.
> >
> > I could just say "after WAL replay has completed to a consistent state"?
>
> I still would not change this paragraph. The first sentence means
> that we have an equivalency, because that's the case if you think
> about it as we make sure that the target is able to sync with the
> source, and the target gets into a state where it has an on-disk state
> equivalent to the target up to the minimum consistency point defined
> in the control file once the tool has done its work (this last point
> is too precise to be included in a global description to be honest).
> And the second sentence makes clear what the actual diffs are.
> >> + the point at which the WAL timelines of the source and target diverged plus
> >> + the current state on the source of any blocks changed on the target after
> >> + that divergence. While only changed blocks from existing relation files are
> >>
> >> And here we could mention that all the blocks copied from the source
> >> are the ones which are found in the WAL records of the target until
> >> the end of WAL of its timeline. Still, that's basically what is
> >> mentioned in the first part of "How It Works", which explains things
> >> better. I honestly don't really see that all this paragraph is an
> >> improvement over the simplicity of the original when it comes to
> >> understand the global idea of what pg_rewind does.
> >
> > The problem with the original is that while simple, it's actually
> > incorrect in that simplicity. Pg_rewind does *not* result in the data
> > directory on the target matching the data directory on the source.
>
> That's not what I get from the original docs, but I may be too much
> used to it.

I don't agree that that's a valid equivalency. I myself spent a lot of
time trying to understand how this could possibly be true a while
back, and even looked at source code to be certain. I've asked other
people and found the same confusion.

As I read it, the second sentence doesn't actually tell you the
differences; it makes a quick attempt at summarizing *how* the first
sentence is true, but if the first sentence isn't accurate, then it's
hard to read the second one as helping.

If you'd prefer something less detailed at that point in the docs,
then I'd suggest something along the lines of "results in a data
directory state which can then be safely replayed from the source" or
some such.

The docs shouldn't be correct only for someone who already understands
the intricacies. And the end user shouldn't have to read the "how it
works" section (which incidentally is kinda hidden at the bottom
underneath the CLI args -- perhaps we could move that?) to extrapolate
things in the primary documentation.

James Coleman
