Re: Why standby restores some WALs many times from archive?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Sergey Burladyan <eshkinkot(at)gmail(dot)com>, Victor Yagofarov <xnasx(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why standby restores some WALs many times from archive?
Date: 2018-01-10 19:45:38
Message-ID: CAMkU=1wkV-Kp2XeWSWH5Kn=eUJt2di0vHAp=MGuOLbkGSyMK3A@mail.gmail.com
Lists: pgsql-hackers

On Sat, Dec 30, 2017 at 4:20 AM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

> On Sat, Dec 30, 2017 at 04:30:07AM +0300, Sergey Burladyan wrote:
> > We use this scripts:
> > https://github.com/avito-tech/dba-utils/tree/master/pg_archive
> >
> > But I can reproduce problem with simple cp & mv:
> > archive_command:
> > test ! -f /var/lib/postgresql/wals/%f && \
> > test ! -f /var/lib/postgresql/wals/%f.tmp && \
> > cp %p /var/lib/postgresql/wals/%f.tmp && \
> > mv /var/lib/postgresql/wals/%f.tmp /var/lib/postgresql/wals/%f
>
> This is unsafe. PostgreSQL expects the WAL segment archived to be
> flushed to disk once the archive command has returned its result to the
> backend. Don't be surprised if you get corrupted instances or that you
> are not able to recover up to a consistent point if you need to roll in
> a backup. Note that the documentation of PostgreSQL provides a simple
> example of archive command, which is itself bad enough not to use.
>

True, but that doesn't explain the current situation: the problem
reproduces without an OS-level crash, so a missing sync would not be
relevant. (And on some systems, mv'ing a file forces it to be synced under
some conditions, so it might be safe anyway.)
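
For reference, the durability concern raised in the quoted text can be sketched as follows. This is a minimal illustration, assuming a POSIX system and a Python helper invoked from archive_command; the function name archive_segment and its arguments are hypothetical and are not part of PostgreSQL or of the scripts linked above. It keeps the same copy-to-.tmp-then-rename pattern, but flushes the file and its directory before reporting success.

```python
import os
import shutil


def archive_segment(src_path, archive_dir, name):
    """Copy one WAL segment into archive_dir durably; return True on success.

    Hypothetical helper: src_path plays the role of %p and name the role of %f.
    """
    final_path = os.path.join(archive_dir, name)
    tmp_path = final_path + ".tmp"

    # Mirror the two `test ! -f` checks: never overwrite an existing file.
    if os.path.exists(final_path) or os.path.exists(tmp_path):
        return False

    # Copy under a temporary name, then force the contents to disk.
    with open(src_path, "rb") as src, open(tmp_path, "xb") as dst:
        shutil.copyfileobj(src, dst)
        dst.flush()
        os.fsync(dst.fileno())

    # Rename into place, then fsync the directory so the rename is durable too.
    os.rename(tmp_path, final_path)
    dir_fd = os.open(archive_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return True
```

A small wrapper script would call archive_segment with the two command-line arguments and exit non-zero when it returns False, so that the server retries the segment later. As noted above, none of this affects the repeated restores, which occur without any crash.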

I thought I'd seen something recently on the mailing lists or in the commit
log about an off-by-one error that causes it to re-fetch the previous file
rather than the current file if the previous file ends with just the right
type of record and amount of padding. But now I can't find it.
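
To illustrate the kind of boundary effect meant here (this is not the bug itself, which is not identified in this message), the sketch below shows how a WAL position that sits exactly on a segment boundary maps to two different files depending on whether the lookup uses the position or the byte just before it. It assumes the default 16 MB segment size; the function names are illustrative, and the arithmetic follows the usual segment-number and file-name layout.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed: default 16 MB segments


def segment_for(lsn):
    """Number of the segment that contains the byte at position lsn."""
    return lsn // WAL_SEGMENT_SIZE


def prev_segment_for(lsn):
    """Number of the segment that contains the byte just before lsn."""
    return (lsn - 1) // WAL_SEGMENT_SIZE


def wal_file_name(timeline, segno):
    """24-hex-digit WAL file name for a timeline and segment number."""
    segs_per_xlogid = 0x100000000 // WAL_SEGMENT_SIZE
    return "%08X%08X%08X" % (
        timeline,
        segno // segs_per_xlogid,
        segno % segs_per_xlogid,
    )


# A position exactly at the start of segment 2 on timeline 1.
boundary = 2 * WAL_SEGMENT_SIZE
print(wal_file_name(1, segment_for(boundary)))       # current file
print(wal_file_name(1, prev_segment_for(boundary)))  # previous file
```

Away from a boundary the two lookups agree, so a mix-up between them would only show up when a record ends exactly at the end of a segment.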

Cheers,

Jeff
