Re: archive status ".ready" files may be created too early

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: bossartn(at)amazon(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: archive status ".ready" files may be created too early
Date: 2019-12-13 21:33:44
Message-ID: 20191213213344.GA14295@alvherre.pgsql
Lists: pgsql-hackers

On 2019-Dec-13, Kyotaro Horiguchi wrote:

> At Thu, 12 Dec 2019 22:50:20 +0000, "Bossart, Nathan" <bossartn(at)amazon(dot)com> wrote in

> > The crux of the issue seems to be that XLogWrite() does not wait for
> > the entire record to be written to disk before creating the ".ready"
> > file. Instead, it just waits for the last page of the segment to be
> > written before notifying the archiver. If PostgreSQL crashes before
> > it is able to write the rest of the record, it will end up reusing the
> > ".ready" segment at the end of crash recovery. In the meantime, the
> > archiver process may have already processed the old version of the
> > segment.
>
> Yeah, that can happen if the server is restarted after the crash.

... which is the normal way to run things, no?
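
As a toy illustration of the timing problem described in the quoted report, the sketch below contrasts two rules for deciding when a segment may be marked ".ready". It is not PostgreSQL code: the function names, the tiny SEG_SIZE, and the "boundary_record_end" parameter are all hypothetical, and the stricter rule is only one conceivable approach, not a proposed patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy segment size in bytes (assumption for illustration only;
 * real WAL segments default to 16 MB). */
#define SEG_SIZE 16u

/* Byte position just past the end of segment `segno`. */
static uint64_t seg_end(uint64_t segno)
{
    return (segno + 1) * SEG_SIZE;
}

/* Rule as described in the report: the segment is considered ready
 * as soon as its last page has been written. */
static bool ready_when_segment_written(uint64_t segno, uint64_t written_upto)
{
    return written_upto >= seg_end(segno);
}

/* Hypothetical stricter rule: additionally require that the record
 * crossing the segment boundary has been written in full.
 * `boundary_record_end` is the end position of that record, or
 * seg_end(segno) if no record crosses the boundary. */
static bool ready_when_record_complete(uint64_t segno,
                                       uint64_t written_upto,
                                       uint64_t boundary_record_end)
{
    return written_upto >= seg_end(segno) &&
           written_upto >= boundary_record_end;
}
```

With segment 0 fully written (written_upto = 16) and a record that starts in segment 0 but ends at byte 20 in segment 1, the first rule reports the segment as ready while the second does not. A crash at that point is the window in which the archiver could pick up a segment that is later reused with different contents.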

> > servers after the primary server has crashed because it ran out of
> > disk space.
>
> In the first place, it's quite bad to set restart_after_crash to on,
> or to just restart a crashed master in a replication setup.

Why is it bad? It's the default value.

> The standby can be inconsistent at the time of the master crash, so it
> should be fixed using pg_rewind or should be recreated from a base
> backup.

Surely the master will just come up and replay its WAL, and there should
be no inconsistency.

You seem to be thinking that a standby is promoted immediately on crash
of the master, but this is not a given.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
