Re: .ready and .done files considered harmful

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
Cc: Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Hannu Krosing <hannuk(at)google(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: .ready and .done files considered harmful
Date: 2021-08-18 14:23:34
Message-ID: CA+TgmoY4Tu_a=4hTX88BrWCP-OdnasAxkk2Em3c_gTLqhUy9pg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 17, 2021 at 4:19 PM Bossart, Nathan <bossartn(at)amazon(dot)com> wrote:
> Thinking further, I think the most important thing to ensure is that
> resetting the flag happens before we begin the directory scan.
> Consider the following scenario in which a timeline history file would
> potentially be lost:
>
> 1. Archiver completes directory scan.
> 2. A timeline history file is created and the flag is set.
> 3. Archiver resets the flag.

Dipesh says in his latest email that the archiver resets the flag just
before it begins a directory scan. If that's accurate, then I think
this sequence of events can't occur.

If there is a race condition here with setting the flag, then an
alternative design would be to use a counter - either a plain old
uint64 or perhaps pg_atomic_uint64 - and have the startup process
increment the counter when it wants to trigger a scan. In this design,
the archiver would never modify the counter itself, but just remember
the last value that it saw. If it later sees a different value it
knows that a full scan is required. I think this kind of system is
extremely robust against the general class of problems that you're
talking about here, but I'm not sure whether we need it, because I'm
not sure whether there is a race with just the bool.
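
To make the counter idea concrete, here is a rough sketch in plain C11. It
is not taken from any patch: the struct and function names
(ScanRequestCounter, ArchiverState, request_full_scan, full_scan_needed,
replay_scenario) are made up for illustration, and it uses <stdatomic.h>
rather than pg_atomic_uint64 so that it compiles on its own. The archiver
only ever reads the counter, and it records the value before it starts
scanning, so a request that arrives during or after the scan is still
visible on the next check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared state; in the server this would live in shared memory. */
typedef struct ScanRequestCounter
{
    _Atomic uint64_t requests;  /* incremented only by the startup process */
} ScanRequestCounter;

/* Hypothetical archiver-private state; never visible to other processes. */
typedef struct ArchiverState
{
    uint64_t last_seen;         /* counter value noted at the last check */
} ArchiverState;

/* Startup process: ask for a full directory scan. */
static void
request_full_scan(ScanRequestCounter *counter)
{
    atomic_fetch_add(&counter->requests, 1);
}

/*
 * Archiver: returns true if the counter changed since the last check.
 * The new value is remembered before the caller begins scanning, and the
 * shared counter itself is never written here.
 */
static bool
full_scan_needed(ScanRequestCounter *counter, ArchiverState *archiver)
{
    uint64_t current = atomic_load(&counter->requests);

    if (current == archiver->last_seen)
        return false;
    archiver->last_seen = current;
    return true;
}

/*
 * Replays the quoted scenario: a request arrives after the scan finishes
 * but before the archiver would have reset a bool flag.
 */
static bool
replay_scenario(void)
{
    ScanRequestCounter counter;
    ArchiverState archiver = {.last_seen = 0};

    atomic_init(&counter.requests, 0);
    assert(atomic_load(&counter.requests) == 0);

    request_full_scan(&counter);                          /* initial trigger */
    bool first = full_scan_needed(&counter, &archiver);   /* scan starts */
    /* 1. archiver completes the directory scan */
    request_full_scan(&counter);                          /* 2. history file created */
    /* 3. there is no flag to reset, so nothing is lost */
    bool second = full_scan_needed(&counter, &archiver);  /* request still seen */
    bool third = full_scan_needed(&counter, &archiver);   /* no new request */

    return first && second && !third;
}
```

In this sketch replay_scenario() returns true: the request from step 2 is
picked up by the next check, and a check with no intervening request
reports that no scan is needed.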

--
Robert Haas
EDB: http://www.enterprisedb.com
