Re: .ready and .done files considered harmful

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: bossartn(at)amazon(dot)com
Cc: dipesh(dot)pandit(at)gmail(dot)com, robertmhaas(at)gmail(dot)com, jeevan(dot)ladhe(at)enterprisedb(dot)com, sfrost(at)snowman(dot)net, andres(at)anarazel(dot)de, hannuk(at)google(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: .ready and .done files considered harmful
Date: 2021-09-15 01:47:57
Message-ID: 20210915.104757.462650766526515461.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Tue, 14 Sep 2021 18:07:31 +0000, "Bossart, Nathan" <bossartn(at)amazon(dot)com> wrote in
> On 9/14/21, 9:18 AM, "Bossart, Nathan" <bossartn(at)amazon(dot)com> wrote:
> > This is an interesting idea, but the "else" block here seems prone to
> > race conditions. I think we'd have to hold arch_lck to prevent that.
> > But as I mentioned above, if we are okay with depending on the
> > fallback directory scans, I think we can remove the "else" block
> > completely.
>
> Thinking further, we probably need to hold a lock even when we are
> creating the .ready file to avoid race conditions.

The race condition surely happens, but even if that happens, all
competing processes except one of them detect out-of-order and will
enforce directory scan. But I'm not sure how it behaves under more
complex situation so I'm not sure I like that behavior.

We could just use another lock for the logic there, but instead
couldn't we merge PgArchGetLastReadySegNo and PgArchSetLastReadySegNo
into one atomic test-and-(check-and-)set function? Like this.

====
XLogFromFileName(xlog, &tli, &this_segno, wal_segment_size);
if (!PgArchReadySegIsInOrder(this_segno))
PgArchForceDirScan();

bool
PgArchReadySegIsInOrder(XLogSegNo this_segno)
{
bool in_order = true;

SpinLockAcquire(&PgArch->arch_lck);
if (PgArch->lastReadySegNo + 1 != this_segno)
in_order = false;
PgArch->lastReadySegNo = this_segno;
SpinLockRelease(&PgArch->arch_lck);

return in_order;
}
====

By the way, it seems to me that we only need to force directory scan
when notification seg number steps back. If this is correct, we can
reduce the number how many times we enforce directory scans.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2021-09-15 03:00:56 Re: prevent immature WAL streaming
Previous Message Alvaro Herrera 2021-09-15 01:32:04 Re: prevent immature WAL streaming