Re: archive status ".ready" files may be created too early

From: "alvherre(at)alvh(dot)no-ip(dot)org" <alvherre(at)alvh(dot)no-ip(dot)org>
To: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "x4mmm(at)yandex-team(dot)ru" <x4mmm(at)yandex-team(dot)ru>, "a(dot)lubennikova(at)postgrespro(dot)ru" <a(dot)lubennikova(at)postgrespro(dot)ru>, "hlinnaka(at)iki(dot)fi" <hlinnaka(at)iki(dot)fi>, "matsumura(dot)ryo(at)fujitsu(dot)com" <matsumura(dot)ryo(at)fujitsu(dot)com>, "masao(dot)fujii(at)gmail(dot)com" <masao(dot)fujii(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: archive status ".ready" files may be created too early
Date: 2021-08-20 21:38:16
Message-ID: 202108202138.scohxwoyflcp@alvherre.pgsql
Lists: pgsql-hackers

On 2021-Aug-20, Bossart, Nathan wrote:

> > On Fri, Aug 20, 2021 at 1:29 PM Bossart, Nathan <bossartn(at)amazon(dot)com> wrote:

> >> This led me to revisit the two-element
> >> approach that was discussed upthread. What if we only stored the
> >> earliest and latest segment boundaries at any given time? Once the
> >> earliest boundary is added, it never changes until the segment is
> >> flushed and it is removed. The latest boundary, however, will be
> >> updated any time we register another segment. Once the earliest
> >> boundary is removed, we replace it with the latest boundary. This
> >> strategy could cause us to miss intermediate boundaries, but AFAICT
> >> the worst case scenario is that we hold off creating .ready files a
> >> bit longer than necessary.

> I've attached a patch to demonstrate what I'm thinking.
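
The two-element scheme quoted above can be sketched roughly as follows. This is not the attached patch: the type and function names (Boundary, BoundaryTracker, tracker_register, tracker_flushed) are hypothetical, plain integers stand in for XLogSegNo and XLogRecPtr, and locking is left out entirely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for XLogSegNo and XLogRecPtr. */
typedef uint64_t SegNo;
typedef uint64_t RecPtr;

/* One record that crosses the end of segment "seg" and ends at "end". */
typedef struct
{
    SegNo  seg;
    RecPtr end;
    bool   valid;
} Boundary;

/* Only two boundaries are remembered at any time. */
typedef struct
{
    Boundary earliest;
    Boundary latest;
} BoundaryTracker;

void
tracker_init(BoundaryTracker *t)
{
    t->earliest.valid = false;
    t->latest.valid = false;
}

/*
 * Register a new boundary. The earliest slot is filled once and then left
 * alone; every later registration overwrites the latest slot, so
 * intermediate boundaries are forgotten.
 */
void
tracker_register(BoundaryTracker *t, SegNo seg, RecPtr end)
{
    Boundary b = {seg, end, true};

    if (!t->earliest.valid)
        t->earliest = b;
    else
        t->latest = b;
}

/*
 * Report a flush up to "flushed". Returns true if at least one tracked
 * boundary is now fully flushed, and stores in *ready_upto the last segment
 * known to be safe. When the earliest boundary is consumed, the latest one
 * (possibly invalid) takes its place.
 */
bool
tracker_flushed(BoundaryTracker *t, RecPtr flushed, SegNo *ready_upto)
{
    bool advanced = false;

    while (t->earliest.valid && flushed >= t->earliest.end)
    {
        *ready_upto = t->earliest.seg;
        advanced = true;
        t->earliest = t->latest;
        t->latest.valid = false;
    }
    return advanced;
}
```

If boundaries for segments 1, 2 and 3 are registered in that order, the one for segment 2 is overwritten. Once segment 1's boundary is flushed, nothing more is reported until segment 3's boundary is flushed too, which is the "hold off a bit longer than necessary" case described in the quoted text.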

There is only one thing I didn't like in this new version, which is that
we're holding info_lck too much. I've seen info_lck contention be a
problem in some workloads and I'd rather not add more stuff to it. I'd
rather we stick with using a new lock object to protect all the data we
need for this job.

Should this new lock object be a spinlock or an lwlock? I think a
spinlock would generally be better because it has lower overhead, and we
would never take this lock in shared mode anywhere, which would otherwise
be the strongest argument for an lwlock. However, I think we avoid running
anything other than straight-line code while a spinlock is held, and we
have some function calls there.
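
One common way to square a spinlock with the straight-line-code rule is to copy the shared fields into locals while the lock is held and make any function calls only after releasing it. The sketch below illustrates that pattern only; it is not PostgreSQL code. A C11 atomic_flag stands in for slock_t, and all names (SharedBoundaries, boundaries_read, earliest_is_flushed) are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Shared state guarded by its own lock rather than by info_lck. */
typedef struct
{
    atomic_flag lock;           /* stand-in for a spinlock (slock_t) */
    uint64_t    earliest_end;   /* 0 means "no boundary tracked" */
    uint64_t    latest_end;
} SharedBoundaries;

/* Private copy taken while the lock is held. */
typedef struct
{
    uint64_t earliest_end;
    uint64_t latest_end;
} Snapshot;

static void
spin_acquire(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;                       /* busy-wait */
}

static void
spin_release(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

void
boundaries_init(SharedBoundaries *s)
{
    atomic_flag_clear(&s->lock);
    s->earliest_end = 0;
    s->latest_end = 0;
}

void
boundaries_set(SharedBoundaries *s, uint64_t earliest_end, uint64_t latest_end)
{
    spin_acquire(&s->lock);
    s->earliest_end = earliest_end;     /* plain assignments only */
    s->latest_end = latest_end;
    spin_release(&s->lock);
}

Snapshot
boundaries_read(SharedBoundaries *s)
{
    Snapshot snap;

    spin_acquire(&s->lock);
    snap.earliest_end = s->earliest_end;
    snap.latest_end = s->latest_end;
    spin_release(&s->lock);
    return snap;
}

/* Decide from the private copy; the lock is no longer held here. */
bool
earliest_is_flushed(SharedBoundaries *s, uint64_t flushed)
{
    Snapshot snap = boundaries_read(s);

    /* Function calls (file creation, logging) would belong here. */
    return snap.earliest_end != 0 && flushed >= snap.earliest_end;
}
```

Whether the function calls in the patch can be moved out of the critical section this way depends on the patch itself; if they cannot, that would weigh toward an lwlock.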

--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
