Re: archive status ".ready" files may be created too early

From: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "alvherre(at)alvh(dot)no-ip(dot)org" <alvherre(at)alvh(dot)no-ip(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "x4mmm(at)yandex-team(dot)ru" <x4mmm(at)yandex-team(dot)ru>, "a(dot)lubennikova(at)postgrespro(dot)ru" <a(dot)lubennikova(at)postgrespro(dot)ru>, "hlinnaka(at)iki(dot)fi" <hlinnaka(at)iki(dot)fi>, "matsumura(dot)ryo(at)fujitsu(dot)com" <matsumura(dot)ryo(at)fujitsu(dot)com>, "masao(dot)fujii(at)gmail(dot)com" <masao(dot)fujii(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: archive status ".ready" files may be created too early
Date: 2021-08-20 17:29:22
Message-ID: 4CB4C628-EBB8-4E7C-9255-A125A3EB7C2B@amazon.com
Lists: pgsql-hackers

On 8/20/21, 10:08 AM, "Robert Haas" <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Aug 20, 2021 at 12:36 PM Bossart, Nathan <bossartn(at)amazon(dot)com> wrote:
>> If a record spans multiple segments, we only register one segment
>> boundary. For example, if I insert a record that starts at segment
>> number 1 and stops at 10, I'll insert one segment boundary for segment
>> 10. We'll only create .ready files for segments 1 through 9 once this
>> record is completely flushed to disk.
>
> Oh ... OK. So is there any experimental scenario in which the hash
> table ends up with more than 1 entry? And if so, how does that happen?

I was able to do this by turning synchronous_commit off, increasing
wal_buffers substantially, and adding sleeps to XLogWrite().

>> If there isn't a way to ensure that the number of entries we need to
>> store is bounded, I'm tempted to propose my original patch [0], which
>> just moves .ready file creation to the very end of XLogWrite(). It's
>> probably not a complete solution, but it might be better than what's
>> there today.
>
> Doesn't that allocate memory inside a critical section? I would have
> thought it would cause an immediate assertion failure.

I could probably replace the list with two local variables (start and
end segments).

Thinking about this stuff further, I was wondering if one way to
handle the bounded shared hash table problem would be to replace the
latest boundary in the map whenever it was full. But at that point,
do we even need a hash table? This led me to revisit the two-element
approach that was discussed upthread. What if we only stored the
earliest and latest segment boundaries at any given time? Once the
earliest boundary is added, it never changes until the segment is
flushed and it is removed. The latest boundary, however, will be
updated any time we register another segment. Once the earliest
boundary is removed, we replace it with the latest boundary. This
strategy could cause us to miss intermediate boundaries, but AFAICT
the worst case scenario is that we hold off creating .ready files a
bit longer than necessary.
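
As a rough illustration of the two-element idea (not the actual patch), here is a minimal standalone sketch in C. All names in it (SegNo, RecPtr, BoundaryTracker, register_boundary, on_flush) are hypothetical stand-ins, not PostgreSQL identifiers. It assumes boundaries are registered in increasing WAL order, and that a boundary for segment N with record end E means segments up to N-1 may get .ready files once the flush position reaches E.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t SegNo;   /* hypothetical stand-in for a WAL segment number */
typedef uint64_t RecPtr;  /* hypothetical stand-in for a WAL position */

typedef struct SegBoundary
{
    bool    valid;
    SegNo   seg;        /* segment in which the cross-segment record ends */
    RecPtr  end_ptr;    /* end of that record */
} SegBoundary;

/* Only two slots, so the amount of state is fixed and bounded. */
typedef struct BoundaryTracker
{
    SegBoundary earliest;   /* never changes until it is flushed and removed */
    SegBoundary latest;     /* overwritten on every later registration */
} BoundaryTracker;

/* Register a boundary; assumes callers register in increasing WAL order. */
static void
register_boundary(BoundaryTracker *t, SegNo seg, RecPtr end_ptr)
{
    SegBoundary b = {true, seg, end_ptr};

    assert(seg > 0);
    if (!t->earliest.valid)
        t->earliest = b;
    else
        t->latest = b;      /* intermediate boundaries are dropped here */
}

/*
 * Called after WAL has been flushed up to 'flushed'.  Returns true and sets
 * *ready_upto to the highest segment that may now be marked ready.
 */
static bool
on_flush(BoundaryTracker *t, RecPtr flushed, SegNo *ready_upto)
{
    if (t->latest.valid && flushed >= t->latest.end_ptr)
    {
        /* Everything registered so far is on disk; clear both slots. */
        *ready_upto = t->latest.seg - 1;
        t->earliest.valid = false;
        t->latest.valid = false;
        return true;
    }
    if (t->earliest.valid && flushed >= t->earliest.end_ptr)
    {
        /* Remove the earliest boundary and promote the latest one. */
        *ready_upto = t->earliest.seg - 1;
        t->earliest = t->latest;
        t->latest.valid = false;
        return true;
    }
    return false;
}
```

For example, with boundaries registered for segments 3, 5, and 8 (record ends at 300, 500, and 800), a flush to 550 only releases segments up to 2, because the boundary for segment 5 was overwritten. Segments 3 and 4 then wait until the flush position reaches 800, which is the "hold off a bit longer than necessary" case.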

I'll work on a patch to illustrate what I'm thinking.

Nathan
