Re: archive status ".ready" files may be created too early

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: bossartn(at)amazon(dot)com
Cc: a(dot)lubennikova(at)postgrespro(dot)ru, hlinnaka(at)iki(dot)fi, matsumura(dot)ryo(at)fujitsu(dot)com, masao(dot)fujii(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: archive status ".ready" files may be created too early
Date: 2020-12-18 05:14:49
Message-ID: 20201218.141449.1545635102536215724.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Thu, 17 Dec 2020 22:20:35 +0000, "Bossart, Nathan" <bossartn(at)amazon(dot)com> wrote in
> On 12/15/20, 2:33 AM, "Kyotaro Horiguchi" <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > You're right in that regard. There's a window where partial record is
> > written when write location passes F0 after insertion location passes
> > F1. However, remembering all spanning records seems overkilling to me.
>
> I'm curious why you feel that recording all cross-segment records is
> overkill. IMO it seems far simpler to just do that rather than try to

Sorry, I didn't explain that well enough. Remembering all spanning
records in *shared memory* seems like overkill to me, even more so if
they are stored in a shared hash table. Even though it would rarely be
the case, it can fail the hard way when the limit is reached. If we can
get by with remembering just two locations, we don't need to worry
about such a limitation.

> reason about all these different scenarios and rely on various
> (and possibly fragile) assumptions. You only need to record the end

After sending the previous mail, I noticed that the assumption about
record length was not needed. So that approach no longer needs any of
those assumptions ^^;

> location of records that cross into the next segment (or that fit
> perfectly into the end of the current one) and to evaluate which
> segments to mark .ready as the "flushed" LSN advances. I'd expect
> that in most cases we wouldn't need to store more than a couple of
> record boundaries, so it's not like we'd normally be storing dozens of
> boundaries. Even if we did need to store several boundaries, AFAICT
> the approach I'm proposing should still work well enough.

I didn't say it doesn't work, just that it is overkill.
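
To make the trade-off concrete, here is a rough sketch of the
boundary-recording idea as described in the quoted text above. It is
not code from the patch. All names, the 16MB segment size, and the
small fixed-size array standing in for the shared hash table are
hypothetical simplifications, and locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;                        /* stand-in for XLogRecPtr */

#define SEG_SIZE   ((Lsn) 16 * 1024 * 1024)  /* assumed segment size */
#define MAX_BOUNDS 4                         /* assumed table capacity */

typedef struct
{
    Lsn seg_no;   /* last segment completed by the record */
    Lsn end_lsn;  /* end location of that record */
} Boundary;

Boundary bounds[MAX_BOUNDS];
int      nbounds = 0;

/*
 * Remember the end of a record that crosses into the next segment or
 * ends exactly at a segment boundary.  Returns false when the table is
 * full, which is the hard failure case a bounded table has to handle.
 */
bool
register_boundary(Lsn start, Lsn end)
{
    Lsn start_seg = start / SEG_SIZE;
    Lsn end_seg = end / SEG_SIZE;

    assert(end > start);

    if (end_seg == start_seg)
        return true;            /* record stays inside one segment */
    if (nbounds == MAX_BOUNDS)
        return false;           /* no room left */

    bounds[nbounds].seg_no = end_seg - 1;
    bounds[nbounds].end_lsn = end;
    nbounds++;
    return true;
}

/*
 * Return the highest segment number whose boundary record is fully
 * flushed, or -1 if there is none.  Consumed entries are dropped and
 * the remaining ones are kept for a later call.
 */
int64_t
latest_ready_segment(Lsn flushed)
{
    int64_t latest = -1;
    int     kept = 0;

    for (int i = 0; i < nbounds; i++)
    {
        if (bounds[i].end_lsn <= flushed)
        {
            if ((int64_t) bounds[i].seg_no > latest)
                latest = (int64_t) bounds[i].seg_no;
        }
        else
            bounds[kept++] = bounds[i];
    }
    nbounds = kept;
    return latest;
}
```

In this sketch a segment becomes eligible for .ready only once the
flushed location reaches the end of the record that completes it. The
false return from register_boundary() is the limit I am worried about;
remembering only two locations would not have such a case.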

Another concern about the concrete patch:

NotifySegmentsReadyForArchive() searches the shared hash, acquiring an
LWLock, every time XLogWrite is called while segment archiving is being
held off. I don't think that is acceptable, and I think it could be a
problem when many backends are competing on WAL.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
