Re: .ready and .done files considered harmful

From: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
To: Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Hannu Krosing <hannuk(at)google(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: .ready and .done files considered harmful
Date: 2021-09-02 21:52:22
Message-ID: AC78607B-9DA6-41F4-B253-840D3DD964BF@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 9/2/21, 6:22 AM, "Dipesh Pandit" <dipesh(dot)pandit(at)gmail(dot)com> wrote:
> I agree that multiple-files-pre-readdir is cleaner and has the resilience of the
> current implementation. However, I have a few suggestion on keep-trying-the
> -next-file approach patch shared in previous thread.

Which approach do you think we should use? I think we have decent
patches for both approaches at this point, so perhaps we should see if
we can get some additional feedback from the community on which one we
should pursue further.

> + /* force directory scan the first time we call pgarch_readyXlog() */
> + PgArchForceDirScan();
> +
>
> We should not force a directory in pgarch_ArchiverCopyLoop(). This gets called
> whenever archiver wakes up from the wait state. This will result in a
> situation where the archiver performs a full directory scan despite having the
> accurate information about the next anticipated log segment.
> Instead we can check if lastSegNo is initialized and continue directory scan
> until it gets initialized in pgarch_readyXlog().

The problem I see with this is that pgarch_archiveXlog() might end up
failing. If it does, we won't retry archiving the file until we do a
directory scan. I think we could try to avoid forcing a directory
scan outside of these failure cases and archiver startup, but I'm not
sure it's really worth it. When pgarch_readyXlog() returns false, it
most likely means that there are no .ready files present, so I'm not
sure we are gaining a whole lot by avoiding a directory scan in that
case. I guess it might help a bit if there are a ton of .done files,
though.

> + return lastSegNo;
> We should return "true" here.

Oops. Good catch.

> I am thinking if we can add a log message for files which are
> archived as part of directory scan. This will be useful for diagnostic purpose
> to check if desired files gets archived as part of directory scan in special
> scenarios. I also think that we should add a few comments in pgarch_readyXlog().

I agree, but it should probably be something like DEBUG3 instead of
LOG.

> + /*
> + * We must use <= because the archiver may have just completed a
> + * directory scan and found a later segment (but hasn't updated
> + * shared memory yet).
> + */
> + if (this_segno <= arch_segno)
> + PgArchForceDirScan();
>
> I still think that we should use '<' operator here because
> arch_segno represents the segment number of the most recent
> .ready file found by the archiver. This gets updated in shared
> memory only if archiver has successfully found a .ready file.
> In a normal scenario this_segno will be greater than arch_segno
> whereas in cases where a .ready file is created out of order
> this_segno may be less than arch_segno. I am wondering
> if there is a scenario where arch_segno is equal to this_segno
> unless I am missing something.

The pg_readyXlog() logic looks a bit like this:

1. Try to skip directory scan. If that succeeds, we're done.
2. Do a directory scan.
3. If we found a regular WAL file, update PgArch and return
what we found.

Let's say step 1 looks for WAL file 10, but 10.ready doesn't exist
yet. The following directory scan ends up finding 11.ready. Just
before we update the PgArch state, XLogArchiveNotify() is called and
creates 10.ready. However, pg_readyXlog() has already decided to
return WAL segment 11 and update the state to look for 12 next. If we
just used '<', we won't force a directory scan, and segment 10 will
not be archived until the next one happens. If we use '<=', I don't
think we have the same problem.

I've also thought about another similar scenario. Let's say step 1
looks for WAL file 10, but it doesn't exist yet (just like the
previous example). The following directory scan ends up finding
12.ready, but just before we update PgArch, we create 11.ready. In
this case, we'll still skip forcing a directory scan until 10.ready is
created later on. I believe it all eventually works out as long as we
can safely assume that all files that should have .ready files will
eventually get them.

Nathan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bossart, Nathan 2021-09-02 22:07:02 Re: Read-only vs read only vs readonly
Previous Message Phil Krylov 2021-09-02 21:36:13 [PATCH] pg_ctl should not truncate command lines at 1024 characters