Re: .ready and .done files considered harmful

From: Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>
To: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Hannu Krosing <hannuk(at)google(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: .ready and .done files considered harmful
Date: 2021-09-02 13:20:48
Message-ID: CAN1g5_Fgy04uBtjD6SQZDGyouUFXnpr4=rfNma9yF0FybvT8nA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Thanks for the feedback.

> I attached two patches that demonstrate what I'm thinking this change
> should look like. One is my take on the keep-trying-the-next-file
> approach, and the other is a new version of the multiple-files-per-
> readdir approach (with handling for "cheating" archive commands). I
> personally feel that the multiple-files-per-readdir approach winds up
> being a bit cleaner and more resilient than the keep-trying-the-next-
> file approach. However, the keep-trying-the-next-file approach will
> certainly be more efficient (especially for the extreme cases
> discussed in this thread), and I don't have any concrete concerns with
> this approach that seem impossible to handle.

I agree that multiple-files-pre-readdir is cleaner and has the resilience
of the
current implementation. However, I have a few suggestion on keep-trying-the
-next-file approach patch shared in previous thread.

+ /* force directory scan the first time we call pgarch_readyXlog() */
+ PgArchForceDirScan();
+

We should not force a directory in pgarch_ArchiverCopyLoop(). This gets
called
whenever archiver wakes up from the wait state. This will result in a
situation where the archiver performs a full directory scan despite having
the
accurate information about the next anticipated log segment.
Instead we can check if lastSegNo is initialized and continue directory
scan
until it gets initialized in pgarch_readyXlog().

+ return lastSegNo;
We should return "true" here.

I am thinking if we can add a log message for files which are
archived as part of directory scan. This will be useful for diagnostic
purpose
to check if desired files gets archived as part of directory scan in
special
scenarios. I also think that we should add a few comments in
pgarch_readyXlog().

I have incorporated these changes and attached a patch
v1-0001-keep-trying-the-next-file-approach.patch.

+ /*
+ * We must use <= because the archiver may have just completed a
+ * directory scan and found a later segment (but hasn't updated
+ * shared memory yet).
+ */
+ if (this_segno <= arch_segno)
+ PgArchForceDirScan();

I still think that we should use '<' operator here because
arch_segno represents the segment number of the most recent
.ready file found by the archiver. This gets updated in shared
memory only if archiver has successfully found a .ready file.
In a normal scenario this_segno will be greater than arch_segno
whereas in cases where a .ready file is created out of order
this_segno may be less than arch_segno. I am wondering
if there is a scenario where arch_segno is equal to this_segno
unless I am missing something.

Thanks,
Dipesh

Attachment Content-Type Size
v1-0001-keep-trying-the-next-file-approach.patch text/x-patch 11.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2021-09-02 15:16:50 Re: mark the timestamptz variant of date_bin() as stable
Previous Message Filip Gospodinov 2021-09-02 13:08:34 Re: Fix pkg-config file for static linking