Re: .ready and .done files considered harmful

From: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
To: Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Hannu Krosing <hannuk(at)google(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: .ready and .done files considered harmful
Date: 2021-09-01 23:44:15
Message-ID: 03826EAC-E5AE-40AB-B56B-8BEED1F5E534@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 8/25/21, 4:11 AM, "Dipesh Pandit" <dipesh(dot)pandit(at)gmail(dot)com> wrote:
> An alternate approach could be to force a directory scan at checkpoint to
> break the infinite wait for a .ready file which is being missed due to the
> fact that it is created out of order. This will make sure that the file
> gets archived within the checkpoint boundaries.

I think this is a good idea.

> Please find attached patch v11.

Thanks for the new version of the patch.

+ /*
+ * History files or a .ready file created out of order requires archiver to
+ * perform a full directory scan.
+ */
+ if (IsTLHistoryFileName(xlog) || IsBackupHistoryFileName(xlog) ||
+ fileOutOfOrder)
+ PgArchEnableDirScan();

I think we should force a directory scan for everything that isn't a
regular WAL file. IOW we can use !IsXLogFileName(xlog) instead of
enumerating all the different kinds of files we might want to archive.

+ /*
+ * Segment number of the most recent .ready file found by archiver,
+ * protected by WALArchiveLock.
+ */
+ XLogSegNo lastReadySegNo;
} PgArchData;

+/*
+ * Segment number and timeline ID to identify the next file in a WAL sequence
+ */
+typedef struct readyXLogState
+{
+ XLogSegNo lastSegNo;
+ TimeLineID lastTLI;
+} readyXLogState;

lastSegNo and lastReadySegNo appear to be the same thing. Couldn't we
just use the value in PgArchData?

+ return (curSegNo < lastSegNo) ? true : false;

I think this needs to be <=. If the two values are equal,
pgarch_readyXlog() may have just completed a directory scan and might
be just about to set PgArch->lastSegNo to a greater value.

+ LWLockAcquire(WALArchiveLock, LW_EXCLUSIVE);
+ PgArch->lastReadySegNo = segNo;
+ LWLockRelease(WALArchiveLock);

IMO we should just use a spinlock instead of introducing a new LWLock.
It looks like you really only need the lock for a couple of simple
functions. I still think protecting PgArch->dirScan with a spinlock
is a good idea, if for no other reason than it makes it easier to
reason about this logic.

+ if (stat(xlogready, &st) == 0)

I think we should ERROR if stat() fails for any other reason than
ENOENT.

+ ishistory = IsTLHistoryFileName(basename) ||
+ IsBackupHistoryFileName(basename);

I suspect we still want to prioritize timeline history files over
backup history files. TBH I find the logic below this point for
prioritizing history files to be difficult to follow, and I think we
should refactor it into some kind of archive priority comparator
function.

+ /*
+ * Reset the flag only when we found a regular WAL file to make
+ * sure that we are done with processing history files.
+ */
+ PgArch->dirScan = false;

I think we really want to unset dirScan before we start the directory
scan, and then we set it to true afterwards if we didn't find a
regular WAL file. If someone requests a directory scan in the middle
of an ongoing directory scan, we don't want to lose that request.

I attached two patches that demonstrate what I'm thinking this change
should look like. One is my take on the keep-trying-the-next-file
approach, and the other is a new version of the multiple-files-per-
readdir approach (with handling for "cheating" archive commands). I
personally feel that the multiple-files-per-readdir approach winds up
being a bit cleaner and more resilient than the keep-trying-the-next-
file approach. However, the keep-trying-the-next-file approach will
certainly be more efficient (especially for the extreme cases
discussed in this thread), and I don't have any concrete concerns with
this approach that seem impossible to handle.

Nathan

Attachment Content-Type Size
0001-keep-trying-the-next-file-approach.patch application/octet-stream 10.6 KB
0001-Improve-performance-of-pgarch_readyXlog-with-many-st.patch application/octet-stream 10.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-09-01 23:56:53 Re: Postgres Win32 build broken?
Previous Message Jaime Casanova 2021-09-01 23:39:17 Re: Numeric x^y for negative x