Re: .ready and .done files considered harmful

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Hannu Krosing <hannuk(at)google(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: .ready and .done files considered harmful
Date: 2021-07-06 08:20:46
Message-ID: CAFiTN-udLLWy-Z4enoGF3dZwQqdRMxJmoB-Nn+vOP-2oQEn6Cw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 6, 2021 at 11:36 AM Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com> wrote:
>
> Hi,
>
> We have addressed the O(n^2) problem which involves directory scan for
> archiving individual WAL files by maintaining a WAL counter to identify
> the next WAL file in a sequence.
>
> WAL archiver scans the status directory to identify the next WAL file
> which needs to be archived. This directory scan can be minimized by
> maintaining the log segment number of the current file which is being archived
> and incrementing it by '1' to get the next WAL file in a sequence. Archiver
> can check the availability of the next file in status directory and in case if the
> file is not available then it should fall-back to directory scan to get the oldest
> WAL file.
>
> Please find attached patch v1.
>

I have a few suggestions on the patch:
1.
+
+ /*
+ * Found the oldest WAL, reset timeline ID and log segment number to generate
+ * the next WAL file in the sequence.
+ */
+ if (found && !historyFound)
+ {
+ XLogFromFileName(xlog, &curFileTLI, &nextLogSegNo, wal_segment_size);
+ ereport(LOG,
+ (errmsg("directory scan to archive write-ahead log file \"%s\"",
+ xlog)));
+ }

If a history file is found, we do not update curFileTLI and
nextLogSegNo, so the archiver will next look for the previously
predicted segment again. This is fine, because it will not find that
segment and will fall back to rescanning the directory. But I think we
can do better: instead of searching for the same old segment in the
previous timeline, we can search for that segment in the new timeline.
Then, if the timeline switch happened within that segment, we will
find it and avoid the directory scan.
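
Something along these lines is what I have in mind. This is only a
rough standalone sketch, not code against the patch:
switch_to_history_timeline() is a made-up helper name, and the
typedefs merely stand in for the PostgreSQL ones. It assumes the
history file is named "XXXXXXXX.history", with the new timeline ID as
8 upper-case hex digits.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the PostgreSQL typedefs (assumption for this sketch). */
typedef uint32_t TimeLineID;
typedef uint64_t XLogSegNo;

/* Same prediction state as in the patch. */
static XLogSegNo nextLogSegNo = 0;
static TimeLineID curFileTLI = 0;

/*
 * Hypothetical helper: when the directory scan returns a timeline
 * history file, move the prediction to the new timeline but keep the
 * segment number, so the next lookup tries the same segment on the
 * new timeline.  Returns false if fname is not a history file name.
 */
static bool
switch_to_history_timeline(const char *fname)
{
    char        tlibuf[9];

    if (strlen(fname) != 8 + strlen(".history") ||
        strspn(fname, "0123456789ABCDEF") != 8 ||
        strcmp(fname + 8, ".history") != 0)
        return false;

    /* The first 8 hex digits are the new timeline ID. */
    memcpy(tlibuf, fname, 8);
    tlibuf[8] = '\0';
    curFileTLI = (TimeLineID) strtoul(tlibuf, NULL, 16);

    /* nextLogSegNo is intentionally left unchanged. */
    return true;
}
```

With curFileTLI = 1 and nextLogSegNo = 5, finding "00000002.history"
would leave nextLogSegNo at 5 and set curFileTLI to 2, so the next
lookup is for segment 5 on timeline 2.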

2.
+/*
+ * Log segment number and timeline ID to get next WAL file in a sequence.
+ */
+static XLogSegNo nextLogSegNo = 0;
+static TimeLineID curFileTLI = 0;
+

So every time the archiver starts, it will begin by searching for
segno=0 in timeline=0. Instead of doing this, can't we first scan the
directory, and only start predicting the next WAL segment once we have
found the first segment to archive? I think there is nothing wrong
with looking for seg 0 in timeline 0 every time we start the archiver,
but that lookup can succeed only once in the history of the cluster,
so why not skip it until we have scanned the directory once?
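
For example, a flag could gate the prediction until the first scan has
happened. Again this is just a standalone sketch under my own
assumptions: nextSegKnown, remember_scanned_segment() and
predict_next_segment() are made-up names, the typedefs stand in for
the PostgreSQL ones, and segs_per_xlogid is passed in directly (256
for 16MB segments) instead of being derived from wal_segment_size. I
also assume here that the file found by the scan is the one being
archived, so the prediction starts at the segment after it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for the PostgreSQL typedefs (assumption for this sketch). */
typedef uint32_t TimeLineID;
typedef uint64_t XLogSegNo;

static XLogSegNo nextLogSegNo = 0;
static TimeLineID curFileTLI = 0;

/* Hypothetical flag: false until a directory scan has found a segment. */
static bool nextSegKnown = false;

/* Called after a directory scan returned the WAL segment (tli, segno). */
static void
remember_scanned_segment(TimeLineID tli, XLogSegNo segno)
{
    curFileTLI = tli;
    nextLogSegNo = segno + 1;   /* the scanned file itself is archived now */
    nextSegKnown = true;
}

/*
 * Build the predicted WAL file name into 'name'.  Returns false when
 * nothing has been scanned yet, in which case the caller should fall
 * back to the directory scan instead of probing for seg 0 in TLI 0.
 */
static bool
predict_next_segment(char *name, size_t len, uint32_t segs_per_xlogid)
{
    if (!nextSegKnown)
        return false;

    snprintf(name, len, "%08X%08X%08X",
             (unsigned int) curFileTLI,
             (unsigned int) (nextLogSegNo / segs_per_xlogid),
             (unsigned int) (nextLogSegNo % segs_per_xlogid));
    return true;
}
```

On a fresh archiver start predict_next_segment() returns false, so the
first lookup is always a directory scan; only after
remember_scanned_segment() has run does the prediction kick in.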

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
