Re: .ready and .done files considered harmful

From: Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Hannu Krosing <hannuk(at)google(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: .ready and .done files considered harmful
Date: 2021-07-06 14:48:47
Message-ID: CAOgcT0MCTG4OzKGeBPi8J7YXwKq5L9i9E+EKWTcF=_KSKL1sbg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> I have a few suggestions on the patch
> 1.
> +
> + /*
> + * Found the oldest WAL, reset timeline ID and log segment number to
> generate
> + * the next WAL file in the sequence.
> + */
> + if (found && !historyFound)
> + {
> + XLogFromFileName(xlog, &curFileTLI, &nextLogSegNo, wal_segment_size);
> + ereport(LOG,
> + (errmsg("directory scan to archive write-ahead log file \"%s\"",
> + xlog)));
> + }
>
> If a history file is found we are not updating curFileTLI and
> nextLogSegNo, so it will attempt the previously found segment. This
> is fine because it will not find that segment and it will rescan the
> directory. But I think we can do better, instead of searching the
> same old segment in the previous timeline we can search that old
> segment in the new TL so that if the TL switch happened within the
> segment then we will find the segment and we will avoid the directory
> search.
>
>
> /*
> + * Log segment number and timeline ID to get next WAL file in a sequence.
> + */
> +static XLogSegNo nextLogSegNo = 0;
> +static TimeLineID curFileTLI = 0;
> +
>
> So everytime archiver will start with searching segno=0 in timeline=0.
> Instead of doing this can't we first scan the directory and once we
> get the first segment to archive then only we can start predicting the
> next wal segment? I think there is nothing wrong even if we try to
> look for seg 0 in timeline 0, everytime we start the archivar but that
> will be true only once in the history of the cluster so why not skip
> this until we scan the directory once?
>

+1, I like Dilip's ideas here to optimize further.

Also, one minor comment:

+ /*
+ * Log segment number already points to the next file in the sequence

+ * (as part of successful archival of the previous file). Generate the
path
+ * for status file.

+ */

This comment is a bit confusing with the name of the variable nextLogSegNo.
I think the name of the variable is appropriate here, but maybe we can
reword
the comment something like:

+ /*
+ * We already have the next anticipated log segment number and the
+ * timeline, check if this WAL file is ready to be archived. If
yes, skip
+ * the directory scan.
+ */

Regards,
Jeevan Ladhe

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-07-06 14:54:43 Re: [PATCH] expand the units that pg_size_pretty supports on output
Previous Message David Christensen 2021-07-06 14:46:27 Re: [PATCH] expand the units that pg_size_pretty supports on output