Re: .ready and .done files considered harmful

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Hannu Krosing <hannuk(at)google(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: .ready and .done files considered harmful
Date: 2021-07-06 13:34:58
Message-ID: 20210706133458.GE20766@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Dipesh Pandit (dipesh(dot)pandit(at)gmail(dot)com) wrote:
> We have addressed the O(n^2) problem which involves directory scan for
> archiving individual WAL files by maintaining a WAL counter to identify
> the next WAL file in a sequence.

This seems to have missed the concerns raised in
https://postgr.es/m/20210505170601.GF20766@tamriel.snowman.net ..?

And also the comments immediately above the ones being added here:

> @@ -596,29 +606,55 @@ pgarch_archiveXlog(char *xlog)
> * larger ID; the net result being that past timelines are given higher
> * priority for archiving. This seems okay, or at least not obviously worth
> * changing.
> + *
> + * WAL files are generated in a specific order of log segment number. The
> + * directory scan for each WAL file can be minimized by identifying the next
> + * WAL file in the sequence. This can be achieved by maintaining log segment
> + * number and timeline ID corresponding to WAL file currently being archived.
> + * The log segment number of current WAL file can be incremented by '1' upon
> + * successful archival to point to the next WAL file.

specifically about history files being given higher priority for
archiving. If we go with this change then we'd at least want to rewrite
or remove those comments, but I don't actually agree that we should
remove that preference to archive history files ahead of WAL, for the
reasons brought up previously.

As was suggested on that subthread, it seems like it should be possible
to just track the current timeline and adjust what we're doing if the
timeline changes. At that point we should already know what the .history
file is, and likely don't even need to scan the directory for it, as
it'll be the old timeline ID.
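
To make that a bit more concrete, here is a rough, standalone sketch of
the idea. It is not taken from the patch or from pgarch.c. The
ArchiveCursor struct and the cursor_* functions are hypothetical names
made up for illustration, and it assumes the default 16MB segment size
(256 segments per 4GB of WAL). Which timeline ID names the history file
is left to the caller, and keeping the segment number unchanged across a
timeline switch is likewise only an assumption of this sketch.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: default 16MB WAL segments, i.e. 256 segments per 4GB. */
#define SEGMENTS_PER_XLOGID UINT64_C(256)
#define FNAME_LEN 64

/* Hypothetical archiver state; not a structure from PostgreSQL itself. */
typedef struct ArchiveCursor
{
    uint32_t tli;             /* timeline currently being archived */
    uint64_t segno;           /* next log segment number to archive */
    bool     history_pending; /* a .history file must be archived first */
    uint32_t history_tli;     /* timeline ID that names that .history file */
} ArchiveCursor;

/* Build a WAL segment file name: timeline, then segment number in two halves. */
static void
wal_file_name(char *buf, size_t len, uint32_t tli, uint64_t segno)
{
    snprintf(buf, len, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32,
             tli,
             (uint32_t) (segno / SEGMENTS_PER_XLOGID),
             (uint32_t) (segno % SEGMENTS_PER_XLOGID));
}

/* Build a timeline history file name. */
static void
history_file_name(char *buf, size_t len, uint32_t tli)
{
    snprintf(buf, len, "%08" PRIX32 ".history", tli);
}

/*
 * Record a timeline change. The caller says which timeline ID names the
 * history file; the segment number is deliberately left unchanged here.
 */
static void
cursor_note_timeline_switch(ArchiveCursor *c, uint32_t new_tli,
                            uint32_t history_tli)
{
    if (new_tli == c->tli)
        return;
    c->tli = new_tli;
    c->history_tli = history_tli;
    c->history_pending = true;
}

/* Name the next file to archive; a pending history file goes ahead of WAL. */
static void
cursor_next_file(const ArchiveCursor *c, char *buf, size_t len)
{
    if (c->history_pending)
        history_file_name(buf, len, c->history_tli);
    else
        wal_file_name(buf, len, c->tli, c->segno);
}

/* Call only after the file named by cursor_next_file() was archived. */
static void
cursor_advance(ArchiveCursor *c)
{
    if (c->history_pending)
        c->history_pending = false;
    else
        c->segno++;
}
```

With something along these lines, the history file keeps its priority
over WAL without a directory scan: after a timeline switch is noted,
cursor_next_file() returns the history file name first, and only once
that has been archived does it go back to handing out segment names.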

Thanks,

Stephen
