Re: .ready and .done files considered harmful

From: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
To: Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>, Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Hannu Krosing <hannuk(at)google(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: .ready and .done files considered harmful
Date: 2021-07-23 21:46:37
Message-ID: 4CAB59F8-1EC3-4BA4-B97A-DE927D7D694F@amazon.com
Lists: pgsql-hackers

On 5/6/21, 1:01 PM, "Andres Freund" <andres(at)anarazel(dot)de> wrote:
> If we leave history files and gaps in the .ready sequence aside for a
> second, we really only need an LSN or segment number describing the
> current "archive position". Then we can iterate over the segments
> between the "archive position" and the flush position (which we already
> know). Even if we needed to keep statting .ready/.done files (to handle
> gaps due to archive command mucking around with .ready/done), it'd still
> be a lot cheaper than what we do today. It probably would even still be
> cheaper if we just statted all potentially relevant timeline history
> files all the time to send them first.
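
For concreteness, here is a rough sketch of the iteration described above. It is only an illustration, not PostgreSQL code: the function names are made up, it assumes the default 16MB segment size, and it assumes the "archive position" is stored as the number of the next segment to archive. The file name layout follows the usual timeline/log/segment hex format.

```python
from typing import Iterator

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16MB segments


def wal_file_name(tli: int, segno: int, seg_size: int = WAL_SEGMENT_SIZE) -> str:
    """Build a 24-character WAL segment file name from timeline and segment number."""
    segs_per_xlogid = 0x100000000 // seg_size
    return "%08X%08X%08X" % (tli, segno // segs_per_xlogid, segno % segs_per_xlogid)


def candidate_segments(next_to_archive: int, flush_lsn: int,
                       seg_size: int = WAL_SEGMENT_SIZE) -> Iterator[int]:
    """Yield segment numbers from the archive position up to, but not
    including, the segment that contains the flush position (that one is
    assumed to still be in progress)."""
    current_segno = flush_lsn // seg_size
    return iter(range(next_to_archive, current_segno))


if __name__ == "__main__":
    flush = 5 * WAL_SEGMENT_SIZE + 100
    for segno in candidate_segments(3, flush):
        print(wal_file_name(1, segno))
```

A real implementation would still have to deal with the cases set aside above (timeline history files and gaps in the .ready sequence).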

My apologies for chiming in so late to this thread, but a similar idea
crossed my mind while working on a bug where .ready files get created
too early [0]. Specifically, instead of maintaining a status file per
WAL segment, I was thinking we could narrow it down to a couple of
files to keep track of the boundaries we care about:

1. earliest_done: the oldest segment that has been archived and
can be recycled/removed
2. latest_done: the newest segment that has been archived
3. latest_ready: the newest segment that is ready for archival

This might complicate matters for backup utilities that currently
modify the .ready/.done files, but it would simplify archive status
tracking quite a bit and eliminate the need for the directory scans
in the first place.
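
To make the bookkeeping a bit more concrete, here is a minimal in-memory sketch of the three boundaries. Everything in it is hypothetical (the class, the method names, and the use of plain segment numbers), and it ignores timelines, history files, and how the values would be persisted to disk. It also assumes segments are archived strictly in order.

```python
from dataclasses import dataclass


@dataclass
class ArchiveStatus:
    """Hypothetical tracker for the three boundaries (segment numbers)."""
    earliest_done: int  # oldest archived segment that can be recycled/removed
    latest_done: int    # newest segment that has been archived
    latest_ready: int   # newest segment that is ready for archival

    def pending(self) -> range:
        """Segments that are ready but not yet archived."""
        return range(self.latest_done + 1, self.latest_ready + 1)

    def recyclable(self) -> range:
        """Archived segments that have not been recycled/removed yet."""
        return range(self.earliest_done, self.latest_done + 1)

    def mark_ready(self, segno: int) -> None:
        """Advance latest_ready; it never moves backwards."""
        self.latest_ready = max(self.latest_ready, segno)

    def mark_done(self, segno: int) -> None:
        """Record that the next pending segment has been archived."""
        if segno != self.latest_done + 1 or segno > self.latest_ready:
            raise ValueError("segment %d is not the next pending segment" % segno)
        self.latest_done = segno

    def mark_recycled(self, up_to: int) -> None:
        """Record that archived segments through up_to were recycled/removed."""
        up_to = min(up_to, self.latest_done)
        if up_to >= self.earliest_done:
            self.earliest_done = up_to + 1


if __name__ == "__main__":
    status = ArchiveStatus(earliest_done=10, latest_done=12, latest_ready=15)
    print(list(status.pending()))
    status.mark_done(13)
    print(list(status.recyclable()))
```

With this shape, finding the next segment to archive is a comparison of two numbers rather than a scan of archive_status.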

Nathan

[0] https://www.postgresql.org/message-id/flat/CBDDFA01-6E40-46BB-9F98-9340F4379505(at)amazon(dot)com
