Re: .ready and .done files considered harmful

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Hannu Krosing <hannuk(at)google(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: .ready and .done files considered harmful
Date: 2021-09-20 20:42:26
Message-ID: 202109202042.uziogfq245yw@alvherre.pgsql
Lists: pgsql-hackers

On 2021-Sep-20, Robert Haas wrote:

> I was thinking that this might increase the number of directory scans
> by a pretty large amount when we repeatedly catch up, then 1 new file
> gets added, then we catch up, etc.

I was going to say that perhaps we can avoid repeated scans by having a
bitmap of future files that were found by a scan; so if we need to do
one scan, we keep track of the presence of the next (say) 64 files in
our timeline, and then we only have to do another scan when we need to
archive a file that wasn't present the last time we scanned. However:

> But I guess your thought process is that such directory scans, even if
> they happen many times per second, can't really be that expensive,
> since the directory can't have much in it. Which seems like a fair
> point. I wonder if there are any situations in which there's not much
> to archive but the archive_status directory still contains tons of
> files.

(If we take this stance, which seems reasonable to me, then we don't
need to optimize.)  But perhaps we should complain if we find extraneous
files in archive_status; then it'd be on the users' heads not to leave
tons of files there that would slow down the scan.
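
For reference, the bitmap idea from the first paragraph could look roughly
like the sketch below. It is only an illustration in Python, not archiver
code: ReadyWindow, parse_ready, ready_name and WINDOW are hypothetical
names, and it assumes the default 16MB segment size (so 0x100 segments per
xlogid) and ignores timeline history and backup history files.

```python
import os
import re

WINDOW = 64               # "the next (say) 64 files"
SEGS_PER_XLOGID = 0x100   # assumption: default 16MB WAL segment size

# Plain WAL segment status file: 8 hex digits each for timeline, log, segment.
READY_RE = re.compile(r"^([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})\.ready$")


def ready_name(tli, segno):
    """Build the .ready file name for a segment number on a timeline."""
    return "%08X%08X%08X.ready" % (
        tli, segno // SEGS_PER_XLOGID, segno % SEGS_PER_XLOGID)


def parse_ready(name):
    """Return (timeline, segment number) for a .ready file, else None."""
    m = READY_RE.match(name)
    if m is None:
        return None
    tli, log, seg = (int(g, 16) for g in m.groups())
    return tli, log * SEGS_PER_XLOGID + seg


class ReadyWindow:
    """Remembers which of the WINDOW segments starting at 'base' had a
    .ready file at the time of the last directory scan."""

    def __init__(self, status_dir, tli):
        self.status_dir = status_dir
        self.tli = tli
        self.base = 0
        self.bits = 0     # bit i set => segment base+i was .ready at last scan
        self.scans = 0    # number of directory scans performed so far

    def scan(self, base):
        """Scan the directory once and rebuild the bitmap starting at base."""
        self.base = base
        self.bits = 0
        self.scans += 1
        for name in os.listdir(self.status_dir):
            parsed = parse_ready(name)
            if parsed is None or parsed[0] != self.tli:
                continue  # not a .ready file for a segment on our timeline
            offset = parsed[1] - base
            if 0 <= offset < WINDOW:
                self.bits |= 1 << offset

    def is_ready(self, segno):
        """True if segno can be archived; rescans only on a bitmap miss."""
        offset = segno - self.base
        if 0 <= offset < WINDOW and (self.bits >> offset) & 1:
            return True   # already seen by the last scan, no new scan needed
        self.scan(segno)
        return bool(self.bits & 1)
```

With this, a caller asking is_ready() for consecutive segments pays for one
scan per batch of files that were already present, and for one extra scan
each time it reaches a file that was not there at the previous scan, which
is the catch-up-then-one-new-file pattern discussed above.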

--
Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
Maybe there's lots of data loss but the records of data loss are also lost.
(Lincoln Yeoh)
