Re: .ready and .done files considered harmful

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
Cc: Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Hannu Krosing <hannuk(at)google(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: .ready and .done files considered harmful
Date: 2021-08-23 13:42:12
Message-ID: CA+TgmoYqNF1A56TPAFY0pCxoM429gjqB0OnXs+swfubmNk+1CA@mail.gmail.com
Lists: pgsql-hackers

On Sun, Aug 22, 2021 at 10:31 PM Bossart, Nathan <bossartn(at)amazon(dot)com> wrote:
> I ran this again on a bigger machine with 200K WAL files pending
> archive. The v9 patch took ~5.5 minutes, the patch I sent took ~8
> minutes, and the existing logic took just under 3 hours.

Hmm. On the one hand, 8 minutes > 5.5 minutes, and presumably the gap
would only get wider if the number of files were larger or if reading
the directory were slower. I am pretty sure that reading the directory
must be much slower in some real deployments where this problem has
come up. On the other hand, 8 minutes << 3 hours, and your patch
would win if somehow we had a ton of gaps in the sequence of files.
I'm not sure how likely that is to be the case - probably not very
likely at all if you aren't using an archive command that cheats, but
maybe really common if you are. Hmm, but I think if the
archive_command cheats by marking a bunch of files done when it is
tasked with archiving just one, your patch will break, because, unless
I'm missing something, it doesn't re-evaluate whether things have
changed on every pass through the loop as Dipesh's patch does. So I
guess I'm not quite sure I understand why you think this might be the
way to go?
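
To make the concern about a cheating archive_command concrete, here is a toy model in Python. It is not code from either patch: drain, cheating_command, and the recheck flag are hypothetical names, and the pending set merely stands in for the .ready files in archive_status. It only illustrates the difference between trusting a cached list of files and re-checking each entry before acting on it.

```python
def cheating_command(name, pending):
    """Hypothetical archive_command that archives every pending file,
    and marks them all done, when asked to archive just one."""
    pending.clear()


def drain(cached_batch, pending, archive_command, recheck):
    """Walk a previously collected batch of file names.

    Returns the names actually handed to archive_command.
    With recheck=True, each entry is verified against the current
    pending set first, so files finished behind our back are skipped.
    """
    attempted = []
    for name in cached_batch:
        if recheck and name not in pending:
            continue  # already marked done by an earlier invocation
        attempted.append(name)
        archive_command(name, pending)
    return attempted


batch = ["000000010000000000000001",
         "000000010000000000000002",
         "000000010000000000000003"]

# Stale batch: the later files are handed over again although they are done.
stale = drain(batch, set(batch), cheating_command, recheck=False)

# Re-evaluated batch: only the first file is handed over.
fresh = drain(batch, set(batch), cheating_command, recheck=True)
```

In this model the stale run hands all three names to the command, while the re-checked run hands over only the first one. What the real archiver does when it acts on a stale entry depends on the patch and is not modeled here.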

Maintaining the binary heap in lowest-priority-first order is very
clever, and the patch does look quite elegant. I'm just not sure I
understand the point.
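
For readers following along, the general technique can be sketched as below. This is a minimal Python illustration under my own assumptions, not the patch's C code: pick_oldest and batch_size are made-up names, and plain integers stand in for WAL segment file names. The idea is that the root of the heap is always the least urgent entry kept so far, so a single directory scan can retain the N oldest files by evicting the root whenever something older turns up.

```python
import heapq


def pick_oldest(segment_numbers, batch_size):
    """Return up to batch_size of the smallest (oldest) numbers, oldest first.

    heapq is a min-heap, so values are stored negated: the root is then
    the largest (newest, lowest-priority) number currently kept.
    """
    if batch_size <= 0:
        return []
    heap = []
    for seg in segment_numbers:
        if len(heap) < batch_size:
            heapq.heappush(heap, -seg)
        elif seg < -heap[0]:
            # seg is older than the newest kept entry: evict that entry.
            heapq.heapreplace(heap, -seg)
    return sorted(-x for x in heap)


oldest = pick_oldest([7, 3, 9, 1, 5, 2], 3)
```

Each directory entry costs at most O(log N) work, and only N entries are held in memory regardless of how many files are pending.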

--
Robert Haas
EDB: http://www.enterprisedb.com
