Re: .ready and .done files considered harmful

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: .ready and .done files considered harmful
Date: 2021-05-04 08:07:48
Message-ID: 6E4EE5BE-AD9F-4391-84D4-1DC862059EBE@yandex-team.ru
Lists: pgsql-hackers

> On May 4, 2021, at 09:27, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2021-05-03 16:49:16 -0400, Robert Haas wrote:
>> I have two possible ideas for addressing this; perhaps other people
>> will have further suggestions. A relatively non-invasive fix would be
>> to teach pgarch.c how to increment a WAL file name. After archiving
>> segment N, check using stat() whether there's an .ready file for
>> segment N+1. If so, do that one next. If not, then fall back to
>> performing a full directory scan.
>
> Hm. I wonder if it'd not be better to determine multiple files to be
> archived in one readdir() pass?

FWIW we use both methods [0]. WAL-G has a pipe with WAL-push candidates.
We add some predicted segment names to it, and if those do not fill the upload concurrency, we list the contents of archive_status (concurrently with the background uploads).
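
To make the combination concrete, here is a rough sketch of "predict the next segments, then fall back to a directory listing". It is not WAL-G's or pgarch.c's actual code: the function names, the `want` parameter and the 16MB segment size are assumptions made for illustration, and timeline history files get no special priority here.

```python
import os

# Assumption: default 16MB WAL segments, which gives 256 segments per "xlog id".
WAL_SEGMENT_SIZE = 16 * 1024 * 1024
SEGMENTS_PER_XLOG_ID = 0x100000000 // WAL_SEGMENT_SIZE


def next_wal_name(name):
    """Return the name of the segment that follows `name` on the same timeline."""
    if len(name) != 24:
        raise ValueError("not a WAL segment name: %r" % name)
    tli = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    segno = log * SEGMENTS_PER_XLOG_ID + seg + 1
    return "%08X%08X%08X" % (
        tli, segno // SEGMENTS_PER_XLOG_ID, segno % SEGMENTS_PER_XLOG_ID)


def pick_candidates(status_dir, last_archived, want):
    """Pick up to `want` files to archive (hypothetical helper).

    First probe for the direct successors of `last_archived`, one cheap
    existence check each. Only if that does not yield enough candidates,
    list the whole directory and take the oldest remaining .ready files.
    """
    picked = []
    name = last_archived
    while len(picked) < want:
        name = next_wal_name(name)
        if not os.path.exists(os.path.join(status_dir, name + ".ready")):
            break  # gap in the sequence: stop predicting
        picked.append(name)

    if len(picked) < want:
        suffix = ".ready"
        ready = sorted(f[:-len(suffix)] for f in os.listdir(status_dir)
                       if f.endswith(suffix))
        for f in ready:
            if f not in picked:
                picked.append(f)
                if len(picked) == want:
                    break
    return picked
```

When archiving keeps up, the loop is satisfied by the cheap probes and the directory is never listed; the listing only happens when the predictions run into a gap.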

>
>
>> As far as I can see, this is just cheap insurance. If archiving is
>> keeping up, the extra stat() won't matter much. If it's not, this will
>> save more system calls than it costs. Since during normal operation it
>> shouldn't really be possible for files to show up in pg_wal out of
>> order, I don't really see a scenario where this changes the behavior,
>> either. If there are gaps in the sequence at startup time, this will
>> cope with it exactly the same as we do now, except with a better
>> chance of finishing before I retire.
>
> There's definitely gaps in practice :(. Due to the massive performance
> issues with archiving there are several tools that archive multiple
> files as part of one archive command invocation (and mark the additional
> archived files as .done immediately).

Interestingly, we used to rename .ready->.done ourselves some years ago. But the pgBackRest developers convinced me that it's not a good idea to mess with the data directory [1]. Then the pg_probackup developers convinced me that renaming .ready->.done on our own scales better, and implemented this functionality for us [2].

>> If we did that, could we just get rid of the .ready and .done files
>> altogether? Are they just a really expensive IPC mechanism to avoid a
>> shared memory connection, or is there some more fundamental reason why
>> we need them?
>
> What kind of shared memory mechanism are you thinking of? Due to
> timelines and history files I don't think simple position counters would
> be quite enough.
>
> I think the aforementioned "batching" archive commands are part of the
> problem :(.

I'd be happy if we had a table of files that need to be archived, a table of registered archivers, and a function to say "archiver number X has done its job on file Y". An archiver could listen on some archiver channel while sleeping, or something like that.
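
Just to illustrate the shape of the idea, here is an in-memory sketch. Nothing like this exists in PostgreSQL; the class and method names are made up, and the "listen on a channel" part is left out.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveQueue:
    """Hypothetical bookkeeping: which archiver still owes which file."""
    archivers: set = field(default_factory=set)
    # file name -> set of archiver ids that have not archived it yet
    pending: dict = field(default_factory=dict)

    def register_archiver(self, archiver_id):
        self.archivers.add(archiver_id)

    def enqueue(self, wal_file):
        # Every archiver registered at this moment must process the file.
        if self.archivers:
            self.pending[wal_file] = set(self.archivers)

    def todo(self, archiver_id):
        """Files this archiver still has to process, oldest name first."""
        return sorted(f for f, waiting in self.pending.items()
                      if archiver_id in waiting)

    def mark_done(self, archiver_id, wal_file):
        """'Archiver X has done its job on file Y'.

        Returns True once no archiver is waiting on the file any more,
        i.e. when the file could be recycled.
        """
        waiting = self.pending[wal_file]
        waiting.discard(archiver_id)
        if not waiting:
            del self.pending[wal_file]
            return True
        return False
```

With two registered archivers, a file stays pending until both have called `mark_done` for it, which replaces the single .ready/.done flag per file with per-archiver state.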

Thanks!

Best regards, Andrey Borodin.

[0] https://github.com/x4m/wal-g/blob/c8a785217fe1123197280fd24254e51492bf5a68/internal/bguploader.go#L119-L137
[1] https://www.postgresql.org/message-id/flat/20180828200754.GI3326%40tamriel.snowman.net#0b07304710b9ce5244438b7199447ee7
[2] https://github.com/wal-g/wal-g/pull/950
