Re: parallelizing the archiver

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallelizing the archiver
Date: 2021-10-01 19:07:49
Message-ID: 14DD775A-A25D-47E8-84F8-B41DE3A0C9CA@yandex-team.ru
Lists: pgsql-hackers

> On 30 Sep 2021, at 09:47, Bossart, Nathan <bossartn(at)amazon(dot)com> wrote:
>
> The attached patch is a first try at adding alternatives for
> archive_command
Looks like an interesting alternative design.

> I tested the sample archive_command in the docs against the sample
> archive_library implementation in the patch, and I saw about a 50%
> speedup. (The archive_library actually syncs the files to disk, too.)
> This is similar to the improvement from batching.
Why test a sample against a sample? I think that if one tests this against a real archive tool that does archive_status lookups and ready->done renaming, the results will be much different.

> Of course, there are drawbacks to using an extension. Besides the
> obvious added complexity of building an extension in C versus writing
> a shell command, the patch disallows changing the libraries without
> restarting the server. Also, the patch makes no effort to simplify
> error handling, memory management, etc. This is left as an exercise
> for the extension author.
I think the real problem with an extension is quite different from the ones mentioned above.
There are many archive tools that already feature parallel archiving: PgBackrest, wal-e, wal-g, pg_probackup, pghoard, pgbarman and others. These tools by far outnumber the tools that don't look into archive_status to parallelize archival.
And we are going to ask them to also add a C extension without any tangible benefit to the user. You only get some restrictions, like a system restart to enable the shared library.
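
To make the archive_status approach concrete, here is a minimal sketch of what such a tool does, not the code of any of the tools named above. It assumes the usual layout where each WAL segment awaiting archival has a `<segment>.ready` marker in `pg_wal/archive_status`. The function names, the plain file copy standing in for the real upload, and the worker count are all hypothetical. In stock PostgreSQL the archiver process itself renames the marker to `.done` once archive_command succeeds; the rename is done here only to illustrate the ready->done step.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def ready_segments(wal_dir: Path) -> list[str]:
    """Return names of WAL files that have a .ready marker, in sorted order."""
    status_dir = wal_dir / "archive_status"
    return sorted(p.name[: -len(".ready")] for p in status_dir.glob("*.ready"))


def archive_one(wal_dir: Path, dest_dir: Path, name: str) -> str:
    """Copy one segment (stand-in for an upload), then flip .ready to .done."""
    shutil.copy2(wal_dir / name, dest_dir / name)
    status_dir = wal_dir / "archive_status"
    (status_dir / f"{name}.ready").rename(status_dir / f"{name}.done")
    return name


def archive_ready_in_parallel(wal_dir: Path, dest_dir: Path, workers: int = 4) -> list[str]:
    """Archive every .ready segment using a small thread pool (hypothetical helper)."""
    names = ready_segments(wal_dir)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so the result stays sorted.
        return list(pool.map(lambda n: archive_one(wal_dir, dest_dir, n), names))
```

A real tool would also need to fsync the destination, handle partial failures, and cope with the server renaming or removing markers concurrently; none of that is shown here.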

I think we need a design that legitimizes the de facto standard features that already exist in archive tools. Or, even better, one that enables these tools to be more efficient, reliable, etc. Otherwise we will be creating legacy code from scratch.

Thanks!

Best regards, Andrey Borodin.
