From: Dmitry Shulga <d(dot)shulga(at)postgrespro(dot)ru>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Reduce the time required for a database recovery from archive.
Date: 2021-01-11 07:51:23
Message-ID: 4047CC05-1AF5-454B-850B-ED37374A2AC0@postgrespro.ru
Lists: pgsql-hackers

Hi Stephen

Based on our last discussion, I have redesigned the implementation of the WAL archive recovery speed-up. The main idea of the new implementation is partly borrowed from your proposal, specifically the following part:

> On 9 Nov 2020, at 23:31, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>
> The relatively simple approach I was thinking was that a couple of
> workers would be started and they'd have some prefetch amount that needs
> to be kept out ahead of the applying process, which they could
> potentially calculate themselves without needing to be pushed forward by
> the applying process.
>
In the new implementation, several workers are spawned at server startup to deliver WAL segments from the archive. The number of workers to spawn is specified by the GUC parameter wal_prefetch_workers; the maximum number of files to preload from the archive is determined by the GUC parameter wal_max_prefetch_amount. The applier of WAL records still handles WAL files one by one, but since several prefetching processes are loading files from the archive, there is a high probability that by the time the applier requests the next WAL file, it has already been delivered from the archive.
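For reference, here is a minimal sketch (not the actual patch) of the kind of GUC variables and shared-memory bookkeeping such a scheme implies; all names, defaults, and the exact structure layout are illustrative:

    #include "postgres.h"
    #include "access/xlogdefs.h"
    #include "storage/spin.h"

    /* GUC variables (illustrative defaults) */
    int			wal_prefetch_workers = 2;		/* number of prefetching processes */
    int			wal_max_prefetch_amount = 16;	/* max segments fetched but not yet applied */

    /* Shared-memory state the prefetch workers and the WAL applier agree on */
    typedef struct WalPrefetchShared
    {
    	slock_t		mutex;					/* protects the two counters below */
    	XLogSegNo	next_segno_to_fetch;	/* next segment a worker should claim */
    	XLogSegNo	last_segno_applied;		/* advanced by the startup process */
    } WalPrefetchShared;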

Every time any of the running workers is about to preload the next WAL file, it checks whether the limit imposed by the parameter wal_max_prefetch_amount has been reached. If it has, the process suspends preloading until the WAL applier handles some of the already preloaded WAL files and the number of loaded but not yet processed WAL files drops below the limit.
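Continuing the sketch above (again purely illustrative, assuming the WalPrefetchShared structure and GUCs from the previous snippet), the check could look roughly like this; in the real patch a latch or condition variable would presumably replace the naive sleep:

    #include "miscadmin.h"		/* pg_usleep() */

    /*
     * Claim the next segment number to prefetch, backing off while the number
     * of fetched-but-unapplied segments has reached wal_max_prefetch_amount.
     */
    static XLogSegNo
    claim_next_segment(WalPrefetchShared *shared)
    {
    	for (;;)
    	{
    		XLogSegNo	segno = 0;
    		bool		under_limit;

    		SpinLockAcquire(&shared->mutex);
    		under_limit = shared->next_segno_to_fetch - shared->last_segno_applied
    			< (XLogSegNo) wal_max_prefetch_amount;
    		if (under_limit)
    			segno = shared->next_segno_to_fetch++;
    		SpinLockRelease(&shared->mutex);

    		if (under_limit)
    			return segno;

    		/* Limit reached: wait for the applier to consume some segments. */
    		pg_usleep(100 * 1000L);	/* 100 ms */
    	}
    }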

At the moment I have not implemented a mechanism for dynamically calculating the number of workers required to load the WAL files in time. We can treat the current (simplified) implementation as a base for further discussion and return to this matter in a later iteration if needed.

I would also like to ask your opinion on an issue I have been thinking about:
The parallel workers spawned for preloading WAL files from the archive use the original mechanism for delivering files from the archive: they run the command specified by the GUC parameter restore_command. One of the placeholders accepted by restore_command is %r, which is replaced with the file name of the last restart point. If several workers preload WAL files simultaneously while another process applies the preloaded files, I am not sure what the correct way is to determine the last restart point value that the WAL-preloading processes should use, because this value can be updated at any time by the process that applies the WAL.

Another issue I would like your opinion on concerns choosing the correct value for the maximum size of the hash table stored in shared memory. Currently, wal_max_prefetch_amount is passed as the maximum hash table size, and I am not sure this is the best decision.
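To make the question concrete, this is roughly how such a table would be created; the entry layout and the init_prefetch_hash() name are only illustrative, but the ShmemInitHash() call pattern is the standard one:

    #include "postgres.h"
    #include "access/xlogdefs.h"
    #include "storage/shmem.h"
    #include "utils/hsearch.h"

    extern int	wal_max_prefetch_amount;	/* GUC from the sketch above */

    typedef struct PrefetchedSegEntry
    {
    	XLogSegNo	segno;			/* hash key */
    	bool		restored;		/* segment fully copied from the archive? */
    } PrefetchedSegEntry;

    static HTAB *
    init_prefetch_hash(void)
    {
    	HASHCTL		info;

    	info.keysize = sizeof(XLogSegNo);
    	info.entrysize = sizeof(PrefetchedSegEntry);

    	return ShmemInitHash("WAL prefetch segments",
    						 wal_max_prefetch_amount,	/* init_size */
    						 wal_max_prefetch_amount,	/* max_size */
    						 &info,
    						 HASH_ELEM | HASH_BLOBS);
    }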

Thanks in advance for your feedback.

Regards,
Dmitry
