Re: speed up a logical replica setup

From: Andres Freund <andres(at)anarazel(dot)de>
To: Euler Taveira <euler(at)eulerto(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: speed up a logical replica setup
Date: 2022-02-21 23:28:49
Message-ID: 20220221232849.x6s24ete4eyg6jol@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-02-21 09:09:12 -0300, Euler Taveira wrote:
> Logical replication has been used for migrations with minimal downtime. However,
> if you are dealing with a big database, the amount of required resources (disk
> -- due to WAL retention) increases as the backlog (WAL) increases. Unless you
> have a generous amount of resources and can wait for a long period of time until
> the new replica catches up, creating a logical replica is impracticable on
> large databases.

Indeed.

> DESIGN
>
> The conversion requires 8 steps.
>
> 1. Check if the target data directory has the same system identifier as the
> source data directory.
> 2. Stop the target server if it is running as a standby server. (Modifying
> recovery parameters requires a restart.)
> 3. Create one replication slot per specified database on the source server. One
> additional replication slot is created at the end to get the consistent LSN
> (This consistent LSN will be used as (a) a stopping point for the recovery
> process and (b) a starting point for the subscriptions).
> 4. Write recovery parameters into the target data directory and start the
> target server (Wait until the target server is promoted).
> 5. Create one publication (FOR ALL TABLES) per specified database on the source
> server.
> 6. Create one subscription per specified database on the target server (Use
> replication slot and publication created in a previous step. Don't enable the
> subscriptions yet).
> 7. Set the replication progress to the consistent LSN that was obtained in a
> previous step.
> 8. Enable the subscription for each specified database on the target server.
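
For readers following along, the SQL side of steps 3 to 8 can be sketched
roughly as below. This is only an illustration under stated assumptions, not
the patch's implementation: the object name logicalrep_<dbname>, the helper
names, the use of the pgoutput plugin, and the 'pg_' || oid origin name are
all assumptions made for the example, and the consistent LSN is simply passed
in rather than read back from the extra slot.

```python
from typing import List, Tuple


def lit(value: str) -> str:
    """Quote a string as an SQL literal."""
    return "'" + value.replace("'", "''") + "'"


def ident(name: str) -> str:
    """Quote a string as an SQL identifier."""
    return '"' + name.replace('"', '""') + '"'


def recovery_parameters(consistent_lsn: str) -> List[str]:
    """Step 4: settings that stop recovery at the consistent LSN and promote.

    Assumption: the proposal does not spell out the exact parameters used.
    """
    return [
        f"recovery_target_lsn = {lit(consistent_lsn)}",
        "recovery_target_action = 'promote'",
    ]


def plan_for_database(
    dbname: str, source_conninfo: str, consistent_lsn: str
) -> List[Tuple[int, str, str]]:
    """Return (step, server, SQL) tuples for one database, in step order."""
    # Hypothetical naming scheme: one name shared by slot, publication
    # and subscription.
    name = f"logicalrep_{dbname}"
    return [
        (3, "source",
         f"SELECT pg_create_logical_replication_slot({lit(name)}, 'pgoutput');"),
        (5, "source",
         f"CREATE PUBLICATION {ident(name)} FOR ALL TABLES;"),
        (6, "target",
         f"CREATE SUBSCRIPTION {ident(name)} CONNECTION {lit(source_conninfo)} "
         f"PUBLICATION {ident(name)} WITH (create_slot = false, "
         f"slot_name = {lit(name)}, copy_data = false, enabled = false);"),
        # Assumption: the subscription's replication origin is named
        # 'pg_' followed by the subscription OID.
        (7, "target",
         f"SELECT pg_replication_origin_advance('pg_' || oid, "
         f"{lit(consistent_lsn)}::pg_lsn) FROM pg_subscription "
         f"WHERE subname = {lit(name)} AND subdbid = "
         f"(SELECT oid FROM pg_database WHERE datname = current_database());"),
        (8, "target",
         f"ALTER SUBSCRIPTION {ident(name)} ENABLE;"),
    ]


if __name__ == "__main__":
    print("\n".join(recovery_parameters("0/3000060")))
    for step, server, sql in plan_for_database(
        "app", "host=src dbname=app", "0/3000060"
    ):
        print(f"-- step {step} on {server}\n{sql}")
```

The subscription is created disabled and without an initial copy because the
target already holds the data up to the consistent LSN; only the origin
position has to be set before it is enabled.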

I think the system identifier should also be changed; otherwise it is way too
easy to end up trying to apply WAL from one system to the other. That is not
going to end well, obviously.
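
To make the concern concrete: the identifier in question is what pg_controldata
prints as "Database system identifier". A minimal sketch of the step 1 check,
assuming the tool has captured pg_controldata output for both data directories
as text (with untranslated messages, e.g. LC_ALL=C), might look like this. The
function names are made up for the example.

```python
import re


def system_identifier(controldata_output: str) -> int:
    """Extract the system identifier from pg_controldata output text."""
    match = re.search(
        r"^Database system identifier:\s+(\d+)\s*$",
        controldata_output,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("no system identifier found in pg_controldata output")
    return int(match.group(1))


def same_system(source_output: str, target_output: str) -> bool:
    """Step 1: true if both data directories report the same identifier."""
    return system_identifier(source_output) == system_identifier(target_output)
```

As long as the identifier is left unchanged, this check keeps returning true
for the converted replica as well, which is exactly why nothing would stop WAL
from one of the two systems being fed to the other.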

> This tool does not take a base backup. It can certainly be included later.
> There is already a tool to do it: pg_basebackup.

It would make sense to allow calling pg_basebackup from the new tool, perhaps
with a --pg-basebackup-parameters option or such.
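
A rough sketch of what such a pass-through could look like, assuming the
option value arrives as a single string that is split shell-style and appended
to the pg_basebackup invocation. The function name and its arguments are
hypothetical; only --pgdata and --dbname are fixed by the sketch.

```python
import shlex
from typing import List


def build_basebackup_command(
    target_pgdata: str, source_conninfo: str, extra_parameters: str = ""
) -> List[str]:
    """Build the argv for pg_basebackup, appending user-supplied parameters.

    extra_parameters stands for the value of the suggested
    --pg-basebackup-parameters option (hypothetical).
    """
    command = [
        "pg_basebackup",
        "--pgdata", target_pgdata,
        "--dbname", source_conninfo,
    ]
    # shlex.split honours quoting, so values containing spaces survive intact.
    command += shlex.split(extra_parameters)
    return command
```

The resulting list could be handed to subprocess.run(command, check=True)
before the conversion steps start.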

Greetings,

Andres Freund
