Re: speed up a logical replica setup

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Shlok Kyal" <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Tomas Vondra" <tomas(dot)vondra(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Michael Paquier" <michael(at)paquier(dot)xyz>, "Peter Eisentraut" <peter(at)eisentraut(dot)org>, "Andres Freund" <andres(at)anarazel(dot)de>, "Ashutosh Bapat" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>, "vignesh C" <vignesh21(at)gmail(dot)com>
Subject: Re: speed up a logical replica setup
Date: 2024-03-16 15:42:51
Message-ID: 34637e7f-0330-420d-8f45-1d022962d2fe@app.fastmail.com
Lists: pgsql-hackers

On Fri, Mar 15, 2024, at 3:34 AM, Amit Kapila wrote:
> Did you consider adding options for publication/subscription/slot
> names as mentioned in my previous email? As discussed in a few emails
> above, it would be quite confusing for users to identify the logical
> replication objects once the standby is converted to subscriber.

Yes. I was planning to implement it after v1 is pushed. I started to write code
for it but I wasn't sure about the UI. The best approach I came up with was
multiple options in the same order. (I don't provide short options to avoid
possible portability issues with the order.) It means if I have 3 databases and
the following command line:

pg_createsubscriber ... --database pg1 --database pg2 --database pg3 --publication
pubx --publication puby --publication pubz

pubx, puby and pubz are created in the database pg1, pg2, and pg3 respectively.
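
To illustrate the "same order" rule, here is a minimal sketch in Python (not the
tool's actual C code). The function name pair_by_position and the error text are
hypothetical; the only thing taken from the description above is that the i-th
--publication value goes with the i-th --database value.

```python
def pair_by_position(databases, publications):
    """Pair the i-th --database value with the i-th --publication value.

    Hypothetical helper for illustration only; it is not part of
    pg_createsubscriber.
    """
    if len(databases) != len(publications):
        raise ValueError(
            f"expected {len(databases)} publication names, "
            f"got {len(publications)}"
        )
    # zip() walks both lists in command-line order.
    return dict(zip(databases, publications))


# Values as they would be collected from the example command line.
mapping = pair_by_position(["pg1", "pg2", "pg3"], ["pubx", "puby", "pubz"])
for database, publication in mapping.items():
    print(f"{publication} is created in {database}")
```

Because the pairing is purely positional, swapping two --publication arguments
changes which database each publication ends up in.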

> It seems we care only for publications created on the primary. Isn't
> it possible that some of the publications have been replicated to
> standby by that time, for example, in case failure happens after
> creating a few publications? IIUC, we don't care for standby cleanup
> after failure because it can't be used for streaming replication
> anymore. So, the only choice the user has is to recreate the standby
> and start the pg_createsubscriber again. This sounds questionable to
> me as to whether users would like this behavior. Does anyone else have
> an opinion on this point?

If it happens after creating a publication and before promotion, the cleanup
routine will drop the publications on primary and it will eventually be applied
to the standby via replication later.

> I see the below note in the patch:
> + If <application>pg_createsubscriber</application> fails while processing,
> + then the data directory is likely not in a state that can be recovered. It
> + is true if the target server was promoted. In such a case, creating a new
> + standby server is recommended.
>
> By reading this it is not completely clear whether the standby is not
> recoverable in case of any error or only an error after the target
> server is promoted. If others agree with this behavior then we should
> write the detailed reason for this somewhere in the comments as well
> unless it is already explained.

I rewrote the sentence to make it clear that the target server ends up in a
state that cannot be reused only if it was promoted. The tool also prints a
message saying so:

pg_createsubscriber: target server reached the consistent state
pg_createsubscriber: hint: If pg_createsubscriber fails after this point, you
must recreate the physical replica before continuing.

I'm attaching a new version (v30) that adds:

* 3 new options (--publication, --subscription, --replication-slot) to assign
names to the objects. The --database option used to ignore duplicate names;
however, since these new options rely on the number of database options to
match the number of object name options, duplicates are forbidden from now on.
Duplicate object names are also forbidden so that such errors are caught early.
* a rewritten paragraph about the target server being unusable after
pg_createsubscriber fails.
* Vignesh reported an issue [1] related to reaching the recovery stop point
before the consistent state is reached. I proposed a simple patch that fixes
the issue.
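
The checks described in the first item could look roughly like the following
Python sketch. It is only an illustration under stated assumptions, not the
code in the patch: the function names, the error wording, and the choice to
skip an option that was not given at all are mine.

```python
def check_no_duplicates(option, names):
    """Reject a repeated value for one command-line option."""
    seen = set()
    for name in names:
        if name in seen:
            raise ValueError(f'duplicate {option} name "{name}"')
        seen.add(name)


def validate_names(databases, publications=(), subscriptions=(), slots=()):
    """Hypothetical validation: no duplicates, and counts match --database."""
    check_no_duplicates("database", databases)
    object_options = (
        ("publication", publications),
        ("subscription", subscriptions),
        ("replication-slot", slots),
    )
    for option, names in object_options:
        if not names:
            # Assumption: an omitted option is simply not checked here.
            continue
        check_no_duplicates(option, names)
        if len(names) != len(databases):
            raise ValueError(
                f"wrong number of {option} names: "
                f"got {len(names)}, expected {len(databases)}"
            )


# Accepted: three databases and three distinct publication names.
validate_names(["pg1", "pg2", "pg3"], publications=["pubx", "puby", "pubz"])

# Rejected: the same database listed twice.
try:
    validate_names(["pg1", "pg1"])
except ValueError as err:
    print(err)
```

Checking duplicates before comparing counts means a repeated name is reported
as such rather than as a count mismatch.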

[1] https://www.postgresql.org/message-id/CALDaNm3VMOi0GugGvhk3motghaFRKSWMCSE2t3YX1e%2BMttToxg%40mail.gmail.com

--
Euler Taveira
EDB https://www.enterprisedb.com/

Attachment                                                         Content-Type      Size
v30-0001-pg_createsubscriber-creates-a-new-logical-replic.patch.gz application/gzip  22.5 KB
v30-0002-Stop-the-target-server-earlier.patch.gz                   application/gzip  887 bytes
