Re: speed up a logical replica setup

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Euler Taveira <euler(at)eulerto(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: speed up a logical replica setup
Date: 2024-03-07 17:14:14
Message-ID: 6423dfeb-a729-45d3-b71e-7bf1b3adb0c9@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I decided to take a quick look on this patch today, to see how it works
and do some simple tests. I've only started to get familiar with it, so
I have only some comments / questions regarding usage, not on the code.
It's quite possible I didn't understand some finer points, or maybe it
was already discussed earlier in this very long thread, so please feel
free to push back or point me to the past discussion.

Also, some of this is rather opinionated, but considering I didn't see
this patch before, my opinions may easily be wrong ...

1) SGML docs

It seems the SGML docs are more about explaining how this works on the
inside, rather than how to use the tool. Maybe that's intentional, but
as someone who didn't work with pg_createsubscriber before I found it
confusing and not very helpful.

For example, the first half of the page is prerequisities+warning, and
sure those are useful details, but prerequisities are checked by the
tool (so I can't really miss this) and warnings go into a lot of details
about different places where things may go wrong. Sure, worth knowing
and including in the docs, but maybe not right at the beginning, before
I learn how to even run the tool?

Maybe that's just me, though. Also, I'm sure it's not the only part of
our docs like this. Perhaps it'd be good to reorganize the content a bit
to make the "how to use" stuff more prominent?

2) this is a bit vague

... pg_createsubscriber will check a few times if the connection has
been reestablished to stream the required WAL. After a few attempts, it
terminates with an error.

What does "a few times" mean, and how many is "a few attempts"? Seems
worth knowing when using this tool in environments where disconnections
can happen. Maybe this should be configurable?

3) publication info

For a while I was quite confused about which tables get replicated,
until I realized the publication is FOR ALL TABLES. But I learned that
from this thread, the docs say nothing about this. Surely that's an
important detail that should be mentioned?

4) Is FOR ALL TABLES a good idea?

I'm not sure FOR ALL TABLES is a good idea. Or said differently, I'm
sure it won't work for a number of use cases. I know large databases
it's common to create "work tables" (not necessarily temporary) as part
of a batch job, but there's no need to replicate those tables.

AFAIK that'd break this FOR ALL TABLES publication, because the tables
will qualify for replication, but won't be present on the subscriber. Or
did I miss something?

I do understand that FOR ALL TABLES is the simplest approach, and for v1
it may be an acceptable limitation, but maybe it'd be good to also
support restricting which tables should be replicated (e.g. blacklist or
whitelist based on table/schema name?).

BTW if I'm right and creating a table breaks the subscriber creation,
maybe it'd be good to explicitly mention that in the docs.

Note: I now realize this might fall under the warning about DDL, which
says this:

Executing DDL commands on the source server while running
pg_createsubscriber is not recommended. If the target server has
already been converted to logical replica, the DDL commands must
not be replicated so an error would occur.

But I find this confusing. Surely there are many DDL commands that have
absolutely no impact on logical replication (like creating an index or
view, various ALTER TABLE flavors, and so on). And running such DDL
certainly does not trigger error, right?

5) slot / publication / subscription name

I find it somewhat annoying it's not possible to specify names for
objects created by the tool - replication slots, publication and
subscriptions. If this is meant to be a replica running for a while,
after a while I'll have no idea what pg_createsubscriber_569853 or
pg_createsubscriber_459548_2348239 was meant for.

This is particularly annoying because renaming these objects later is
either not supported at all (e.g. for replication slots), or may be
quite difficult (e.g. publications).

I do realize there are challenges with custom names (say, if there are
multiple databases to replicate), but can't we support some simple
formatting with basic placeholders? So we could specify

--slot-name "myslot_%d_%p"

or something like that?

BTW what will happen if we convert multiple standbys? Can't they all get
the same slot name (they all have the same database OID, and I'm not
sure how much entropy the PID has)?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Treat 2024-03-07 17:19:01 DOCS: add helpful partitioning links
Previous Message Robert Haas 2024-03-07 17:02:38 Re: [PATCH] LockAcquireExtended improvement