Re: speed up a logical replica setup

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "Peter Eisentraut" <peter(at)eisentraut(dot)org>, "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Shlok Kyal" <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Tomas Vondra" <tomas(dot)vondra(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Michael Paquier" <michael(at)paquier(dot)xyz>, "Andres Freund" <andres(at)anarazel(dot)de>, "Ashutosh Bapat" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>, "vignesh C" <vignesh21(at)gmail(dot)com>
Subject: Re: speed up a logical replica setup
Date: 2024-03-21 04:19:10
Message-ID: d546c4bb-92d1-4e2d-898f-48234b12ed25@app.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 18, 2024, at 10:52 AM, Peter Eisentraut wrote:
> On 16.03.24 16:42, Euler Taveira wrote:
> > I'm attaching a new version (v30) that adds:
>
> I have some review comments and attached a patch with some smaller
> fixups (mainly message wording and avoid fixed-size string buffers).

Thanks for your review. I'm attaching a new version (v32) that includes your
fixups, merges the v30-0002 into the main patch [1], addresses Hayato's review[2],
your reviews [3][4], and fixes the query for set_replication_progress() [5].

> * doc/src/sgml/ref/pg_createsubscriber.sgml
>
> I would remove the "How It Works" section. This is not relevant to
> users, and it is very detailed and will require updating whenever the
> implementation changes. It could be a source code comment instead.

It uses the same structure as pg_rewind that also describes how it works
internally. I included a separate patch that completely removes the section.

> * src/bin/pg_basebackup/pg_createsubscriber.c
>
> I think the connection string handling is not robust against funny
> characters, like spaces, in database names etc.

get_base_conninfo() uses PQconninfoParse to parse the connection string. I
expect PQconnectdb to provide a suitable error message in this case. Even if it
builds keywords and values arrays, it is also susceptible to the same issue, no?

> Most SQL commands need to be amended for proper identifier or string
> literal quoting and/or escaping.

I completely forgot about this detail when I added the new options in v30. It is
fixed now. I also changed the tests to exercise it.

> In check_subscriber(): All these permissions checks seem problematic
> to me. We shouldn't reimplement our own copy of the server's
> permission checks. The server can check the permissions. And if the
> permission checking in the server ever changes, then we have
> inconsistencies to take care of. Also, the error messages "permission
> denied" are inappropriate, because we are not doing the actual thing.
> Maybe we want to do a dry-run for the benefit of the user, but then we
> should do the actual thing, like try to create a replication slot, or
> whatever. But I would rather just remove all this, it seems too
> problematic.

The main goal of the check_* functions are to minimize error during execution.
I removed the permission checks. The GUC checks were kept.

> In main(): The first check if the standby is running is problematic.
> I think it would be better to require that the standby is initially
> shut down. Consider, the standby might be running under systemd.
> This tool will try to stop it, systemd will try to restart it. Let's
> avoid these kinds of battles. It's also safer if we don't try to
> touch running servers.

That's a good point. I hadn't found an excuse to simplify this but you provided
one. :) There was a worry about ignoring some command-line options that changes
GUCs if the server was started. There was also an ugly case for dry run mode
that has to start the server (if it was running) at the end. Both cases are no
longer issues. The current code provides a suitable error if the target server
is running.

> The -p option (--subscriber-port) doesn't seem to do anything. In my
> testing, it always uses the compiled-in default port.

It works for me. See this snippet from the regression tests. The port (50945) is
used by pg_ctl.

# Running: pg_createsubscriber --verbose --verbose --pgdata /c/pg_createsubscriber/src/bin/pg_basebackup/tmp_check/t_040_pg_createsubscriber_node_s_data/pgdata --publisher-server port=50943 host=/tmp/qpngb0bPKo dbname='pg1' --socket-directory /tmp/qpngb0bPKo --subscriber-port 50945 --database pg1 --database pg2
pg_createsubscriber: validating connection string on publisher
.
.
pg_createsubscriber: pg_ctl command is: "/c/pg_createsubscriber/tmp_install/c/pg_createsubscriber/bin/pg_ctl" start -D "/c/pg_createsubscriber/src/bin/pg_basebackup/tmp_check/t_040_pg_createsubscriber_node_s_data/pgdata" -s -o "-p 50945" -o "-c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/tmp/qpngb0bPKo'"
2024-03-20 18:15:24.517 -03 [105195] LOG: starting PostgreSQL 17devel on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2024-03-20 18:15:24.517 -03 [105195] LOG: listening on Unix socket "/tmp/qpngb0bPKo/.s.PGSQL.50945"

> Printing all the server log lines to the terminal doesn't seem very
> user-friendly. Not sure what to do about that, short of keeping a
> pg_upgrade-style directory of log files. But it's ugly.

I removed the previous implementation that creates a new directory and stores
the log file there. I don't like the pg_upgrade-style directory because (a) it
stores part of the server log files in another place and (b) it is another
directory to ignore if your tool handles the data directory (like a backup
tool). My last test said it prints 35 server log lines. I expect that the user
redirects the output to a file so he/she can inspect it later if required.

[1] https://www.postgresql.org/message-id/34637e7f-0330-420d-8f45-1d022962d2fe%40app.fastmail.com
[2] https://www.postgresql.org/message-id/TYCPR01MB12077E2ACB82680CD7BAA22F9F52D2%40TYCPR01MB12077.jpnprd01.prod.outlook.com
[3] https://www.postgresql.org/message-id/085203c4-580d-4c50-90e3-e47249e14585%40eisentraut.org
[4] https://www.postgresql.org/message-id/7a970912-0b77-4942-84f7-2c9ca0bc05a5%40eisentraut.org
[5] https://www.postgresql.org/message-id/CALDaNm053p5Drh9%2B%3DVvkH0Zf_3xoLB%2Bxw8RsNZjEwhc5xc_W%3DQ%40mail.gmail.com

--
Euler Taveira
EDB https://www.enterprisedb.com/

Attachment Content-Type Size
v32-0001-pg_createsubscriber-creates-a-new-logical-replic.patch.gz application/gzip 22.6 KB
v32-0002-Remove-How-it-Works-section.patch.gz application/gzip 2.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2024-03-21 04:30:30 Re: add AVX2 support to simd.h
Previous Message Bharath Rupireddy 2024-03-21 04:14:16 Re: New Table Access Methods for Multi and Single Inserts