Re: speed up a logical replica setup

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "vignesh C" <vignesh21(at)gmail(dot)com>
Cc: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Michael Paquier" <michael(at)paquier(dot)xyz>, "Peter Eisentraut" <peter(at)eisentraut(dot)org>, "Andres Freund" <andres(at)anarazel(dot)de>, "Ashutosh Bapat" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>, "Shlok Kyal" <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
Subject: Re: speed up a logical replica setup
Date: 2024-02-14 03:26:59
Message-ID: 89ccf72b-59f9-4317-b9fd-1e6d20a0c3b1@app.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Feb 9, 2024, at 3:27 AM, vignesh C wrote:
> On Wed, 7 Feb 2024 at 10:24, Euler Taveira <euler(at)eulerto(dot)com> wrote:
> >
> > On Fri, Feb 2, 2024, at 6:41 AM, Hayato Kuroda (Fujitsu) wrote:
> >
> > Thanks for updating the patch!
>
> Thanks for the updated patch, few comments:
> Few comments:
> 1) Cleanup function handler flag should be reset, i.e.
> dbinfo->made_replslot = false; should be there else there will be an
> error during drop replication slot cleanup in error flow:

Why? drop_replication_slot() is basically called by atexit().

> 2) Cleanup function handler flag should be reset, i.e.
> dbinfo->made_publication = false; should be there else there will be
> an error during drop publication cleanup in error flow:

Ditto. drop_publication() is basically called by atexit().

>
> 3) Cleanup function handler flag should be reset, i.e.
> dbinfo->made_subscription = false; should be there else there will be
> an error during drop publication cleanup in error flow:

Ditto. drop_subscription() is only called by atexit().

> 4) I was not sure if drop_publication is required here, as we will not
> create any publication in subscriber node:
> + if (dbinfo[i].made_subscription)
> + {
> + conn = connect_database(dbinfo[i].subconninfo);
> + if (conn != NULL)
> + {
> + drop_subscription(conn, &dbinfo[i]);
> + if (dbinfo[i].made_publication)
> + drop_publication(conn, &dbinfo[i]);
> + disconnect_database(conn);
> + }
> + }

setup_subscriber() explains the reason.

/*
* Since the publication was created before the consistent LSN, it is
* available on the subscriber when the physical replica is promoted.
* Remove publications from the subscriber because it has no use.
*/
drop_publication(conn, &dbinfo[i]);

I changed the referred code a bit because it is not reliable. Since
made_subscription was not set until we create the subscription, the
publications that were created on primary and replicated to standby won't be
removed on subscriber. Instead, it should rely on the recovery state to decide
if it should drop it.

> 5) The connection should be disconnected in case of error case:
> + /* secure search_path */
> + res = PQexec(conn, ALWAYS_SECURE_SEARCH_PATH_SQL);
> + if (PQresultStatus(res) != PGRES_TUPLES_OK)
> + {
> + pg_log_error("could not clear search_path: %s",
> PQresultErrorMessage(res));
> + return NULL;
> + }
> + PQclear(res);

connect_database() is usually followed by a NULL test and exit(1) if it cannot
connect. It should be added for correctness but it is not a requirement.

> 7) These includes are not required:
> 7.a) #include <signal.h>
> 7.b) #include <sys/stat.h>
> 7.c) #include <sys/wait.h>
> 7.d) #include "access/xlogdefs.h"
> 7.e) #include "catalog/pg_control.h"
> 7.f) #include "common/file_utils.h"
> 7.g) #include "utils/pidfile.h"

Good catch. I was about to review the include files.

> 8) POSTMASTER_STANDBY and POSTMASTER_FAILED are not being used, is it
> required or kept for future purpose:
> +enum WaitPMResult
> +{
> + POSTMASTER_READY,
> + POSTMASTER_STANDBY,
> + POSTMASTER_STILL_STARTING,
> + POSTMASTER_FAILED
> +};

I just copied verbatim from pg_ctl. We should remove the superfluous states.

> 9) pg_createsubscriber should be kept after pg_basebackup to maintain
> the consistent order:

Ok.

>
> 10) dry-run help message is not very clear, how about something
> similar to pg_upgrade's message like "check clusters only, don't
> change any data":

$ /tmp/pgdevel/bin/pg_archivecleanup --help | grep dry-run
-n, --dry-run dry run, show the names of the files that would be
$ /tmp/pgdevel/bin/pg_combinebackup --help | grep dry-run
-n, --dry-run don't actually do anything
$ /tmp/pgdevel/bin/pg_resetwal --help | grep dry-run
-n, --dry-run no update, just show what would be done
$ /tmp/pgdevel/bin/pg_rewind --help | grep dry-run
-n, --dry-run stop before modifying anything
$ /tmp/pgdevel/bin/pg_upgrade --help | grep check
-c, --check check clusters only, don't change any data

I used the same sentence as pg_rewind but I'm fine with pg_upgrade or
pg_combinebackup sentences.

--
Euler Taveira
EDB https://www.enterprisedb.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-02-14 03:28:38 Re: Make COPY format extendable: Extract COPY TO format implementations
Previous Message Andrei Lepikhov 2024-02-14 03:21:41 Re: POC, WIP: OR-clause support for indexes