From: | Peter Eisentraut <peter(at)eisentraut(dot)org> |
---|---|
To: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, 'Euler Taveira' <euler(at)eulerto(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com> |
Subject: | Re: speed up a logical replica setup |
Date: | 2024-03-18 11:15:16 |
Message-ID: | 085203c4-580d-4c50-90e3-e47249e14585@eisentraut.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 18.03.24 06:43, Hayato Kuroda (Fujitsu) wrote:
> 02.
>
> "The main difference between the logical replication setup and pg_createsubscriber
> is the initial data copy."
>
> Grammarly suggests:
> "The initial data copy is the main difference between the logical replication
> setup and pg_createsubscriber."
I think that change is worse.
> 09.
>
> "The source server must accept connections from the target server. The source server must not be in recovery."
>
> Grammarly suggests:
> "The source server must accept connections from the target server and not be in recovery."
I think the previous version is better.
> 17.
>
> "specifies promote"
>
> We can do double-quote for the word promote.
The v30 patch has <literal>promote</literal>, which I think is adequate.
> 19.
>
> "is accepting read-write transactions"
>
> Grammarly suggests:
> "accepts read-write transactions"
I like the first one better.
> 20.
>
> New options must be also documented as well. This helps not only users but also
> reviewers.
> (Sometimes we cannot identify that the implementation is intentinal or not.)
Which ones are missing?
> 21.
>
> Also, not sure the specification is good. I preferred to specify them by format
> string. Because it can reduce the number of arguments and I cannot find use cases
> which user want to control the name of objects.
>
> However, your approach has a benefit which users can easily identify the generated
> objects by pg_createsubscriber. How do other think?
I think listing them explicitly is better for the first version. It's
simpler to implement and more flexible.
> 22.
>
> ```
> #define BASE_OUTPUT_DIR "pg_createsubscriber_output.d"
> ```
>
> No one refers the define.
This is gone in v30.
> 23.
>
> ```
> } CreateSubscriberOptions;
> ...
> } LogicalRepInfo;
> ```
>
> Declarations after the "{" are not needed, because we do not do typedef.
Yeah, this is actually wrong, because as it is written now, it defines
global variables.
> 22.
>
> While seeing definitions of functions, I found that some pointers are declared
> as const, but others are not. E.g., "char *lsn" in setup_recovery() won' be
> changed but not the constant. Is it just missing or is there another rule?
Yes, more could be done here. I have attached a patch for this. (This
also requires the just-committed 48018f1d8c.)
> 24.
>
> ```
> /* standby / subscriber data directory */
> static char *subscriber_dir = NULL;
> ```
>
> It is bit strange that only subscriber_dir is a global variable. Caller requires
> the CreateSubscriberOptions as an argument, except cleanup_objects_atexit() and
> main. So, how about makeing `CreateSubscriberOptions opt` to global one?
Fewer global variables seem preferable. Only make global as needed.
> 30.
>
> ```
> if (num_replslots == 0)
> dbinfo[i].replslotname = pg_strdup(genname);
> ```
>
> I think the straightforward way is to use the name of subscription if no name
> is specified. This follows the rule for CREATE SUBSCRIPTION.
right
> 31.
>
> ```
> /* Create replication slot on publisher */
> if (lsn)
> pg_free(lsn);
> ```
>
> I think allocating/freeing memory is not so efficient.
> Can we add a flag to create_logical_replication_slot() for controlling the
> returning value (NULL or duplicated string)? We can use the condition (i == num_dbs-1)
> as flag.
Nothing is even using the return value of
create_logical_replication_slot(). I think this can be removed altogether.
> 34.
>
> ```
> {"config-file", required_argument, NULL, 1},
> {"publication", required_argument, NULL, 2},
> {"replication-slot", required_argument, NULL, 3},
> {"subscription", required_argument, NULL, 4},
> ```
>
> The ordering looks strange for me. According to pg_upgarade and pg_basebackup,
> options which do not have short notation are listed behind.
>
> 35.
>
> ```
> opt.sub_port = palloc(16);
> ```
>
> Per other lines, pg_alloc() should be used.
Even better psprintf().
> 37.
>
> ```
> /* Register a function to clean up objects in case of failure */
> atexit(cleanup_objects_atexit);
> ```
>
> Sorry if we have already discussed. I think the registration can be moved just
> before the boot of the standby. Before that, the callback will be no-op.
But it can also stay where it is. What is the advantage of moving it later?
> 40.
>
> ```
> * XXX this code was extracted from BootStrapXLOG().
> ```
>
> So, can we extract the common part to somewhere? Since system identifier is related
> with the controldata file, I think it can be located in controldata_util.c.
Let's leave it as is for this PG release.
Attachment | Content-Type | Size |
---|---|---|
0001-fixup-pg_createsubscriber-creates-a-new-logical-repl.patch | text/plain | 8.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Alexander Korotkov | 2024-03-18 11:24:47 | Re: Add pg_basetype() function to obtain a DOMAIN base type |
Previous Message | Ashutosh Bapat | 2024-03-18 11:11:27 | Re: Memory consumed by child SpecialJoinInfo in partitionwise join planning |