From: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
---|---|
To: | 'Euler Taveira' <euler(at)eulerto(dot)com> |
Cc: | "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com> |
Subject: | RE: speed up a logical replica setup |
Date: | 2024-02-16 12:53:43 |
Message-ID: | TYCPR01MB12077756323B79042F29DDAEDF54C2@TYCPR01MB12077.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Dear Euler,
Here are comments for v21.
01. main
```
/* rudimentary check for a data directory. */
...
/* subscriber PID file. */
```
Initial char must be upper, and period is not needed.
02. check_data_directory
```
snprintf(versionfile, MAXPGPATH, "%s/PG_VERSION", datadir);
```
You removed the version checking from PG_VERSION, but I think it is still needed.
Indeed v21 can detect the pg_ctl/pg_resetwal/pg_createsubscriber has different
verson, but this cannot ditect the started instance has the differnet version.
I.e., v20-0010 is partially needed.
03. store_pub_sub_info()
```
SimpleStringListCell *cell;
```
This definition can be in loop variable.
04. get_standby_sysid()
```
pfree(cf);
```
This can be pg_pfree().
05. check_subscriber
```
/* The target server must be a standby */
if (server_is_in_recovery(conn) == 0)
{
pg_log_error("The target server is not a standby");
return false;
}
```
What if the the function returns -1? Should we ditect (maybe the disconnection) here?
06. server_is_in_recovery
```
ret = strcmp("t", PQgetvalue(res, 0, 0));
PQclear(res);
if (ret == 0)
return 1;
else if (ret > 0)
return 0;
else
return -1; /* should not happen */
```
But strcmp may return a negative value, right? Based on the comment atop function,
we should not do it. I think we can use ternary operator instead.
07. server_is_in_recovery
As the fisrt place, no one consider this returns -1. So can we change the bool
function and raise pg_fatal() in case of the error?
08. check_subscriber
```
if (strcmp(PQgetvalue(res, 0, 1), "t") != 0)
{
pg_log_error("permission denied for function \"%s\"",
"pg_catalog.pg_replication_origin_advance(text, pg_lsn)");
return false;
}
```
I think the third argument must be 2.
09. check_subscriber
```
pg_log_debug("subscriber: primary_slot_name: %s", primary_slot_name);
```
The output seems strange if the primary_slot_name is not set.
10. setup_publisher()
```
PGconn *conn;
PGresult *res;
```
Definitions can be in the loop.
11. create_publication()
```
if (PQntuples(res) == 1)
{
/*
* If publication name already exists and puballtables is true, let's
* use it. A previous run of pg_createsubscriber must have created
* this publication. Bail out.
*/
```
Hmm, but pre-existing publications may not send INSERT/UPDATE/DELETE/TRUNCATE.
They should be checked if we really want to reuse.
(I think it is OK to just raise ERROR)
12. create_publication()
Based on above, we do not have to check before creating publicatios. The publisher
can detect the duplication. I prefer it.
13. create_logical_replication_slot()
```
if (PQresultStatus(res) != PGRES_TUPLES_OK)
{
pg_log_error("could not create replication slot \"%s\" on database \"%s\": %s",
slot_name, dbinfo->dbname,
PQresultErrorMessage(res));
return lsn;
}
```
I know lsn is always NULL, but can we use `return NULL`?
14. setup_subscriber()
```
PGconn *conn;
```
This definition can be in the loop.
15.
You said in case of failure, cleanups is not needed if the process exits soon [1].
But some functions call PQfinish() then exit(1) or pg_fatal(). Should we follow?
16.
Some places refer PGresult or PGConn even after the cleanup. They must be fixed.
```
PQclear(res);
disconnect_database(conn);
pg_fatal("could not get system identifier: %s",
PQresultErrorMessage(res));
```
I think this is a root cause why sometimes the wrong error message has output.
17.
Some places call PQerrorMessage() and other places call PQresultErrorMessage().
I think it PQerrorMessage() should be used only after the connection establishment
functions. Thought?
18. 041_pg_createsubscriber_standby.pl
```
use warnings;
```
We must set "FATAL = all";
19.
```
my $node_p;
my $node_f;
my $node_s;
my $node_c;
my $result;
my $slotname;
```
I could not find forward declarations in perl file.
The node name might be bit a consuging, but I could not find better name.
20.
```
# On node P
# - create databases
# - create test tables
# - insert a row
# - create a physical replication slot
$node_p->safe_psql(
'postgres', q(
CREATE DATABASE pg1;
CREATE DATABASE pg2;
));
$node_p->safe_psql('pg1', 'CREATE TABLE tbl1 (a text)');
$node_p->safe_psql('pg1', "INSERT INTO tbl1 VALUES('first row')");
$node_p->safe_psql('pg2', 'CREATE TABLE tbl2 (a text)');
my $slotname = 'physical_slot';
$node_p->safe_psql('pg2',
"SELECT pg_create_physical_replication_slot('$slotname')");
```
I think setting of the same node can be gathered into one place.
Also, any settings and definitions should be done just before they are used.
21.
```
$node_s->append_conf(
'postgresql.conf', qq[
log_min_messages = debug2
primary_slot_name = '$slotname'
]);
```
I could not find a reason why we set to debug2.
22.
```
command_fails
```
command_checks_all() can check returned value and outputs.
Should we use it?
23.
Can you add headers for each testcases? E.g.,
```
# ------------------------------
# Check pg_createsubscriber fails when the target server is not a
# standby of the source.
...
# ------------------------------
# Check pg_createsubscriber fails when the target server is not running
...
# ------------------------------
# Check pg_createsubscriber fails when the target server is a member of
# the cascading standby.
...
# ------------------------------
# Check successful dry-run
...
# ------------------------------
# Check successful conversion
```
24.
```
# Stop node C
$node_c->teardown_node;
...
$node_p->stop;
$node_s->stop;
$node_f->stop;
```
Why you choose the teardown?
25.
The creation of subscriptions are not directory tested. @subnames contains the
name of subscriptions, but it just assumes the number of them is more than two.
Since it may be useful, I will post top-up patch on Monday, if there are no updating.
[1]: https://www.postgresql.org/message-id/89ccf72b-59f9-4317-b9fd-1e6d20a0c3b1%40app.fastmail.com
[2]: https://www.postgresql.org/message-id/TYCPR01MB1207713BEC5C379A05D65E342F54B2%40TYCPR01MB12077.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/global/
From | Date | Subject | |
---|---|---|---|
Next Message | jian he | 2024-02-16 12:54:16 | Re: POC, WIP: OR-clause support for indexes |
Previous Message | Bharath Rupireddy | 2024-02-16 12:53:00 | Re: A proposal to provide a timeout option for CREATE_REPLICATION_SLOT/pg_create_logical_replication_slot |