Re: Support logical replication of DDLs

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Runqi Tian <runqidev(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, li jie <ggysxcq(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Zheng Li <zhengli10(at)gmail(dot)com>
Subject: Re: Support logical replication of DDLs
Date: 2023-04-19 08:27:15
Message-ID: CAHut+PuG8J8uA5V-F-o4TczhvFSWGG1B8qL+EZO0HjWWEEYG+g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Here are some more WIP review comments for the patch 0002-2023_04_07-2

This is a WIP review in parts because the patch was quite large, so it
is taking a while...

WIP part 1 [1] was posted 17/4.
WIP part 2 [2] was posted 17/4.

This is WIP part 3

======
doc/src/sgml/logical-replication.sgml

99.
+ <table id="ddl-replication-by-command-tag">
+ <title>DDL Replication Support by Command Tag</title>

This table is excessively long. I was thinking it might present the
information more simply just by showing the interesting rows that DO
support the replication, and have one final table row called "All
other commands" that do NOT support the DDL replication.

======
.../access/rmgrdesc/logicalddlmsgdesc.c

1.
+/*-------------------------------------------------------------------------
+ *
+ * logicalddlmsgdesc.c
+ * rmgr descriptor routines for replication/logical/ddlmessage.c
+ *
+ * Portions Copyright (c) 2015-2022, PostgreSQL Global Development Group

~

Copyright should say 2023 (same as logicalmsgdesc.c).

~~

2.
void
logicalddlmsg_desc(StringInfo buf, XLogReaderState *record)
{
char *rec = XLogRecGetData(record);
uint8 info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;

if (info == XLOG_LOGICAL_DDL_MESSAGE)
{
xl_logical_ddl_message *xlrec = (xl_logical_ddl_message *) rec;
char *prefix = xlrec->message;
char *message = xlrec->message + xlrec->prefix_size;
char *sep = "";

Assert(prefix[xlrec->prefix_size] != '\0');

~

Something is a bit fishy with this Assert. See ddlmessage.h the
comment says that the prefix size inclide the \0.

So a prefix of "abc" and a payload of "ppp" would
- Have a prefix_size 4
- Have a prefix + message like "abc\0ppp"

So IMO this Assert would made more sense written same as it was in the
file logicalmsg.c
Assert(prefix[xlrec->prefix_size - 1] == '\0');

And, if you also wanted to assert that there is some payload present
then IMO you can do that better like:
Assert(xlrec->message_size && *message);

~~

3. logicalddlmsg_identify

+const char *
+logicalddlmsg_identify(uint8 info)
+{
+ if ((info & ~XLR_INFO_MASK) == XLOG_LOGICAL_DDL_MESSAGE)
+ return "DDL";
+
+ return NULL;
+}

I suspect there might be some inconsistencies. IIRC there were some
other parts of the code (from my previous WIP reviews) that refer to
these as "DDL MESSAGE", not just "DDL". I’m not sure which of those
names you want, but I think it is better that they are all consistent
no matter which code is naming them.

======
src/backend/commands/publicationcmds.c

4. parse_publication_options

parse_publication_options(ParseState *pstate,
List *options,
+ bool for_all_tables,
bool *publish_given,

Why is there a new 'for_all_tables' parameter here, when nobody seems
to be using it?

~~~

5. parse_publication_options

- publish = defGetString(defel);
+ publish = pstrdup(defGetString(defel));

Is this a bug fix? It seems unrelated to the current patch.

~~~

6. parse_publication_options

+ char *ddl_level;
+ List *ddl_list;
+ ListCell *lc3;

IMO these are not good variable names.

SUGGESTIONS
ddl_level --> ddl_types (because you called the parameter
'ddl_type_given' not 'ddl_level_given')
ddl_list --> ddl_type_list
lc3 --> lc (because why lc3? it is not even in the same scope as the
lc2 which I think was also a poor name)

~~~

7. parse_publication_options

+ *ddl_type_given = true;
+ ddl_level = defGetString(defel);

It is curious that this patch added a strdup() for the similar code in
the 'publish' option code, but do not do so here (??)

~~~

8. parse_publication_options

+ foreach(lc3, ddl_list)
+ {
+ char *publish_opt = (char *) lfirst(lc3);
+
+ if (strcmp(publish_opt, "table") == 0)
+ pubactions->pubddl_table = true;
+ else
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unrecognized \"ddl\" value: \"%s\"", publish_opt));
+ }

Looks like a cut/paste of a (wrong) variable name from the similar
previous 'publish' option loop. IMO the "publish_opt" here should be
called something like "ddl_opt".

~~~

9. GetTransformWhereClauses

+/*
+ * Helper function to tranform a where clause.
+ *
+ * Also check the publication row filter expression and throw an error if
+ * anything not permitted or unexpected is encountered.
+ */
+static Node *
+GetTransformWhereClauses(const char *queryString, Relation relation,
+ Node *whereClause, bool check_expr)

9a.
AFAICT this is a code refactoring just to make the caller
(TransformPubWhereClauses) simpler by moving some inline code to a
separate static function. But, I did not see how this refactoring
should be part of this patch.

~

9b.
SUGGESTION (fix typo and change case)
Helper function to transform a WHERE clause.

~~~

10. CreateDDLReplicaEventTrigger

+/*
+ * Helper function to create a event trigger for DDL replication.
+ */
+static void
+CreateDDLReplicaEventTrigger(char *eventname, List *commands, Oid puboid)

"a event trigger" --> "an event trigger"

~~~

11. CreateDDLReplicaEventTrigger

+ static const char *trigger_func_prefix = "publication_deparse_%s";
+
+ ddl_trigger = makeNode(CreateEventTrigStmt);
+
+ snprintf(trigger_name, sizeof(trigger_name), PUB_EVENT_TRIG_PREFIX,
+ eventname, puboid);
+ snprintf(trigger_func_name, sizeof(trigger_func_name), trigger_func_prefix,
+ eventname);

I thought the PUB_EVENT_TRIG_PREFIX was a badly named constant because
this is really a format string, not a prefix. I think this was already
mentioned in a previous review. IIUC it is the format string of the
prefix so maybe it should be called something like
PUB_EVENT_TRIG_PREFIX_FMTSTR.

Similarly, that 'trigger_func_prefix' is also a format string for the
trigger function, so I didn't know why here it is called a "prefix".

~~~

12.
+ /*
+ * Register the event triggers as internally dependent on the publication.
+ */

/triggers/trigger/

~~~

13. CreateDDLReplicaEventTriggers

+static void
+CreateDDLReplicaEventTriggers(PublicationActions pubactions, Oid puboid)
+{
+ List *start_commands = NIL;
+ List *rewrite_commands = NIL;
+ List *init_commands = NIL;
+ List *end_commands = NIL;
+
+ if (pubactions.pubddl_table)
+ {
+ start_commands = lappend_int(start_commands, CMDTAG_DROP_TABLE);
+ rewrite_commands = lappend_int(rewrite_commands, CMDTAG_ALTER_TABLE);
+
+ init_commands = lappend_int(init_commands, CMDTAG_CREATE_TABLE_AS);
+ init_commands = lappend_int(init_commands, CMDTAG_SELECT_INTO);
+
+ end_commands = lappend_int(end_commands, CMDTAG_CREATE_TABLE);
+ end_commands = lappend_int(end_commands, CMDTAG_ALTER_TABLE);
+ end_commands = lappend_int(end_commands, CMDTAG_DROP_TABLE);
+ }
+
+ /* Create the ddl_command_end event trigger */
+ if (end_commands != NIL)
+ CreateDDLReplicaEventTrigger(PUB_TRIG_EVENT1, end_commands, puboid);
+
+ /* Create the ddl_command_start event trigger */
+ if (start_commands != NIL)
+ CreateDDLReplicaEventTrigger(PUB_TRIG_EVENT2, start_commands, puboid);
+
+ /* Create the table_rewrite event trigger */
+ if (rewrite_commands != NIL)
+ CreateDDLReplicaEventTrigger(PUB_TRIG_EVENT3, rewrite_commands, puboid);
+
+ /* Create the table_init_write event trigger */
+ if (init_commands != NIL)
+ CreateDDLReplicaEventTrigger(PUB_TRIG_EVENT4, init_commands, puboid);
+}

13a
IMO all the code (the variables, the assignments, and the checks)
should be rearranged into a more natural order of
init-start-rewrite-end.

~

13b.
I think this can have a quick exit if there is no 'ddl' option
defined. I think this would be better than ploughing on to do all the
list-checking which is all going to be NIL anyway.

~~~

14. DropDDLReplicaEventTrigger

+/*
+ * Helper function to drop a event trigger for DDL replication.
+ */
+static void
+DropDDLReplicaEventTrigger(char *event, Oid puboid)

14a.
"a event trigger" --> "an event trigger"

~

14b.
In other helper functions the event name was called 'eventname', not
'event'. Either is fine but IMO the naming should be consistent.

~~~

15. CreatePublication

+ if (pubactions.pubddl_table)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("cannot add table to publication \"%s\" if DDL replication is enabled",
+ stmt->pubname));

Maybe it is OK, but this seemed a strange message. I thought it should
say something more similar to what the AlterPublicationOptions message
looks like

e.g.
"DDL replication is only supported in FOR ALL TABLES or FOR TABLES IN
SCHEMA publications"

~~~

16.
pubform = (Form_pg_publication) GETSTRUCT(tup);

parse_publication_options(pstate,
stmt->options,
pubform->puballtables,
&publish_given, &pubactions,
&publish_via_partition_root_given,
&publish_via_partition_root,
&ddl_type_given);

pubform = (Form_pg_publication) GETSTRUCT(tup);
~

16a.
pubform is assigned 2x ???

~

16b.
Why are we passing puballtables? IIUC this parameter is not even used
in the called function.

======
src/backend/replication/logical/ddlmessage.c

17.

+ * ddlmessage.c
+ * Logical DDL messages.
+ *
+ * Copyright (c) 2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/backend/replication/logical/ddlmessage.c
+ *
+ * NOTES
+ *
+ * Logical DDL messages allow XLOG logging of DDL command strings that
+ * get passed to the logical decoding plugin. In normal XLOG processing they
+ * are same as NOOP.
+ *
+ * Unlike generic logical messages, these DDL messages have only transactional
+ * mode. Note by default DDLs in PostgreSQL are transactional.
+ *
+ * These messages are part of current transaction and will be sent to
+ * decoding plugin using in a same way as DML operations.
+ *
+ * Every message carries prefix to avoid conflicts between different decoding
+ * plugins. The plugin authors must take extra care to use unique prefix,
+ * good options seems to be for example to use the name of the extension.

17a.
The copyright should say 2023

~

17b.
Some rewording of this part is needed: "...and will be sent to
decoding plugin using in a same way as DML operations."

~

17c
+ * Every message carries prefix to avoid conflicts between different decoding
+ * plugins. The plugin authors must take extra care to use unique prefix,
+ * good options seems to be for example to use the name of the extension.

SUGGESTION
Every message includes a prefix to avoid conflicts between different
decoding plugins. Plugin authors must take special care to use a
unique prefix (e.g. one idea is to include the name of the extension).

~~~

18. logicalddlmsg_redo

+/*
+ * Redo is basically just noop for logical decoding ddl messages.
+ */
+void
+logicalddlmsg_redo(XLogReaderState *record)
+{

"ddl messages" --> "DDL messages"

======
src/backend/replication/logical/logical.c

19. StartupDecodingContext

There are some existing comments that mention the 'message' callbacks.

Here, for 'message':

/*
* To support streaming, we require start/stop/abort/commit/change
* callbacks. The message and truncate callbacks are optional, similar to
* regular output plugins. We however enable streaming when at least one
* of the methods is enabled so that we can easily identify missing
* methods.
*
* We decide it here, but only check it later in the wrappers.
*/

And here, for 'stream_message':

/*
* streaming callbacks
*
* stream_message and stream_truncate callbacks are optional, so we do not
* fail with ERROR when missing, but the wrappers simply do nothing. We
* must set the ReorderBuffer callbacks to something, otherwise the calls
* from there will crash (we don't want to move the checks there).
*/

~

Do those comments need updating now to also mention the DDL message callbacks?

~~~

20. ddl_cb_wrapper

+ /* set output state */
+ ctx->accept_writes = true;
+ ctx->write_xid = txn != NULL ? txn->xid : InvalidTransactionId;
+ ctx->write_location = message_lsn;

The ddlmessage.c header comment said "DDL messages have only
transactional mode.", so does that mean the 'txn' here must never be
NULL? Where is that check happening? SHould it be Asserted right here?

~~~

21. stream_ddl_cb_wrapper

Same question as above review comment #20

======
.../replication/logical/reorderbuffer.c

22. ReorderBufferSerializeChange

+ case REORDER_BUFFER_CHANGE_DDL:
+ {
+ char *data;
+ Size prefix_size = strlen(change->data.ddl.prefix) + 1;
+
+ sz += prefix_size + change->data.ddl.message_size +
+ sizeof(Size) + sizeof(Oid) + sizeof(int) + sizeof(Size);
+ ReorderBufferSerializeReserve(rb, sz);
+
+ data = ((char *) rb->outbuf) + sizeof(ReorderBufferDiskChange);
+
+ /* might have been reallocated above */
+ ondisk = (ReorderBufferDiskChange *) rb->outbuf;
+
+ /* write the prefix, relid and cmdtype including the size */
+ memcpy(data, &prefix_size, sizeof(Size));
+ data += sizeof(Size);
+ memcpy(data, &change->data.ddl.relid, sizeof(Oid));
+ data += sizeof(Oid);
+ memcpy(data, &change->data.ddl.cmdtype, sizeof(DeparsedCommandType));
+ data += sizeof(int);
+ memcpy(data, change->data.ddl.prefix, prefix_size);
+ data += prefix_size;

Isn't it better to use sizeof(DeparsedCommandType) instead of
sizeof(int) in those 2 places in this code fragment?

~~~

23. ReorderBufferRestoreChange

+ memcpy(&change->data.ddl.cmdtype, data, sizeof(DeparsedCommandType));
+ data += sizeof(int);

Similar to the previous review comment, I think here it would be
better to write code like data += sizeof(DeparsedCommandType);

------
[1] https://www.postgresql.org/message-id/CAHut%2BPtzpuuRFrLnjkQePq296ip_0WfmQ4CtydM9JDR6gEf%3DQw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAHut%2BPtMkVoweJrd%3DSLw7BfpW883skasdnemoj4N19NnyjrT3Q%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Dominique Devienne 2023-04-19 09:55:08 COPY RETURNING?
Previous Message Federico 2023-04-19 07:29:08 Re: Guidance on INSERT RETURNING order

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2023-04-19 08:47:47 Re: Should we put command options in alphabetical order in the doc?
Previous Message John Naylor 2023-04-19 07:02:14 Re: [PoC] Improve dead tuple storage for lazy vacuum