RE: Support logical replication of DDLs

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "Wei Wang (Fujitsu)" <wangw(dot)fnst(at)fujitsu(dot)com>, Runqi Tian <runqidev(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, li jie <ggysxcq(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Zheng Li <zhengli10(at)gmail(dot)com>
Subject: RE: Support logical replication of DDLs
Date: 2023-05-02 02:23:08
Message-ID: OS0PR01MB571690B2DB46B1C4AF61D184946F9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Friday, April 21, 2023 8:26 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:

Thanks for the comments. To avoid making the email too long,
only replied the comments that has not been addressed.

> ======
> src/backend/tcop/cmdtag.c
>
> 5. GetCommandTagsForDDLRepl
>
> +CommandTag *
> +GetCommandTagsForDDLRepl(int *ncommands) { CommandTag
> +*ddlrepl_commands = palloc0(COMMAND_TAG_NEXTTAG *
> sizeof(CommandTag));
> + *ncommands = 0;
> +
> + for(int i = 0; i < COMMAND_TAG_NEXTTAG; i++) { if
> + (tag_behavior[i].ddl_replication_ok)
> + ddlrepl_commands[(*ncommands)++] = (CommandTag) i; }
> +
> + return ddlrepl_commands;
> +}
>
> 5a.
> I felt that maybe it would be better to iterate using CommandTag enums
> instead of int indexes.
>
> ~
>
> 5b.
> I saw there is another code fragment in GetCommandTagEnum() that uses
> lengthof(tag_behaviour) instead of checking the COMMAND_TAG_NEXTTAG.
>
> It might be more consistent to use similar code here too. Something like:
>
> const int ntags = lengthof(tag_behavior) - 1; CommandTag
> *ddlrepl_commands = palloc0(ntags * sizeof(CommandTag)); *ncommands =
> 0;
>
> for(CommandTag tag = 0; i < nTags; tag++)
> if (tag_behavior[tag].ddl_replication_ok)
> ddlrepl_commands[(*ncommands)++] = tag;
>

Didn't modify this. I think ntags is one less than what we're expecting.

> ======
> src/bin/pg_dump/pg_dump.c
>
> 6.
> @@ -4213,7 +4224,10 @@ dumpPublication(Archive *fout, const
> PublicationInfo *pubinfo)
> first = false;
> }
>
> - appendPQExpBufferChar(query, '\'');
> + appendPQExpBufferStr(query, "'");
> +
> + if (pubinfo->pubddl_table)
> + appendPQExpBufferStr(query, ", ddl = 'table'");
>
> The change from appendPQExpBufferChar to appendPQExpBufferStr did not
> seem a necessary part of this patch.
>

Not part of patch now.

> ~~~
>
> 7. getPublicationEventTriggers
>
> +/*
> + * getPublicationEventTriggers
> + * get the publication event triggers that should be skipped
> + */
> +static void
> +getPublicationEventTriggers(Archive *fout, SimpleStringList
> +*skipTriggers)
>
> Given the way this function is invoked, it might be more appropriate
> to name it like getEventTriggersToBeSkipped(), with the comment saying
> that for now we just we skip the PUBLICATION DDL event triggers.
>

code no longer part of patch.

> ~~~
>
> 8. getEventTriggers
>
> /* Decide whether we want to dump it */
> - selectDumpableObject(&(evtinfo[i].dobj), fout);
> + if (simple_string_list_member(&skipTriggers, evtinfo[i].evtname))
> + evtinfo[i].dobj.dump= DUMP_COMPONENT_NONE; else
> + selectDumpableObject(&(evtinfo[i].dobj), fout);
> }
>
> + simple_string_list_destroy(&skipTriggers);
> +
>
> 8a.
> Missing whitespace before '='
>
> ~
>
> 8b.
> Scanning a list within a loop may not be efficient. I suppose this
> code must be assuming that there are not 1000s of publications, and
> therefore the skipTriggers string list will be short enough to ignore
> such inefficiency concerns.
>

code not part of patch.

> IMO a simpler alternative be to throw away the
> getPublicationEventTriggers() and the list scanning logic, and instead
> simply skip any event triggers where the name starts with
> PUB_EVENT_TRIG_PREFIX (e.g. the correct prefix, not the current code
> one -- see other review comment for pg_publication.h). Are there any
> problems doing it that way?
>

code not part of patch.

> ======
> src/bin/pg_dump/t/002_pg_dump.pl
>
> 9.
> create_sql => 'CREATE PUBLICATION pub2
> FOR ALL TABLES
> - WITH (publish = \'\');',
> + WITH (publish = \'\', ddl = \'\');',
> regexp => qr/^
> \QCREATE PUBLICATION pub2 FOR ALL TABLES WITH (publish = '');\E
>
> 9a.
> I was not sure of the purpose of this test. Is it for showing that
> ddl='' is equivalent to not having any ddl option?
>

Yes.

> ~
>
> 9b.
> Missing test cases for dumping other values? e.g. ddl='table'
>

Test cases in later patches.

> ======
> src/include/tcop/cmdtag.h
>
> 19.
> typedef enum CommandTag
> {
> #include "tcop/cmdtaglist.h"
> COMMAND_TAG_NEXTTAG
> } CommandTag;
>
> I know it is not part of this patch, but IMO it will be an improvement
> to rename that last enum (COMMAND_TAG_NEXTTAG) to a name like
> NUM_COMMAND_TAGS. This enum wasn't used much before, but now in this
> patch, you are using it within the new function like
> GetCommandTagsForDDLRepl() so keeping the current enum name
> COMMAND_TAG_NEXTTAG with that usage looked strange.
>
> Alternatively, leave this alone but change GetCommandTagsForDDLRepl()
> so that it does not even refer to this enum value. See other review
> comment #5b

Leaving this as of now, as its not part of patch.

> ======
> src/test/regress/expected/publication.out
>
> 21.
> The \dRp+ now exposes a new column called "Table DDLS"
>
> I expected to see some tests for t/f values but I did not find any
> test where the expected output for this column was 't'.
>

More tests to be added in later patches.

On Mon, Apr 17, 2023 at 10:17 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Here are some more review comments for the patch 0002-2023_04_07-2
>
> Note: This is a WIP review (part 2); the comments in this post are
> only for the commit message and the PG docs

> ~~~
>
> 12.
> + <para>
> + DDL replication is disabled by default, it can be enabled at
> different levels
> + using the ddl PUBLICATION option. This option currently has one
> level and are
> + only allowed to be set if the PUBLICATION is FOR ALL TABLES or
> FOR TABLES IN SCHEMA.
> + </para>
> ~
>
> 12b.
> There should be more documentation for the ddl parameter on the CREATE
> PUBLICATION docs page and this should link to it.
>
> ~
>
> 12c.
> There should also be cross-refs to the "FOR ALL TABLES" and "FOR ALL
> TABLES IN SCHEMA" xrefs. See other LR SGML documentation for how we
> did all this recently.
>
> ~
>
> 13c.
> IMO should also be an example using CREATE PUBLICATION
>
> ~~~
>
> 14.
> + <para>
> + The DDL commands supported for logical replication are listed
> in the following
> + matrix. Note that global commands can be executed at any
> database and are currently
> + not supported for replication, global commands include ROLE
> statements, Database
> + statements, TableSpace statements and some of the
> GrantStmt/RevokeStmt if the target
> + object is a global object. Temporary and unlogged objects will
> not be replicated.
> + User should take care of creating these objects as these
> objects might be required
> + by the objects that are replicated (for example creation of
> tables that might
> + refer to an user created tablespace will fail in the subscriber
> if the user
> + created tablespaces are not created in the subscriber).
> + </para>

> ~
>
> 14b
> I felt that the whole "Note that..." might warrant actually being in
> some <note> SGML tag, so it renders as a proper note.
>
> ~~~
>
> 15.
> + <table id="ddl-replication-by-command-tag">
> + <title>DDL Replication Support by Command Tag</title>
> + <tgroup cols="3">
> + <colspec colname="col1" colwidth="2*"/>
> + <colspec colname="col2" colwidth="1*"/>
> + <colspec colname="col3" colwidth="1*"/>
> + <thead>
> + <row>
> + <entry>Command Tag</entry>
> + <entry>For Replication</entry>
> + <entry>Notes</entry>
> + </row>
> + </thead>
>
> 15a
> IMO this table will be more informative if the 2nd column is renamed
> to be "ddl = 'table'", then in future you can just add more columns
> when there are different values for that option.
>
> ~~~
>
> 17.
> + The DDL deparser exposes two SQL functions:
> + <itemizedlist>
>
> I imagine that these SQL functions should be documented elsewhere as well.
>
> Possibly on this page?
> https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-REPLICATION

The details of the syntax are still under discussion, and these will be
considered later.

On Wed, Apr 19, 2023 at 6:27 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Here are some more WIP review comments for the patch 0002-2023_04_07-2
>
> This is a WIP review in parts because the patch was quite large, so it
> is taking a while...
>
> WIP part 1 [1] was posted 17/4.
> WIP part 2 [2] was posted 17/4.
>
> This is WIP part 3
>
> ======
> doc/src/sgml/logical-replication.sgml
>
> 99.
> + <table id="ddl-replication-by-command-tag">
> + <title>DDL Replication Support by Command Tag</title>
>
> This table is excessively long. I was thinking it might present the
> information more simply just by showing the interesting rows that DO
> support the replication, and have one final table row called "All
> other commands" that do NOT support the DDL replication.
>

Will update this in a later version.

> ~~
>
> 2.
> void
> logicalddlmsg_desc(StringInfo buf, XLogReaderState *record)
> {
> char *rec = XLogRecGetData(record);
> uint8 info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
>
> if (info == XLOG_LOGICAL_DDL_MESSAGE)
> {
> xl_logical_ddl_message *xlrec = (xl_logical_ddl_message *) rec;
> char *prefix = xlrec->message;
> char *message = xlrec->message + xlrec->prefix_size;
> char *sep = "";
>
> Assert(prefix[xlrec->prefix_size] != '\0');
>
> ~
>
> Something is a bit fishy with this Assert. See ddlmessage.h the
> comment says that the prefix size inclide the \0.
>
> So a prefix of "abc" and a payload of "ppp" would
> - Have a prefix_size 4
> - Have a prefix + message like "abc\0ppp"
>
> So IMO this Assert would made more sense written same as it was in the
> file logicalmsg.c
> Assert(prefix[xlrec->prefix_size - 1] == '\0');

The array index is already length + 1 as indices start from 0.

>
> And, if you also wanted to assert that there is some payload present
> then IMO you can do that better like:
> Assert(xlrec->message_size && *message);
>

kept it the same.

> ~~
>
> 3. logicalddlmsg_identify
>
> +const char *
> +logicalddlmsg_identify(uint8 info)
> +{
> + if ((info & ~XLR_INFO_MASK) == XLOG_LOGICAL_DDL_MESSAGE)
> + return "DDL";
> +
> + return NULL;
> +}
>
> I suspect there might be some inconsistencies. IIRC there were some
> other parts of the code (from my previous WIP reviews) that refer to
> these as "DDL MESSAGE", not just "DDL". I’m not sure which of those
> names you want, but I think it is better that they are all consistent
> no matter which code is naming them.
>

Have changed to just DDL

> ~~~
>
> 6. parse_publication_options
>
> + char *ddl_level;
> + List *ddl_list;
> + ListCell *lc3;
>
> IMO these are not good variable names.

> lc3 --> lc (because why lc3? it is not even in the same scope as the
> lc2 which I think was also a poor name)

Since in the outer loop of this code we use lc, I think it would be better to
avoid using the same name variable internally.
So renamed 'lc3' to 'lc2' to mark the nesting level.

> ~~~
>
> 7. parse_publication_options
>
> + *ddl_type_given = true;
> + ddl_level = defGetString(defel);
>
> It is curious that this patch added a strdup() for the similar code in
> the 'publish' option code, but do not do so here (??)
>

previous code changed.

> ~~~
>
> 9. GetTransformWhereClauses
>
> +/*
> + * Helper function to tranform a where clause.
> + *
> + * Also check the publication row filter expression and throw an error if
> + * anything not permitted or unexpected is encountered.
> + */
> +static Node *
> +GetTransformWhereClauses(const char *queryString, Relation relation,
> + Node *whereClause, bool check_expr)
>
> 9a.
> AFAICT this is a code refactoring just to make the caller
> (TransformPubWhereClauses) simpler by moving some inline code to a
> separate static function. But, I did not see how this refactoring
> should be part of this patch.
>

This has been removed in previous version.

On Fri, Apr 21, 2023 at 10:26 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Here are some more review comments for the patch 0002-2023_04_07-2
>
> This was a WIP review in parts because the patch was quite large:
>
> WIP part 1 [1] was posted 17/4.
> WIP part 2 [2] was posted 17/4.
> WIP part 3 [3] was posted 19/4.
> WIP part 4 is this post. (This is my final WIP part for this 0002 patch)
>
> ======
> contrib/test_decoding/sql/ddl.sql
>
> 1.
> +SELECT 'ddl msg2' FROM pg_logical_emit_ddl_message('ddl msg2', 16394,
> 1, '{"fmt": "CREATE SCHEMA %{if_not_exists}s %{name}I
> %{authorization}s", "name": "foo", "authorization": {"fmt":
> "AUTHORIZATION %{authorization_role}I", "present": false,
> "authorization_role": null}, "if_not_exists": ""}');
>
> I wasn't entirely sure what are these tests showing. It seems to do
> nothing but hardwire a bunch of random stuff and then print it out
> again. Am I missing something?
>
> And are those just bogus content payloads? Maybe they are valid JSON
> but AFAICT nobody is using them. What is the advantage of using this
> bogus payload data instead of just a simple string like "DDL message
> content goes here"?
>

the idea is to test if any code changes change this JSON and if so the test fails.

> ======
> contrib/test_decoding/test_decoding.c
>
> 2. _PG_output_plugin_init
>
> cb->message_cb = pg_decode_message;
> + cb->ddl_cb = pg_decode_ddl_message;
> cb->filter_prepare_cb = pg_decode_filter_prepare;
>
> Where is the 'stream_ddl_cb' to match this?
>

Will add in a later version.

> ~~~
>
> 6. publication_deparse_ddl_command_start
>
> +/*
> + * Deparse the ddl command and log it prior to
> + * execution. Currently only used for DROP TABLE command
> + * so that catalog can be accessed before being deleted.
> + * This is to check if the table is part of the publication
> + * or not.
> + */
> +Datum
> +publication_deparse_ddl_command_start(PG_FUNCTION_ARGS)
> +{
> + EventTriggerData *trigdata;
> + char *command = psprintf("Drop table command start");
>
> Since information about this only being for DROP is hardcoded and in
> the function comment, shouldn't this whole function be renamed to
> something DROP-specific? e.g
> publication_deparse_ddl_drop_command_start()
>

Not changing this, now its used only for drop, but the callback is for all
"command starts".

> ~~~
>
> 11. publication_deparse_ddl_command_end
>
> + if (cmd->type == SCT_Simple &&
> + !OidIsValid(cmd->d.simple.address.objectId))
> + continue;
> +
> + if (cmd->type == SCT_AlterTable)
> + {
> + relid = cmd->d.alterTable.objectId;
> + type = DCT_TableAlter;
> + }
> + else
> + {
> + /* Only SCT_Simple for now */
> + relid = cmd->d.simple.address.objectId;
> + type = DCT_SimpleCmd;
> + }
>
> This code seemed structured a bit strangely to me; The comment /* Only
> SCT_Simple for now */ appears to be expecting something that may not
> be guaranteed. Perhaps the below-suggested code is closer to what was
> intended?
>
> SUGGESTION (should it be like this?)
>
> if (cmd->type == SCT_AlterTable)
> {
> relid = cmd->d.alterTable.objectId;
> type = DCT_TableAlter;
> }
> else
> {
> /* Only SCT_Simple for now */
> if (cmd->type != SCT_Simple)
> continue;
>
> if (!OidIsValid(cmd->d.simple.address.objectId))
> continue;
> relid = cmd->d.simple.address.objectId;
> type = DCT_SimpleCmd;
> }
>

Keeping this the same for now.

> ~~~
>
> 21. apply_handle_ddl
>
> + commandTag = CreateCommandTag((Node *) command);
> +
> + /* If we got a cancel signal in parsing or prior command, quit */
> + CHECK_FOR_INTERRUPTS();
> +
> + /* Remove data population from the command */
> + preprocess_create_table(command);
>
> There seems to be an assumption here that the only kind of command
> processed here would be TABLE related. Maybe that is currently true,
> but shouldn't there be some error checking just to make sure it cannot
> execute unexpected commands?

That check is inside the function, no need to duplicate it outside as it
requires extracting commandtag.

> ~~~
>
> 26. init_txn_data/clean_txn_data
>
> Hmm, this refactoring to isolate the alloc/free of this private data
> and to delegate to these new functions from a number of places looked
> to me more like a bug-fix which is not really related to the DDL
> replication. I guess what has happened is that when more information
> (field 'deleted_relids') was added to the PGOutputTxnData it exposed
> this problem more visibly (??)
>
> To summarize, I thought all this stuff about
> init_txn_data/clean_txn_data refactoring should probably be removed
> from this patch and instead pushed as a separate bug fix to HEAD.
>
> What do you think?
>

Not sure about this, these functions are only useful in the current patch.

> ~~~
>
> 28. pgoutput_change
>
> + if (table_rewrite)
> + RelationClose(relation);
> +
>
> Something doesn't seem right. AFAICT this cleanup code has been added
> to match the new code at the top of the function, where the "actual
> relation" was fetched.
>
> Meanwhile, there are also some other return points where
> 'table_rewrite' is true:
> e.g.
> if (table_rewrite && !relentry->pubactions.pubddl_table)
> return;
>
> So why is there no RelationClose(relation) for those other returns?
>

this has been rewritten. code does not match comments.

> ~~~
>
> 33. reload_publications
>
> +/* Reload publications if needed. */
> +static void
> +reload_publications(PGOutputData *data)
> +{
> + MemoryContext oldctx;
> +
> + if (!publications_valid)
> + {
> + oldctx = MemoryContextSwitchTo(CacheMemoryContext);
> + if (data->publications)
> + {
> + list_free_deep(data->publications);
> + data->publications = NIL;
> + }
> + data->publications = LoadPublications(data->publication_names);
> + MemoryContextSwitchTo(oldctx);
> + publications_valid = true;
> + }
> +}
> +
> +
>
> 33a.
> AFAICT this appears to be a general cleanup refactoring that is not
> really related to the DDL replication patch. So I felt this can be
> removed from this patch and applied as a separate patch to HEAD.

This functions was used in a later patch for INDEX replication,
so removed it for now.

Other comments have been addressed.
Attach the new patch set.

Best Regards
Hou zj

Attachment Content-Type Size
ddl-replication-2023_05_02.tar.gz application/x-gzip 148.6 KB

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message shveta malik 2023-05-02 03:00:51 Re: Support logical replication of DDLs
Previous Message Celso Lorenzetti 2023-05-01 22:47:34 Compiling postgres for windows with src/tools/msvc.build.bat

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2023-05-02 02:50:26 Re: Autogenerate some wait events code and documentation
Previous Message Michael Paquier 2023-05-02 02:17:50 Re: Improve logging when using Huge Pages