Re: Support logical replication of DDLs

From: Ajin Cherian <itsajin(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Zheng Li <zhengli10(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Support logical replication of DDLs
Date: 2022-12-19 13:19:14
Message-ID: CAFPTHDYqL+65ApSZWBmFRpc4aeYaW3UAeGhyYWOQJNx1eAQ+og@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Mon, Oct 31, 2022 at 7:07 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Here are some more comments for the patch v32-0001, file:
> src/backend/commands/ddl_deparse.c
>
> This is a WIP, it being such a large file...
>
> ======
>
> 1. General - comments
>
> For better consistency, I suggest using uppercase for all the
> single-line comments in the function bodies.
>
> There are multiple of them - I am not going to itemize them all in
> this post. Please just search/replace all of them
>
> e.g.
> /* add the "ON table" clause */
> /* add the USING clause, if any */
> /* add the USING clause, if any */
>

fixed.

> ~~~
>
> 2. General - object names
>
> There is a bit of inconsistency with the param object names where
> there are multi-words.
>
> Some have underscore (e.g. "is_public", "subtype_diff", "if_not_exists", etc)...
> Many others do not (e.g. "schemaname", "objname", "rolename", etc)...
>
> IMO it would be better to use a consistent naming convention - e,g,
> maybe use '_' *everywhere*
>
>

fixed.

> ~~~
>
> 3. ObjTree
>
> +typedef struct ObjTree
> +{
> + slist_head params; /* Object tree parameters */
> + int numParams; /* Number of parameters in the object tree */
> + StringInfo fmtinfo; /* Format string of the ObjTree */
> + bool present; /* Indicates if boolean value should be stored */
> +} ObjTree;
>
> It seems that this member is called "parameters" in the sense that
> each of these params are destined to be substition-params of for the
> format string part of this struct.
>
> OK. That seems sensible here, but this 'parameter' terminology infests
> this whole source file. IIUC really much of the code is dealing with
> just JSON objects -- they don't become parameters until those objects
> get added into the params list of this structure. Basically, I felt
> the word 'parameter' in comments and the variables called 'param' in
> functions seemed a bit overused...
>
> ~~~
>
> 4. ObjElem
>
> + slist_node node; /* Used in converting back to ObjElem
> + * structure */
> +} ObjElem;
>
> At face value (and without yet seeing the usage), that comment about
> 'node' does not mean much. e.g. this is already an 'ObjElem' struct...
> (??)
>
> ~~~
>
> 5. verbose
>
> +/*
> + * Reduce some unncessary string from the output json stuff when verbose
> + * and "present" member is false. This means these strings won't be merged into
> + * the last DDL command.
> + */
> +bool verbose = true;
>
> The comment needs some rewording to explain what this is about more
> clearly and without the typos
>
> "Reduce some unncessary string from the output json stuff" ???
>
> ~~~

fixed.

>
> 6. add_policy_clauses
>
> + else
> + {
> + append_bool_object(policyStmt, "present", false);
> + }
>
> Something seems strange. Probably I'm wrong but just by code
> inspection it looks like there is potential for there to be multiple
> param {present:false} JSON objects:
>
> {"present" :false},
> {"present" :false},
> {"present" :false},
>
> Shouldn't those all be array elements or something? IIUC apart from
> just DDL, the JSON idea was going to (in future) allow potential
> machine manipulation of the values prior to the replication, but
> having all these ambiguous-looking objects does not seem to lend
> itself to that idea readily. How to know what are each of those params
> representing?
>

This is pruned later and false objects removed.

> ~~~
>
> 7. append_array_object
>
>
> + }
> +
> + }
>
> Spurious blank line
>
> ~~
>

fixed.

> 8.
>
> + /* Extract the ObjElems whose present flag is true */
> + foreach(lc, array)
> + {
> + ObjElem *elem = (ObjElem *) lfirst(lc);
> +
> + Assert(elem->objtype == ObjTypeObject ||
> + elem->objtype == ObjTypeString);
> +
> + if (!elem->value.object->present &&
> + elem->objtype == ObjTypeObject)
> + array = foreach_delete_current(array, lc);
> + }
> +
> + }
>
> 8a.
> Is that comment correct? Or should it say more like "remove elements
> where present flag is false" ??
>
> 8b.
> It's not clear to me what is going to be the result of deleting the
> array elements that are determined not present. Will this affect the
> length of the array written to JSON? What if there is nothing left at
> all - the top of this function return if the array length is zero, but
> the bottom(after the loop) has not got similar logic.
>

fixed, added a check at the end of the loop.

> ~~~
>
> 9. append_bool_object
>
> + /*
> + * Check if the present is part of the format string and store the boolean
> + * value
> + */
> + if (strcmp(sub_fmt, "present") == 0)
>
> The comment seems not right. Looks like not testing "present" is PART
> of the format string - it is testing it IS the ENTIRE format string.
>

fixed.

> ~~~
>
> 10. append_object_to_format_string
>
> + initStringInfo(&object_name);
> + end_ptr = sub_fmt + strlen(sub_fmt);
> +
> + for (cp = sub_fmt; cp < end_ptr; cp++)
> + {
> + if (*cp == '{')
> + {
> + start_copy = true;
> + continue;
> + }
> +
> + if (!start_copy)
> + continue;
> +
> + if (*cp == ':' || *cp == '}')
> + break;
> +
> + appendStringInfoCharMacro(&object_name, *cp);
> + }
>
> Instead of this little loop why doesn't the code just look for the
> name delimiters?
>
> e.g.
> pstart = strch(sub_fmt, '{');
> pend = strbrk(pstart, ":}");
>
> then the 'name' is what lies in between...
>
> ~~~
>
> 11.
>
> format_type_detailed(Oid type_oid, int32 typemod,
> Oid *nspid, char **typname, char **typemodstr,
> bool *typarray)
>
>
> There seems a bit mixture of param prefixes of both 'typ' and 'type'.
> Is it correct? If these are changed, check also in the function
> comment.
>

fixed.

> ~~~
>
> 12.
>
> + /*
> + * Special-case crock for types with strange typmod rules where we put
> + * typmod at the middle of name(e.g. TIME(6) with time zone ). We cannot
> + * schema-qualify nor add quotes to the type name in these cases.
> + */
>
> Missing space before '(e.g.'. Extra space before ').'.
>

fixed.

> ~~~
>
> 13. FunctionGetDefaults
>
> /*
> * Return the defaults values of arguments to a function, as a list of
> * deparsed expressions.
> */
>
> "defaults values" -> "default values"
>

fixed.

regards,
Ajin Cherian
Fujitsu Australia

Attachment Content-Type Size
v53-0002-Support-DDL-replication.patch application/octet-stream 132.2 KB
v53-0005-Skip-ALTER-TABLE-subcommands-generated-for-Table.patch application/octet-stream 2.2 KB
v53-0003-Support-CREATE-TABLE-AS-SELECT-INTO.patch application/octet-stream 15.6 KB
v53-0004-Test-cases-for-DDL-replication.patch application/octet-stream 24.6 KB
v53-0001-Functions-to-deparse-DDL-commands.patch application/octet-stream 323.3 KB
v53-0006-Support-DDL-replication-of-alter-type-having-USI.patch application/octet-stream 9.0 KB
v53-0007-Introduce-the-test_ddl_deparse_regress-test-modu.patch application/octet-stream 46.7 KB

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Alvaro Herrera 2022-12-19 15:29:49 Re: Support logical replication of DDLs
Previous Message li jie 2022-12-19 10:02:17 Re: Support logical replication of DDLs

Browse pgsql-hackers by date

  From Date Subject
Next Message Isaac Morland 2022-12-19 13:44:39 Re: GROUP BY ALL
Previous Message Amit Kapila 2022-12-19 12:47:25 Re: Perform streaming logical transactions by background workers and parallel apply