Re: Support logical replication of DDLs

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Zheng Li <zhengli10(at)gmail(dot)com>
Cc: Ajin Cherian <itsajin(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Support logical replication of DDLs
Date: 2022-11-11 05:09:04
Message-ID: CAHut+PtHvH66uvih23Rm4Ajm2suHEVv1W5NJ1Y8tsQrn=u9qhQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Fri, Nov 11, 2022 at 3:47 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Here are more review comments for the v32-0001 file ddl_deparse.c
>
> *** NOTE - my review post became too big, so I split it into smaller parts.

THIS IS PART 2 OF 4.

=======

src/backend/commands/ddl_deparse.c

1. deparse_AlterExtensionStmt

+/*
+ * Deparse an AlterExtensionStmt (ALTER EXTENSION .. UPDATE TO VERSION)
+ *
+ * Given an extension OID and a parse tree that modified it, return an ObjTree
+ * representing the alter type.
+ */
+static ObjTree *
+deparse_AlterExtensionStmt(Oid objectId, Node *parsetree)

Spurious blank space before "OID"

~

2.

+ ObjTree *stmt;
+ ObjTree *tmp;
+ List *list = NIL;
+ ListCell *cell;

Variable 'tmp' can be declared only in the scope that it is used.

~

3.

+ foreach(cell, node->options)
+ {
+ DefElem *opt = (DefElem *) lfirst(cell);
+
+ if (strcmp(opt->defname, "new_version") == 0)
+ {
+ tmp = new_objtree_VA("TO %{version}L", 2,
+ "type", ObjTypeString, "version",
+ "version", ObjTypeString, defGetString(opt));
+ list = lappend(list, new_object_object(tmp));
+ }
+ else
+ elog(ERROR, "unsupported option %s", opt->defname);
+ }

This code seems strange to be adding new versions to a list. How can
there be multiple new versions? It does not seem compatible with the
command syntax [1]

------

4. deparse_CreateCastStmt

+ initStringInfo(&func);
+ appendStringInfo(&func, "%s(",
+ quote_qualified_identifier(get_namespace_name(funcForm->pronamespace),
+ NameStr(funcForm->proname)));
+ for (i = 0; i < funcForm->pronargs; i++)
+ appendStringInfoString(&func,
+ format_type_be_qualified(funcForm->proargtypes.values[i]));
+ appendStringInfoChar(&func, ')');

Is this correct, or should there be some separators (e.g. commas)
between multiple arg-types?

------

5. deparse_AlterDefaultPrivilegesStmt

+
+static ObjTree *
+deparse_AlterDefaultPrivilegesStmt(CollectedCommand *cmd)

Missing function comment

~

6.

+ schemas = lappend(schemas,
+ new_string_object(strVal(val)));

Unnecessary wrapping.

~

7.

+ /* Add the IN SCHEMA clause, if any */
+ tmp = new_objtree("IN SCHEMA");
+ append_array_object(tmp, "%{schemas:, }I", schemas);
+ if (schemas == NIL)
+ append_bool_object(tmp, "present", false);
+ append_object_object(alterStmt, "%{in_schema}s", tmp);
+
+ /* Add the FOR ROLE clause, if any */
+ tmp = new_objtree("FOR ROLE");
+ append_array_object(tmp, "%{roles:, }R", roles);
+ if (roles == NIL)
+ append_bool_object(tmp, "present", false);
+ append_object_object(alterStmt, "%{for_roles}s", tmp);

I don't really understand why the logic prefers to add a whole new
empty tree with "present: false" versus just adding nothing at all
unless it is relevant.

~

8.

+ if (stmt->action->is_grant)
+ grant = new_objtree("GRANT");
+ else
+ grant = new_objtree("REVOKE");
+
+ /* add the GRANT OPTION clause for REVOKE subcommand */
+ if (!stmt->action->is_grant)
+ {
+ tmp = new_objtree_VA("GRANT OPTION FOR",
+ 1, "present", ObjTypeBool,
+ stmt->action->grant_option);
+ append_object_object(grant, "%{grant_option}s", tmp);
+ }

That 2nd 'if' can just be combined with the 'else' logic of the prior if.

~

9.

+ Assert(priv->cols == NIL);
+ privs = lappend(privs,
+ new_string_object(priv->priv_name));

Unnecessary wrapping.

------

10. deparse_AlterTableStmt

Maybe this function name should be different because it is not only
for TABLEs but also serves for INDEX, VIEW, TYPE, etc

~

11.

AFAICT every case in the switch (subcmd->subtype) is doing subcmds =
lappend(subcmds, new_object_object(tmpobj));

Just doing this in common code at the end might be an easy way to
remove ~50 lines of duplicate code.

------

12. deparse_ColumnDef

+ * NOT NULL constraints in the column definition are emitted directly in the
+ * column definition by this routine; other constraints must be emitted
+ * elsewhere (the info in the parse node is incomplete anyway.).
+ */
+static ObjTree *
+deparse_ColumnDef(Relation relation, List *dpcontext, bool composite,
+ ColumnDef *coldef, bool is_alter, List **exprs)

"anyway.)." -> "anyway)."

~

13.

+ /* USING clause */
+ tmpobj = new_objtree("COMPRESSION");
+ if (coldef->compression)
+ append_string_object(tmpobj, "%{compression_method}I", coldef->compression);
+ else
+ {
+ append_null_object(tmpobj, "%{compression_method}I");
+ append_bool_object(tmpobj, "present", false);
+ }

Why is it necessary to specify a NULL compression method if the entire
"COMPRESSION" is anyway flagged as present=false?

~

14.

+ foreach(cell, coldef->constraints)
+ {
+ Constraint *constr = (Constraint *) lfirst(cell);
+
+ if (constr->contype == CONSTR_NOTNULL)
+ saw_notnull = true;
+ }

Why not break immediately from this loop the first time you find
'saw_notnull' true?

~~~

15.

+ tmpobj = new_objtree("DEFAULT");
+ if (attrForm->atthasdef)
+ {
+ char *defstr;
+
+ defstr = RelationGetColumnDefault(relation, attrForm->attnum,
+ dpcontext, exprs);
+
+ append_string_object(tmpobj, "%{default}s", defstr);
+ }
+ else
+ append_bool_object(tmpobj, "present", false);
+ append_object_object(column, "%{default}s", tmpobj);

Something seems a bit strange here. It looks like there are formats
called "%{default}s" at 2 levels in this tree, so will it cause a
hierarchy of objects with the same name?

------

16. deparse_ColumnIdentity

+ column = new_objtree("");
+
+ if (!OidIsValid(seqrelid))
+ {
+ append_bool_object(column, "present", false);
+ return column;
+ }

I don't really understand the point of making empty tree structures
for not "present" elements. IIUC this is just going to make the tree
bigger for no reason and all these not "present" branches will be
ultimately thrown away, right? I guess the justification is that it
might be for debugging/documentation but that does not really stand up
in this case because it seems like just a nameless tree here.

------

17. deparse_CreateDomain

+ createDomain = new_objtree("CREATE");
+
+ append_object_object(createDomain,
+ "DOMAIN %{identity}D AS",
+ new_objtree_for_qualname_id(TypeRelationId,
+ objectId));
+ append_object_object(createDomain,
+ "%{type}T",
+ new_objtree_for_type(typForm->typbasetype, typForm->typtypmod));
+
+ if (typForm->typnotnull)
+ append_string_object(createDomain, "%{not_null}s", "NOT NULL");
+ else
+ append_string_object(createDomain, "%{not_null}s", "");

17a.
I don't understand why this is not just a single _VA() call instead of
spread over multiple append_objects like this.

~

17b.
In other places, something like the "%{not_null}s" is done with a
ternary operator instead of the excessive if/else.

------

18. deparse_CreateFunction

+ if (isnull)
+ probin = NULL;
+ else
+ {
+ probin = TextDatumGetCString(tmpdatum);
+ if (probin[0] == '\0' || strcmp(probin, "-") == 0)
+ probin = NULL;
+ }

Maybe it is simpler to assign prbin = NULL where it is declared, then
here you only need to test the !isnull case.

~

19.

+ append_string_object(createFunc, "%{or_replace}s",
+ node->replace ? "OR REPLACE" : "");

It is not clear to me what is the point of such code - I mean if
node->replace is false why do append at all? ... Why not use
appen_format_string() instead()?
My guess is that this way is preferred to simplify the calling code,
but knowing that a "" value will just do nothing anyway - seems an
overcomplicated way to do it though.

~

20.

+ typarray = palloc(list_length(node->parameters) * sizeof(Oid));
+ if (list_length(node->parameters) > procForm->pronargs)
+ {
+ Datum alltypes;
+ Datum *values;
+ bool *nulls;
+ int nelems;
+
+ alltypes = SysCacheGetAttr(PROCOID, procTup,
+ Anum_pg_proc_proallargtypes, &isnull);
+ if (isnull)
+ elog(ERROR, "NULL proallargtypes, but more parameters than args");
+ deconstruct_array(DatumGetArrayTypeP(alltypes),
+ OIDOID, 4, 't', 'i',
+ &values, &nulls, &nelems);
+ if (nelems != list_length(node->parameters))
+ elog(ERROR, "mismatched proallargatypes");
+ for (i = 0; i < list_length(node->parameters); i++)
+ typarray[i] = values[i];
+ }
+ else
+ {
+ for (i = 0; i < list_length(node->parameters); i++)
+ typarray[i] = procForm->proargtypes.values[i];
+ }

The list_length(node->parameters) is used multiple times here; it
might have been cleaner code to assign that to some local variable.

~

21.

+ * Note that %{name}s is a string here, not an identifier; the reason
+ * for this is that an absent parameter name must produce an empty
+ * string, not "", which is what would happen if we were to use
+ * %{name}I here. So we add another level of indirection to allow us
+ * to inject a "present" parameter.
+ */

The above comment says:
must produce an empty string, not ""

I didn't get the point - what is the difference between an empty string and ""?

~

22.

+ append_string_object(paramobj, "%{mode}s",
+ param->mode == FUNC_PARAM_IN ? "IN" :
+ param->mode == FUNC_PARAM_OUT ? "OUT" :
+ param->mode == FUNC_PARAM_INOUT ? "INOUT" :
+ param->mode == FUNC_PARAM_VARIADIC ? "VARIADIC" :
+ "IN");

There doesn't seem to be much point to test for param->mode ==
FUNC_PARAM_IN here since "IN" is the default mode anyway.

~

23.

+ name = new_objtree("");
+ append_string_object(name, "%{name}I",
+ param->name ? param->name : "NULL");
+
+ append_bool_object(name, "present",
+ param->name ? true : false);

IIUC it is uncommon to inject a "present" object if it was "true", so
why do it like that here?

~

24.

+ append_format_string(tmpobj, "(");
+ append_array_object(tmpobj, "%{arguments:, }s", params);
+ append_format_string(tmpobj, ")");

Is it necessary to do that in 3 lines? IIUC it would be the same if
the parens were just included in the append_array_object format,
right?

~

25.

+ if (procForm->prosupport)
+ {
+ Oid argtypes[1];
+
+ /*
+ * We should qualify the support function's name if it wouldn't be
+ * resolved by lookup in the current search path.
+ */
+ argtypes[0] = INTERNALOID;

Might as well just declare this as:

Oid argtypes[] = { INTERNALOID };

------

26. deparse_CreateOpClassStmt

+
+ stmt = new_objtree_VA("CREATE OPERATOR CLASS %{identity}D", 1,
+ "identity", ObjTypeObject,
+ new_objtree_for_qualname(opcForm->opcnamespace,
+ NameStr(opcForm->opcname)));
+
+ /* Add the DEFAULT clause */
+ append_string_object(stmt, "%{default}s",
+ opcForm->opcdefault ? "DEFAULT" : "");
+
+ /* Add the FOR TYPE clause */
+ append_object_object(stmt, "FOR TYPE %{type}T",
+ new_objtree_for_type(opcForm->opcintype, -1));
+
+ /* Add the USING clause */
+ append_string_object(stmt, "USING %{amname}I",
get_am_name(opcForm->opcmethod));

This can all be done just as a single VA call I think.

~

27.

+ append_format_string(tmpobj, "(");
+ append_array_object(tmpobj, "%{argtypes:, }T", arglist);
+ append_format_string(tmpobj, ")");

AFAIK this can just be done by a single call including the parens in
the format string of appen_array_object.

------

28. deparse_CreatePolicyStmt

+
+static ObjTree *
+deparse_CreatePolicyStmt(Oid objectId, Node *parsetree)

Missing function comment.

~

29.

+ /* Add the rest of the stuff */
+ add_policy_clauses(policy, objectId, node->roles, !!node->qual,
+ !!node->with_check);

The !! to cast the pointer parameter to boolean is cute, but IIUC that
is not commonly used in the PG source. Maybe it is more conventional
to just pass node->qual != NULL etc?

------

30. deparse_AlterPolicyStmt

+
+static ObjTree *
+deparse_AlterPolicyStmt(Oid objectId, Node *parsetree)

Missing function comment.

~

31.

+ /* Add the rest of the stuff */
+ add_policy_clauses(policy, objectId, node->roles, !!node->qual,
+ !!node->with_check);

The !! to cast the pointer parameter to boolean is cute, but IIUC that
technique is not commonly used in the PG source. Maybe it is more
conventional to just pass node->qual != NULL etc?

------

32. deparse_CreateSchemaStmt

+ else
+ {
+ append_null_object(auth, "%{authorization_role}I ");
+ append_bool_object(auth, "present", false);
+ }

32a.
Why append a NULL object if the "present" says it is false anyway?

~

32b.
"%{authorization_role}I " -- why do they have extra space on the end?
Just let the append_XXX functions can take care of the space
separators automagically instead.

------

33. deparse_AlterDomainStmt

+ {
+ fmt = "ALTER DOMAIN";
+ type = "drop default";
+ alterDom = new_objtree_VA(fmt, 1, "type", ObjTypeString, type);

This code style of assigning the 'fmt' and 'type' like this is not
typical of all the other deparse_XXX functions which just pass
parameter literals. Also, I see no good reason that the 'fmt' is
unconditionally assigned to "ALTER DOMAIN" in 6 different places.

~

34.

AFAICT all these cases can be simplified to use single VA() calls and
remove all the append_XXX.

~

35.

+
+ break;
+ case 'N':

Spurious or misplaced blank line.

~

36.

+ case 'C':
+
+ /*
+ * ADD CONSTRAINT. Only CHECK constraints are supported by
+ * domains
+ */

A spurious blank line is inconsistent with the other cases.

~

36.

+
+ break;
+ default:

Spurious or misplaced blank line.

------

37. deparse_CreateStatisticsStmt

+ append_format_string(createStat, "FROM");
+
+ append_object_object(createStat, "%{stat_table_identity}D",
+ new_objtree_for_qualname(get_rel_namespace(statform->stxrelid),
+ get_rel_name(statform->stxrelid)));

It would be easier to do things like this using a single call using a
format of "FROM %{stat_table_identity}D", rather than have the extra
append_format_string call.

------

38. deparse_CreateForeignServerStmt

+ /* Add a TYPE clause, if any */
+ tmp = new_objtree_VA("TYPE", 0);
+ if (node->servertype)
+ append_string_object(tmp, "%{type}L", node->servertype);
+ else
+ append_bool_object(tmp, "present", false);
+ append_object_object(createServer, "%{type}s", tmp);
+
+ /* Add a VERSION clause, if any */
+ tmp = new_objtree_VA("VERSION", 0);

Why use the VA() function if passing 0 args?

~

39.

+ append_string_object(createServer, "FOREIGN DATA WRAPPER %{fdw}I",
node->fdwname);
+ /* add an OPTIONS clause, if any */
+ append_object_object(createServer, "%{generic_options}s",
+ deparse_FdwOptions(node->options, NULL));

39a.
Use uppercase comment.

~

39b.
Missing blank line above comment?

------

40. deparse_AlterForeignServerStmt

+ /* Add a VERSION clause, if any */
+ tmp = new_objtree_VA("VERSION", 0);

Why use the VA() function if passing 0 args?

~

41.

+ /* Add a VERSION clause, if any */
+ tmp = new_objtree_VA("VERSION", 0);
+ if (node->has_version && node->version)
+ append_string_object(tmp, "%{version}L", node->version);
+ else if (node->has_version)
+ append_string_object(tmp, "version", "NULL");
+ else
+ append_bool_object(tmp, "present", false);
+ append_object_object(alterServer, "%{version}s", tmp);
+
+ /* Add a VERSION clause, if any */
+ tmp = new_objtree_VA("VERSION", 0);
+ if (node->has_version && node->version)
+ append_string_object(tmp, "%{version}L", node->version);
+ else if (node->has_version)
+ append_string_object(tmp, "version", "NULL");
+ else
+ append_bool_object(tmp, "present", false);
+ append_object_object(alterServer, "%{version}s", tmp);

Huh? Looks like a cut/paste error of duplicate VERSION clauses. Is this correct?

------

42. deparse_CreateStmt

+ if (tableelts == NIL)
+ {
+ tmpobj = new_objtree("");
+ append_bool_object(tmpobj, "present", false);
+ }
+ else
+ tmpobj = new_objtree_VA("(%{elements:, }s)", 1,
+ "elements", ObjTypeArray, tableelts);

This fragment seemed a bit complicated. IIUC this is the same as just:

tmpobj = new_objtree("");
if (tableelts)
append_array_object(tmpobj, "(%{elements:, }s)", tableelts);
else
append_bool_object(tmpobj, "present", false);

~

43.

+ tmpobj = new_objtree("INHERITS");
+ if (list_length(node->inhRelations) > 0)
+ append_array_object(tmpobj, "(%{parents:, }D)",
deparse_InhRelations(objectId));
+ else
+ {
+ append_null_object(tmpobj, "(%{parents:, }D)");
+ append_bool_object(tmpobj, "present", false);
+ }
+ append_object_object(createStmt, "%{inherits}s", tmpobj);

43a.
AFAIK convention for checking non-empty List is just "if
(node->inhRelations != NIL)" or simply "if (node->inhRelations)

~

43b.
Maybe I misunderstand something but I don't see why append_null_object
is needed for tree marked as "present"=false anyhow. This similar
pattern happens multiple times in this function.

------

44. deparse_DefineStmt

+ switch (define->kind)
+ {

IMO better to put all these OBJECT_XXX cases in alphabetical order
instead of just random.

~

45.

+ default:
+ elog(ERROR, "unsupported object kind");
+ }

Should this also log what the define->kind was attempted?

------

46. deparse_DefineStmt_Collation

+ stmt = new_objtree_VA("CREATE COLLATION", 0);
+
+ append_object_object(stmt, "%{identity}D",
+ new_objtree_for_qualname(colForm->collnamespace,
+ NameStr(colForm->collname)));

Why not combine there to avoid VA args with 0 and use VA args with 1 instead?

~

47.

+ if (fromCollid.objectId != InvalidOid)

Use OisIsValid macro.

~

48.

+ append_object_object(stmt, "FROM %{from_identity}D",
+ new_objtree_for_qualname(fromColForm->collnamespace,
+ NameStr(fromColForm->collname)));
+
+
+ ReleaseSysCache(tp);
+ ReleaseSysCache(colTup);
+ return stmt;

Extra blank line.

~

49.

+ if (!isnull)
+ {
+ tmp = new_objtree_VA("LOCALE=", 1,
+ "clause", ObjTypeString, "locale");
+ append_string_object(tmp, "%{locale}L",
+ psprintf("%s", TextDatumGetCString(datum)));

IMO it should be easy enough to express this using a single VA(2 args)
function, so avoiding the extra append_string. e.g. other functions
like deparse_DefineStmt_Operator do this.

And this same comment also applies to the rest of this function:
- tmp = new_objtree_VA("LC_COLLATE=", 1,
- tmp = new_objtree_VA("LC_CTYPE=", 1,
- tmp = new_objtree_VA("PROVIDER=", 1,
- tmp = new_objtree_VA("PROVIDER=", 1,
- tmp = new_objtree_VA("DETERMINISTIC=", 1,
- tmp = new_objtree_VA("VERSION=", 1,

------
[1] ALTER EXTENSION - https://www.postgresql.org/docs/15/sql-alterextension.html

Kind Regards.
Peter Smith
Fujitsu Australia

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Peter Smith 2022-11-11 05:17:58 Re: Support logical replication of DDLs
Previous Message Peter Smith 2022-11-11 04:47:24 Re: Support logical replication of DDLs

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2022-11-11 05:09:06 Re: Typo about subxip in comments
Previous Message Michael Paquier 2022-11-11 04:53:59 Re: Avoid overhead open-close indexes (catalog updates)