Re: Support logical replication of DDLs

From: Ajin Cherian <itsajin(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Zheng Li <zhengli10(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Support logical replication of DDLs
Date: 2022-10-13 12:46:39
Message-ID: CAFPTHDZ1b2Vjnh0up+C1Nrcy=zrS3X=vnQSqFc8BuNqPoR6Jig@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Thu, Oct 6, 2022 at 7:31 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> The patches here are quite large, so for this review post, I have only
> done a quick check for cosmetic stuff in the comments of patch
> v24-0001.
>
> ~
>
> I did this mostly just by cutting/pasting the whole patch text into a
> grammar/spell-checker to see what it reported. Please use the same
> strategy prior to posting future patch versions, because it will be
> way more efficient/easier for the author to spend a few minutes to fix
> everything like this up-front before posting, rather than getting a
> flood of review comments to deal with (like this post) about such
> stuff.
>
> (BTW, most of these suggestions are just verbatim what my
> grammar/spell-checker told me)
>
> ======
>
> 1. Commit comment
>
> (Note #2) Note that, for ATTACH/DETACH PARTITION, we haven't added
> extra logic on
> subscriber to handle the case where the table on publisher is a PARTITIONED
> TABLE while the target table on subcriber side is NORMAL table. We will
> research this more and improve this later.
>
> SUGGESTION (minor changes + fix typo)
> (Note #2) Note that for ATTACH/DETACH PARTITION we haven't added extra logic on
> the subscriber to handle the case where the table on the publisher is
> a PARTITIONED
> TABLE while the target table on the subscriber is a NORMAL table. We will
> research this more and improve it later.
>
> ======
>
> 2. GENERAL - uppercase the comments
>
> Please make all your single-line comments start with uppercase for consistency.
>
> Here are some examples to fix:
>
> Line 303: + /* add the "ON table" clause */
> Line 331: + /* add the USING clause, if any */
> Line 349: + /* add the WITH CHECK clause, if any */
> Line 653: + /* otherwise, WITH TZ is added by typmod. */
> Line 663: + /* otherwise, WITH TZ is added by typmode. */
> Line 1946: + /* build list of privileges to grant/revoke */
> Line 2017: + /* target objects. We use object identities here */
> Line 2041: + /* list of grantees */
> Line 2053: + /* the wording of the grant option is variable ... */
> Line 2632: + /* skip this one; we add one unconditionally below */
> Line 2660: + /* done */
> Line 2768: + /* add HANDLER clause */
> Line 2780: + /* add VALIDATOR clause */
> Line 2792: + /* add an OPTIONS clause, if any */
> Line 2845: + /* add HANDLER clause */
> Line 2859: + /* add VALIDATOR clause */
> Line 2877: + /* add an OPTIONS clause, if any */
> Line 5024: + /* add the rest of the stuff */
> Line 5040: + /* add the rest of the stuff */
> Line 5185: + /* a new constraint has a name and definition */
> Line 5211: + /* done */
> Line 6124: + /* add a CONCURRENTLY clause */
> Line 6127: + /* add the matview name */
> Line 6131: + /* add a WITH NO DATA clause */
> Line 6302: + /* reloptions */
> Line 6310: + /* tablespace */
> Line 6592: + /* deconstruct the name list */
> Line 6636: + /* deconstruct the name list */
> Line 6725: + /* obtain object tuple */
> Line 6729: + /* obtain namespace */
>
> ------
>
> 3. Grammar/Spelling
>
> 3a. - format_type_detailed
>
> + * - nspid is the schema OID. For certain SQL-standard types which have weird
> + * typmod rules, we return InvalidOid; caller is expected to not schema-
> + * qualify the name nor add quotes to the type name in this case.
>
> "caller" -> "the caller"
>
> ~
>
> 3b. - format_type_detailed
>
> + * - typemodstr is set to the typemod, if any, as a string with parens
>
> "parens" -> "parentheses"
>
> ~
>
> 3c. - format_type_detailed
>
> + else
> + /* otherwise, WITH TZ is added by typmode. */
> + *typname = pstrdup("TIME");
>
> "typmode" -> "typmod" ?
>
> ~
>
> 3d. - new_objtree_for_qualname
>
> + * A helper routine to setup %{}D and %{}O elements.
>
> "setup" -> "set up"
>
> ~
>
> 3e. - new_objtree_for_type
>
> +/*
> + * A helper routine to setup %{}T elements.
> + */
> +static ObjTree *
> +new_objtree_for_type(Oid typeId, int32 typmod)
>
> "setup" -> "set up"
>
> ~
>
> 3f. - pg_get_indexdef_detailed
> +/*
> + * Return an index definition, split in several pieces.
> + *
> + * A large amount of code is duplicated from pg_get_indexdef_worker, but
> + * control flow is different enough that it doesn't seem worth keeping them
> + * together.
> + */
> +static void
> +pg_get_indexdef_detailed(Oid indexrelid,
>
> "split in" -> "split into"
>
> ~
>
> 3g. - ddl_deparse_to_json
>
> + * The command is expanded fully, so that there are no ambiguities even in the
> + * face of search_path changes.
> + */
> +Datum
> +ddl_deparse_to_json(PG_FUNCTION_ARGS)
>
> "fully, so" -> "fully so"
>
> ~
>
> 3h. -deparse_AlterFunction
>
> + * Given a function OID and the parsetree that created it, return the JSON
> + * blob representing the alter command.
> + */
>
> "the parsetree" -> "the parse tree"
>
> ~
>
> 3i. - deparse_AlterObjectSchemaStmt
>
> +/*
> + * deparse an ALTER ... SET SCHEMA command.
> + */
> +static ObjTree *
> +deparse_AlterObjectSchemaStmt(ObjectAddress address, Node *parsetree,
>
> "deparse" -> "Deparse"
>
> ~
>
> 3j. deparse_AlterObjectSchemaStmt
>
> + /*
> + * Since the command has already taken place from the point of view of
> + * catalogs, getObjectIdentity returns the object name with the already
> + * changed schema. The output of our deparsing must return the original
> + * schema name however, so we chop the schema name off the identity string
> + * and then prepend the quoted schema name.
>
> "name however," -> "name, however,"
>
> ~
>
> 3k. - deparse_GrantStmt
>
> + /* build list of privileges to grant/revoke */
> + if (istmt->all_privs)
>
> "build list" -> "build a list"
>
> ~
>
> 3l. - deparse_AlterTypeSetStmt
>
> + * Deparse an AlterTypeStmt.
> + *
> + * Given a type OID and a parsetree that modified it, return an ObjTree
> + * representing the alter type.
> + */
> +static ObjTree *
> +deparse_AlterTypeSetStmt(Oid objectId, Node *cmd)
>
> "parsetree" -> "parse tree"
>
> ~
>
> 3m. - deparse_CompositeTypeStmt
>
> + * Deparse a CompositeTypeStmt (CREATE TYPE AS)
> + *
> + * Given a type OID and the parsetree that created it, return an ObjTree
> + * representing the creation command.
> + */
>
> "parsetree" -> "parse tree"
>
> ~
>
> 3n. - deparse_CreateEnumStmt
>
> +/*
> + * Deparse a CreateEnumStmt (CREATE TYPE AS ENUM)
> + *
> + * Given a type OID and the parsetree that created it, return an ObjTree
> + * representing the creation command.
> + */
>
> "parsetree" -> "parse tree"
>
> ~
>
> 3o. - deparse_CreateExtensionStmt
>
> +/*
> + * deparse_CreateExtensionStmt
> + * deparse a CreateExtensionStmt
> + *
> + * Given an extension OID and the parsetree that created it, return the JSON
> + * blob representing the creation command.
> + *
> + */
>
> "parsetree" -> "parse tree"
>
> ~
>
> 3p. - deparse_CreateFdwStmt
>
> +/*
> + * deparse_CreateFdwStmt
> + * Deparse a CreateFdwStmt (CREATE FOREIGN DATA WRAPPER)
> + *
> + * Given a trigger OID and the parsetree that created it,
> + * return an ObjTree representing the creation command.
> + */
>
> "parsetree" -> "parse tree"
>
> ~
>
> 3q. - deparse_AlterFdwStmt
>
> +/*
> + * deparse_AlterFdwStmt
> + * Deparse an AlterFdwStmt (ALTER FOREIGN DATA WRAPPER)
> + *
> + * Given a function OID and the parsetree that create it, return the
> + * JSON blob representing the alter command.
> + */
>
> "parsetree" -> "parse tree"
>
> "create it" -> "created it"
>
> 3r.
>
> + /*
> + * Iterate through options, to see what changed, but use catalog as basis
> + * for new values.
> + */
>
> "as basis" -> "as a basis"
>
> ~
>
> 3s.
>
> +/*
> + * Deparse a CREATE TYPE AS RANGE statement
> + *
> + * Given a type OID and the parsetree that created it, return an ObjTree
> + * representing the creation command.
> + */
>
> "parsetree" -> "parse tree"
>
> ~
>
> 3t. - deparse_AlterEnumStmt
>
> +/*
> + * Deparse an AlterEnumStmt.
> + *
> + * Given a type OID and a parsetree that modified it, return an ObjTree
> + * representing the alter type.
> + */
>
> "parsetree" -> "parse tree"
>
> ~
>
> 3u. - deparse_AlterTableStmt
>
> + /*
> + * We don't support replicating ALTER TABLE which contains volatile
> + * functions because It's possible the functions contain DDL/DML in
> + * which case these opertions will be executed twice and cause
> + * duplicate data. In addition, we don't know whether the tables being
> + * accessed by these DDL/DML are published or not. So blindly allowing
> + * such functions can allow unintended clauses like the tables accessed
> + * in those functions may not even exist on the subscriber-side.
> + */
>
> "opertions" -> "operations"
>
> "subscriber-side." -> "subscriber."
>
> ~
>
> 3v. - deparse_ColumnDef
>
> + * Deparse a ColumnDef node within a regular (non typed) table creation.
>
> "non typed" -> "non-typed"
>
> ~
>
> 3w. - deparse_ColumnDef
>
> + /*
> + * Emit a NOT NULL declaration if necessary. Note that we cannot trust
> + * pg_attribute.attnotnull here, because that bit is also set when
> + * primary keys are specified; and we must not emit a NOT NULL
> + * constraint in that case, unless explicitely specified. Therefore,
> + * we scan the list of constraints attached to this column to determine
> + * whether we need to emit anything.
> + * (Fortunately, NOT NULL constraints cannot be table constraints.)
> + *
> + * In the ALTER TABLE cases, we also add a NOT NULL if the colDef is
> + * marked is_not_null.
> + */
>
> "specified; and we" -> "specified; we"
>
> "explicitely" -> "explicitly"
>
>
> ~
>
> 3x. - deparse_CreateDomain
>
> +/*
> + * Deparse the CREATE DOMAIN
> + *
> + * Given a function OID and the parsetree that created it, return the JSON
> + * blob representing the creation command.
> + */
>
> "parsetree" -> "parse tree"
>
> ~
>
> 3y. - deparse_CreateFunction
>
> +/*
> + * Deparse a CreateFunctionStmt (CREATE FUNCTION)
> + *
> + * Given a function OID and the parsetree that created it, return the JSON
> + * blob representing the creation command.
> + */
>
> "parsetree" -> "parse tree"
>
> ~
>
> 3z. - deparse_CreateFunction
>
> + /*
> + * Now iterate over each parameter in the parsetree to create the
> + * parameters array.
> + */
>
> "parsetree" -> "parse tree"
>
> ~
>
> 4a. - deparse_CreateFunction
>
> + /*
> + * A PARAM_TABLE parameter indicates end of input arguments; the
> + * following parameters are part of the return type. We ignore them
> + * here, but keep track of the current position in the list so that
> + * we can easily produce the return type below.
> + */
>
> "end" -> "the end"
>
> ~
>
> 4b. - deparse_CreateOpClassStmt
>
> + /* Don't deparse sql commands generated while creating extension */
> + if (cmd->in_extension)
> + return NULL;
>
> "sql" -> "SQL"
>
> ~
>
> 4c. - deparse_CreateOpClassStmt
>
> /*
> + * Add the FAMILY clause; but if it has the same name and namespace as the
> + * opclass, then have it expand to empty, because it would cause a failure
> + * if the opfamily was created internally.
> + */
>
> "clause; " -> "clause, "
>
> "empty," -> "empty"
>
> ~
>
> 4d. - deparse_CreateOpFamily
>
> +/*
> + * Deparse a CreateTrigStmt (CREATE OPERATOR FAMILY)
> + *
> + * Given a trigger OID and the parsetree that created it, return an ObjTree
> + * representing the creation command.
> + */
>
> "parsetree" -> "parse tree"
>
> ~
>
> 4e. - deparse_CreateSchemaStmt
>
> +/*
> + * Deparse a CreateSchemaStmt.
> + *
> + * Given a schema OID and the parsetree that created it, return an ObjTree
> + * representing the creation command.
> + *
> + */
>
> "parsetree" -> "parse tree"
>
> ~
>
> 4f. - deparse_CreateSeqStmt
>
> +/*
> + * Deparse a CreateSeqStmt.
> + *
> + * Given a sequence OID and the parsetree that create it, return an ObjTree
> + * representing the creation command.
> + */
>
> "parsetree" -> "parse tree"
>
> "create it" -> "created it"
>
> ~
>
> 4g. - deparse_CreateStmt
>
> +/*
> + * Deparse a CreateStmt (CREATE TABLE).
> + *
> + * Given a table OID and the parsetree that created it, return an ObjTree
> + * representing the creation command.
> + */
>
> "parsetree" -> "parse tree"
>
> ~
>
> 4h.
>
> + /*
> + * Typed tables and partitions use a slightly different format string: we
> + * must not put table_elements with parents directly in the fmt string,
> + * because if there are no options the parens must not be emitted; and
> + * also, typed tables do not allow for inheritance.
> + */
>
> "parens" -> "parentheses"
>
> ~
>
> 4i. - deparse_CreateStmt
>
> + /*
> + * We can't put table elements directly in the fmt string as an array
> + * surrounded by parens here, because an empty clause would cause a
> + * syntax error. Therefore, we use an indirection element and set
> + * present=false when there are no elements.
> + */
>
> "parens" -> "parentheses"
>
> ~
>
> 4j. - deparse_CreateStmt
>
> + /*
> + * Get pg_class.relpartbound. We cannot use partbound in the
> + * parsetree directly as it's the original partbound expression
> + * which haven't been transformed.
> + */
>
> "parsetree" -> "parse tree" ? maybe this one if ok if it referring to
> the parameter with this name.
>
> ~
>
> 4k. - deparse_DefineStmt_Operator
>
> +/*
> + * Deparse a DefineStmt (CREATE OPERATOR)
> + *
> + * Given a trigger OID and the parsetree that created it, return an ObjTree
> + * representing the creation command.
> + */
>
> "parsetree" -> "parse tree"
>
> ~
>
> 4l. - deparse_CreateTrigStmt
>
> +/*
> + * Deparse a CreateTrigStmt (CREATE TRIGGER)
> + *
> + * Given a trigger OID and the parsetree that created it, return an ObjTree
> + * representing the creation command.
> + */
>
> "parsetree" -> "parsetree"
>
> ~
>
> 4m. - deparse_RefreshMatViewStmt
>
> +/*
> + * Deparse a RefreshMatViewStmt (REFRESH MATERIALIZED VIEW)
> + *
> + * Given a materialized view OID and the parsetree that created it, return an
> + * ObjTree representing the refresh command.
> + */
>
> "parseree" -> "parse tree"
>
> ~
>
> 4n. - deparse_IndexStmt
>
> +/*
> + * Deparse an IndexStmt.
> + *
> + * Given an index OID and the parsetree that created it, return an ObjTree
> + * representing the creation command.
> + *
> + * If the index corresponds to a constraint, NULL is returned.
> + */
>
> "parsetree" -> "parse tree"
>
> ~
>
> 4o. - deparse_InhRelations
>
> +/*
> + * Deparse the INHERITS relations.
> + *
> + * Given a table OID, return a schema qualified table list representing
> + * the parent tables.
> + */
> +static List *
> +deparse_InhRelations(Oid objectId)
>
> "schema qualified" -> "schema-qualified"
>
> ~
>
> 4p. - deparse_OnCommitClause
>
> +/*
> + * Deparse the ON COMMMIT ... clause for CREATE ... TEMPORARY ...
> + */
> +static ObjTree *
> +deparse_OnCommitClause(OnCommitAction option)
>
> "COMMMIT" -> "COMMIT"
>
> ~
>
> 4q. - deparse_RuleStmt
>
> +/*
> + * Deparse a RuleStmt (CREATE RULE).
> + *
> + * Given a rule OID and the parsetree that created it, return an ObjTree
> + * representing the rule command.
> + */
>
> "parsetree" -> "parse tree"
>
> ~
>
> 4r. - deparse_utility_command
>
> + /*
> + * Allocate everything done by the deparsing routines into a temp context,
> + * to avoid having to sprinkle them with memory handling code; but allocate
> + * the output StringInfo before switching.
> + */
>
> "code; " -> "code, "
>
> ~
>
> 4s. - deparse_utility_command
>
> + /*
> + * Many routines underlying this one will invoke ruleutils.c functionality
> + * in order to obtain deparsed versions of expressions. In such results,
> + * we want all object names to be qualified, so that results are "portable"
> + * to environments with different search_path settings. Rather than inject
> + * what would be repetitive calls to override search path all over the
> + * place, we do it centrally here.
> + */
>
> "in order to" => "to"
>
> ~
>
> 4t. - convSpecifier
>
> +/*
> + * Conversion specifier, it determines how we expand the json element into
> + * string.
> + */
>
> SUGGESTION
> Conversion specifier which determines how we expand the json element
> into a string.
>
> ~
>
> 4u. - json_trivalue
>
> +/*
> + * A ternary value which represents a boolean type JsonbValue.
> + */
>
> "which represents" -> "that represents"
>
> ~
>
> 4v. - expand_fmt_recursive
>
> + /*
> + * Scan the mandatory element name. Allow for an array separator
> + * (which may be the empty string) to be specified after colon.
> + */
>
> "after colon" -> "after a colon"
>
> ~
>
> 4w. - expand_jsonval_string
>
> +/*
> + * Expand a json value as a string. The value must be of type string or of
> + * type object. In the latter case it must contain a "fmt" element which will
> + * be recursively expanded; also, if the object contains an element "present"
> + * and it is set to false, the expansion is the empty string.
> + *
> + * Returns false if no actual expansion was made due to the "present" flag
> + * being set to "false".
> + */
>
> "latter case" -> "latter case,"
>
> ~
>
> 4x. - format_procedure_args_internal
>
> +/*
> + * Append the parenthised arguments of the given pg_proc row into the output
> + * buffer. force_qualify indicates whether to schema-qualify type names
> + * regardless of visibility.
> + */
>
> "parenthised" -> "parenthesized "
>
> ~
>
> 4y.
>
> +/*
> + * Internal version that returns definition of a CONSTRAINT command
> + */
>
> "definition" -> "the definition"
>
> ======
>
> 5. Function comment inconsistencies
>
> 5a.
>
> Sometimes the function name is repeated in the comment and sometimes it is not.
>
> e.g. compare deparse_CreateEnumStmt() versus deparse_CreateExtensionStmt(), etc.
>
> (IMO there is no need to repeat the function name)
>
> ~
>
> 5b.
>
> Sometimes the deparse function comments are verbose and say like:
>
> + * Given a type OID and a parsetree that modified it, return an ObjTree
> + * representing the alter type.
>
> but sometimes - like deparse_AlterExtensionStmt() etc. - they don't
> bother to say anything at all.
>
> e.g.
> +/*
> + * Deparse ALTER EXTENSION .. UPDATE TO VERSION
> + */
> +static ObjTree *
> +deparse_AlterExtensionStmt(Oid objectId, Node *parsetree)
>
> Either it is useful, so say it always, or it is not useful, so say it
> never. Pick one.
>
> ------
>
> 6. GENERAL - json
>
> IMO change "json" -> "JSON" everywhere.
>
> Here are some examples:
>
> Line 7605: + * Conversion specifier, it determines how we expand the
> json element into
> Line 7699: + errmsg("missing element \"%s\" in json object", keyname)));
> Line 7857: + * Expand a json value as a quoted identifier. The value
> must be of type string.
> Line 7872: + * Expand a json value as a dot-separated-name. The value
> must be of type
> Line 7908: + * Expand a json value as a type name.
> Line 7966: + * Expand a json value as an operator name
> Line 7993: + * Expand a json value as a string. The value must be of
> type string or of
> Line 8031: + * Expand a json value as a string literal.
> Line 8070: + * Expand a json value as an integer quantity.
> Line 8083: + * Expand a json value as a role name. If the is_public
> element is set to
> Line 8111: + * Expand one json element into the output StringInfo
> according to the
> Line 8807: +{ oid => '4642', descr => 'deparse the DDL command into
> json format string',
> Line 8810: +{ oid => '4643', descr => 'expand json format DDL to a
> plain DDL command',
>
> ------

I've addressed all the above comments.

regards,
Ajin Cherian
Fujitsu Australia

Attachment Content-Type Size
v28-0003-Support-CREATE-TABLE-AS-SELECT-INTO.patch application/octet-stream 15.0 KB
v28-0002-Support-DDL-replication.patch application/octet-stream 131.7 KB
v28-0004-Test-cases-for-DDL-replication.patch application/octet-stream 24.6 KB
v28-0001-Functions-to-deparse-DDL-commands.patch application/octet-stream 293.4 KB

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2022-10-13 16:20:51 Exponentiation confusion
Previous Message Rita 2022-10-13 10:41:51 Re: recovery.conf and archive files

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Page 2022-10-13 13:38:06 Re: Tracking last scan time
Previous Message David Turoň 2022-10-13 12:40:34 PG upgrade 14->15 fails - database contains our own extension