RE: Support logical replication of DDLs

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "Wei Wang (Fujitsu)" <wangw(dot)fnst(at)fujitsu(dot)com>, Runqi Tian <runqidev(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, li jie <ggysxcq(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Zheng Li <zhengli10(at)gmail(dot)com>
Subject: RE: Support logical replication of DDLs
Date: 2023-04-19 13:55:11
Message-ID: OS0PR01MB5716CBA51E912315468C320894629@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Mon, Apr 17, 2023 18:32 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> On Sat, 15 Apr 2023 at 06:38, vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > On Fri, 14 Apr 2023 at 13:06, vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > >
> > > Few comments:

Thanks for your comments.
Improved the patch set according to your comments. Please refer to the details below.

===
About the comments in [1]:

> 1) I felt is_present_flag variable can be removed by moving
> "object_name = append_object_to_format_string(tree, sub_fmt);" inside
> the if condition:
> +static void
> +append_bool_object(ObjTree *tree, char *sub_fmt, bool value) {
> + ObjElem *param;
> + char *object_name = sub_fmt;
> + bool is_present_flag = false;
> +
> + Assert(sub_fmt);
> +
> + /*
> + * Check if the format string is 'present' and if yes, store the boolean
> + * value
> + */
> + if (strcmp(sub_fmt, "present") == 0)
> + {
> + is_present_flag = true;
> + tree->present = value;
> + }
> +
> + if (!is_present_flag)
> + object_name = append_object_to_format_string(tree,
> + sub_fmt);
> +
> + param = new_object(ObjTypeBool, object_name);
> + param->value.boolean = value;
> + append_premade_object(tree, param); }
>
> By changing it to something like below:
> +static void
> +append_bool_object(ObjTree *tree, char *sub_fmt, bool value) {
> + ObjElem *param;
> + char *object_name = sub_fmt;
> +
> + Assert(sub_fmt);
> +
> + /*
> + * Check if the format string is 'present' and if yes, store the boolean
> + * value
> + */
> + if (strcmp(sub_fmt, "present") == 0)
> + {
> + tree->present = value;
> + object_name = append_object_to_format_string(tree, sub_fmt);
> + }
> +
> + param = new_object(ObjTypeBool, object_name);
> + param->value.boolean = value;
> + append_premade_object(tree, param); }

Changed.

> 2) We could remove the temporary variable tmp_str here:
> + if (start_ptr != NULL && end_ptr != NULL)
> + {
> + length = end_ptr - start_ptr - 1;
> + tmp_str = (char *) palloc(length + 1);
> + strncpy(tmp_str, start_ptr + 1, length);
> + tmp_str[length] = '\0';
> + appendStringInfoString(&object_name, tmp_str);
> + pfree(tmp_str);
> + }
>
> by changing to:
> + if (start_ptr != NULL && end_ptr != NULL)
> + appendBinaryStringInfo(&object_name, start_ptr + 1,
> end_ptr - start_ptr - 1);

Changed.

> 3) I did not see the usage of ObjTypeFloat type used anywhere, we
> could remove it:
> +typedef enum
> +{
> + ObjTypeNull,
> + ObjTypeBool,
> + ObjTypeString,
> + ObjTypeArray,
> + ObjTypeInteger,
> + ObjTypeFloat,
> + ObjTypeObject
> +} ObjType;

Removed.

> 4) I noticed that none of the file names in src/backend/commands uses
> "_" in the filenames, but in case of ddl_deparse.c and ddl_json.c we
> have used "_", it might be better to be consistent with other
> filenames in this directory:
>
> diff --git a/src/backend/commands/Makefile
> b/src/backend/commands/Makefile index 48f7348f91..171dfb2800 100644
> --- a/src/backend/commands/Makefile
> +++ b/src/backend/commands/Makefile
> @@ -29,6 +29,8 @@ OBJS = \
> copyto.o \
> createas.o \
> dbcommands.o \
> + ddl_deparse.o \
> + ddl_json.o \
> define.o \
> discard.o \
> dropcmds.o \

Changed.

> 5) The following includes are no more required in ddl_deparse.c as we
> have removed support for deparsing of other objects:
> #include "catalog/pg_am.h"
> #include "catalog/pg_aggregate.h"
> #include "catalog/pg_authid.h"
> #include "catalog/pg_cast.h"
> #include "catalog/pg_conversion.h"
> #include "catalog/pg_depend.h"
> #include "catalog/pg_extension.h"
> #include "catalog/pg_foreign_data_wrapper.h"
> #include "catalog/pg_foreign_server.h"
> #include "catalog/pg_language.h"
> #include "catalog/pg_largeobject.h"
> #include "catalog/pg_opclass.h"
> #include "catalog/pg_operator.h"
> #include "catalog/pg_opfamily.h"
> #include "catalog/pg_policy.h"
> #include "catalog/pg_range.h"
> #include "catalog/pg_rewrite.h"
> #include "catalog/pg_sequence.h"
> #include "catalog/pg_statistic_ext.h"
> #include "catalog/pg_transform.h"
> #include "catalog/pg_ts_config.h"
> #include "catalog/pg_ts_dict.h"
> #include "catalog/pg_ts_parser.h"
> #include "catalog/pg_ts_template.h"
> #include "catalog/pg_user_mapping.h"
> #include "foreign/foreign.h"
> #include "mb/pg_wchar.h"
> #include "nodes/nodeFuncs.h"
> #include "nodes/parsenodes.h"
> #include "parser/parse_type.h"

Removed.

===
About the comments in [2]:

> 1) We could add a space after the 2nd parameter
> + * Note we don't have the luxury of sprintf-like compiler warnings
> +for
> + * malformed argument lists.
> + */
> +static ObjTree *
> +new_objtree_VA(char *fmt, int numobjs,...)

Changed.

> 2) I felt objtree_to_jsonb_element is a helper function for
> objtree_to_jsonb_rec, we should update the comments accordingly:
> +/*
> + * Helper for objtree_to_jsonb: process an individual element from an
> +object or
> + * an array into the output parse state.
> + */
> +static void
> +objtree_to_jsonb_element(JsonbParseState *state, ObjElem *object,
> + JsonbIteratorToken
> +elem_token) {
> + JsonbValue val;
> +
> + switch (object->objtype)
> + {
> + case ObjTypeNull:
> + val.type = jbvNull;
> + pushJsonbValue(&state, elem_token, &val);
> + break;

Changed.

> 3) domainId parameter change should be removed from the first patch:
> +static List *
> +obtainConstraints(List *elements, Oid relationId, Oid domainId,
> + ConstraintObjType objType) {
> + Relation conRel;
> + ScanKeyData key;
> + SysScanDesc scan;
> + HeapTuple tuple;
> + ObjTree *constr;
> + Oid relid;
> +
> + /* Only one may be valid */
> + Assert(OidIsValid(relationId) ^ OidIsValid(domainId));
> +
> + relid = OidIsValid(relationId) ? ConstraintRelidTypidNameIndexId :
> + ConstraintTypidIndexId;

Removed in the last version.

> 4) Do we have any scenario for CONSTRAINT_TRIGGER in table/index, if
> so could we add a test for this?
> + case CONSTRAINT_UNIQUE:
> + contype = "unique";
> + break;
> + case CONSTRAINT_TRIGGER:
> + contype = "trigger";
> + break;
> + case CONSTRAINT_EXCLUSION:
> + contype = "exclusion";
> + break;

No. Moved from 0001 patch to later patch (the deparser for the rest commands).

> 5) The below code adds information about compression but the comment
> says "USING clause", the comment should be updated accordingly:
> + /* USING clause */
> + tmp_obj = new_objtree("COMPRESSION");
> + if (coldef->compression)
> + append_string_object(tmp_obj,
> + "%{compression_method}I",
> +
> "compression_method", coldef->compression);
> + else
> + {
> + append_null_object(tmp_obj, "%{compression_method}I");
> + append_not_present(tmp_obj);
> + }
> + append_object_object(ret, "%{compression}s", tmp_obj);

Changed.

> 6) Generally we add append_null_object followed by append_not_present,
> but it is not present for "COLLATE" handling, is this correct?
> + tmp_obj = new_objtree("COMPRESSION");
> + if (coldef->compression)
> + append_string_object(tmp_obj,
> + "%{compression_method}I",
> +
> "compression_method", coldef->compression);
> + else
> + {
> + append_null_object(tmp_obj, "%{compression_method}I");
> + append_not_present(tmp_obj);
> + }
> + append_object_object(ret, "%{compression}s", tmp_obj);
> +
> + tmp_obj = new_objtree("COLLATE");
> + if (OidIsValid(typcollation))
> + append_object_object(tmp_obj, "%{name}D",
> +
> new_objtree_for_qualname_id(CollationRelationId,
> +
> typcollation));
> + else
> + append_not_present(tmp_obj);
> + append_object_object(ret, "%{collation}s", tmp_obj);

Changed.

> 7) I felt attrTup can be released after get_atttypetypmodcoll as we
> are not using the tuple after that, I'm not sure if it is required to
> hold the reference to this tuple till the end of the function:
> +static ObjTree *
> +deparse_ColumnDef_typed(Relation relation, List *dpcontext, ColumnDef
> *coldef)
> +{
> + ObjTree *ret = NULL;
> + ObjTree *tmp_obj;
> + Oid relid = RelationGetRelid(relation);
> + HeapTuple attrTup;
> + Form_pg_attribute attrForm;
> + Oid typid;
> + int32 typmod;
> + Oid typcollation;
> + bool saw_notnull;
> + ListCell *cell;
> +
> + attrTup = SearchSysCacheAttName(relid, coldef->colname);
> + if (!HeapTupleIsValid(attrTup))
> + elog(ERROR, "could not find cache entry for column
> \"%s\" of relation %u",
> + coldef->colname, relid);
> + attrForm = (Form_pg_attribute) GETSTRUCT(attrTup);
> +
> + get_atttypetypmodcoll(relid, attrForm->attnum,
> + &typid, &typmod,
> &typcollation);
> +
> + /*
> + * Search for a NOT NULL declaration. As in deparse_ColumnDef,
> we rely on
> + * finding a constraint on the column rather than coldef->is_not_null.
> + * (This routine is never used for ALTER cases.)
> + */
> + saw_notnull = false;
> + foreach(cell, coldef->constraints)
> + {
> + Constraint *constr = (Constraint *) lfirst(cell);
> +
> + if (constr->contype == CONSTR_NOTNULL)
> + {
> + saw_notnull = true;
> + break;
> + }
> + }

'attrTup' can not be released earlier as it is used till the end in the form of 'attrForm'.

> 8) This looks like ALTER TABLE ... SET/RESET, the function header
> should be updated accordingly:
> /*
> * ... ALTER COLUMN ... SET/RESET (...)
> *
> * Verbose syntax
> * RESET|SET (%{options:, }s)
> */
> static ObjTree *
> deparse_RelSetOptions(AlterTableCmd *subcmd) {
> List *sets = NIL;
> ListCell *cell;
> bool is_reset = subcmd->subtype == AT_ResetRelOptions;

Changed.

> 9) Since we don't replicate temporary tables, is this required:
> +/*
> + * Deparse the ON COMMIT ... clause for CREATE ... TEMPORARY ...
> + *
> + * Verbose syntax
> + * ON COMMIT %{on_commit_value}s
> + */
> +static ObjTree *
> +deparse_OnCommitClause(OnCommitAction option) {
> + ObjTree *ret = new_objtree("ON COMMIT");
> + switch (option)

I will consider this in the next version.

> 10) Since we don't support MATERIALIZED VIEW, VIEW and FOREIGN TABLE,
> they can be removed:
> + switch (rel->rd_rel->relkind)
> + {
> + case RELKIND_RELATION:
> + case RELKIND_PARTITIONED_TABLE:
> + reltype = "TABLE";
> + break;
> + case RELKIND_INDEX:
> + case RELKIND_PARTITIONED_INDEX:
> + reltype = "INDEX";
> + break;
> + case RELKIND_VIEW:
> + reltype = "VIEW";
> + break;
> + case RELKIND_COMPOSITE_TYPE:
> + reltype = "TYPE";
> + istype = true;
> + break;
> + case RELKIND_FOREIGN_TABLE:
> + reltype = "FOREIGN TABLE";
> + break;
> + case RELKIND_MATVIEW:
> + reltype = "MATERIALIZED VIEW";
> + break;

The code is changed in a way that these are no-op and return NULL instead of returning a valid value. I think this should suffice.
Please let me know if you think these must be removed.

===
About the comments in [3]:

> 1) since missing_ok is passed as false, if there is an error the error
> will be handled in find_string_in_jsonbcontainer, "missing operator
> name" handling can be removed from here:
> +/*
> + * Expand a JSON value as an operator name. The value may contain
> +element
> + * "schemaname" (optional).
> + */
> +static void
> +expand_jsonval_operator(StringInfo buf, JsonbValue *jsonval) {
> + char *str;
> + JsonbContainer *data = jsonval->val.binary.data;
> +
> + str = find_string_in_jsonbcontainer(data, "schemaname", true, NULL);
> + /* Schema might be NULL or empty */
> + if (str != NULL && str[0] != '\0')
> + {
> + appendStringInfo(buf, "%s.", quote_identifier(str));
> + pfree(str);
> + }
> +
> + str = find_string_in_jsonbcontainer(data, "objname", false, NULL);
> + if (!str)
> + ereport(ERROR,
> + errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("missing operator name"));
> +

Removed.

> 2) This should be present at the beginning of the file before the functions:
> +#define ADVANCE_PARSE_POINTER(ptr,end_ptr) \
> + do { \
> + if (++(ptr) >= (end_ptr)) \
> + ereport(ERROR, \
> +
> errcode(ERRCODE_INVALID_PARAMETER_VALUE), \
> + errmsg("unterminated format
> specifier")); \
> + } while (0)
> +

Changed.

> 3) Should we add this to the documentation, we have documented other
> event triggers like ddl_command_start, ddl_command_end, table_rewrite
> and sql_drop at [1]:
> + runlist = EventTriggerCommonSetup(command->parsetree,
> +
> EVT_TableInitWrite,
> +
> "table_init_write",
> +
> &trigdata);

Added.

> 4) The inclusion of stringinfo.h is not required, I was able to
> compile the code without it:
> + * src/backend/commands/ddl_json.c
> + *
> +
> +*--------------------------------------------------------------------
> +-----
> + */
> +#include "postgres.h"
> +
> +#include "lib/stringinfo.h"
> +#include "tcop/ddl_deparse.h"
> +#include "utils/builtins.h"
> +#include "utils/jsonb.h"

Removed.

> 5) schema and typmodstr might not be allocated, should we still call pfree?
> + schema = find_string_in_jsonbcontainer(data, "schemaname", true, NULL);
> + typename = find_string_in_jsonbcontainer(data, "typename", false, NULL);
> + typmodstr = find_string_in_jsonbcontainer(data, "typmod", true, NULL);
> + is_array = find_bool_in_jsonbcontainer(data, "typarray");
> + switch (is_array)
> + {
> + case tv_true:
> + array_decor = "[]";
> + break;
> +
> + case tv_false:
> + array_decor = "";
> + break;
> +
> + case tv_absent:
> + default:
> + ereport(ERROR,
> +
> errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("missing typarray element"));
> + }
> +
> + if (schema == NULL)
> + appendStringInfo(buf, "%s", quote_identifier(typename));
> + else if (schema[0] == '\0')
> + appendStringInfo(buf, "%s", typename); /* Special
> typmod needs */
> + else
> + appendStringInfo(buf, "%s.%s", quote_identifier(schema),
> +
> + quote_identifier(typename));
> +
> + appendStringInfo(buf, "%s%s", typmodstr ? typmodstr : "", array_decor);
> + pfree(schema);
> + pfree(typename);
> + pfree(typmodstr);

Changed code to do NULL initialization in the begining and call pfree only if these are allocated.

> 6) SHould the following from ddl_deparse_expand_command function
> header be moved to expand_one_jsonb_element function header, as the
> specified are being handled in expand_one_jsonb_element.
> * % expand to a literal %
> * I expand as a single, non-qualified identifier
> * D expand as a possibly-qualified identifier
> * T expand as a type name
> * O expand as an operator name
> * L expand as a string literal (quote using single quotes)
> * s expand as a simple string (no quoting)
> * n expand as a simple number (no quoting)
> * R expand as a role name (possibly quoted name, or PUBLIC)

It seems more suitable here as this is the exposed function
and thus complete details here make sense. But I have changed comment little
bit to indicate that actual conversion work happens in expand_one_jsonb_element().
Please let me know if you still think it is better to move the comments.

> 7) In ddl_deparse.c we have used elog(ERROR, ...) for error handling
> and in ddl_json.c we have used ereport(ERROR, ...) for error handling,
> Is this required for any special handling?

I checked occurences of elog in ddl_deparse.c. That looked appropriate to me
as they are internal errors which we usually don't expect user to encounter.
In fact I changed few of ereport added earlier by me to elog to be consistent.
The occurences of ereport in ddl_json.c also looks fine to me.
Please let me know if you want any specific error handling to be changed.

> 8) Is this required as part of create table implementation:
> 8.a)
> +/*
> + * EventTriggerAlterTypeStart
> + * Save data about a single part of an ALTER TYPE.
> + *
> + * ALTER TABLE can have multiple subcommands which might include DROP
> COLUMN
> + * command and ALTER TYPE referring the drop column in USING expression.
> + * As the dropped column cannot be accessed after the execution of
> + DROP
> COLUMN,
> + * a special trigger is required to handle this case before the drop
> +column is
> + * executed.
> + */
> +void
> +EventTriggerAlterTypeStart(AlterTableCmd *subcmd, Relation rel) {
>
> 8.b)
> +/*
> + * EventTriggerAlterTypeEnd
> + * Finish up saving an ALTER TYPE command, and add it to
> command list.
> + */
> +void
> +EventTriggerAlterTypeEnd(Node *subcmd, ObjectAddress address, bool
> +rewrite)

It is used to store the using expression of alter column type command,
and the using expression(usingexpr) is used in the deparser. So I think it
cannot be removed/moved to other patches.

> 9) Since we need only the table and index related implementation in
> 0001, the rest can be moved to a different patch accordingly:
> +/*
> + * Return the given object type as a string.
> + *
> + * If isgrant is true, then this function is called while deparsing
> +GRANT
> + * statement and some object names are replaced.
> + */
> +const char *
> +stringify_objtype(ObjectType objtype, bool isgrant) {
> + switch (objtype)
> + {
> + case OBJECT_AGGREGATE:
> + return "AGGREGATE";
> + case OBJECT_CAST:
> + return "CAST";
> + case OBJECT_COLLATION:
> + return "COLLATION";
> + case OBJECT_COLUMN:
> + return isgrant ? "TABLE" : "COLUMN";

Changed.

> 10) json_trivalue should be added to typedefs:
> +typedef enum
> +{
> + tv_absent,
> + tv_true,
> + tv_false
> +} json_trivalue;

Added.

Attach the new patch set and thanks Shveta for helping
address the above comments.

Apart from above comments.
The new version patch also did the following changes:

- Fixed the cfbot warnings about the header file.
- Added documents for ALTER TABLE rewrite restrictions.
- Fixed the crash when applying create index concurrently.
- Used the consistent API as DML apply when switch the current role
to the another before applying.

[1] - https://www.postgresql.org/message-id/CALDaNm1aTHyeMfmkyunq%3DHZ6dyOJNqgszhmsLkeVMEgWfJ8frA%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CALDaNm1RnYRfzsL4GfznU4zuoPMkgnAAM8Ons3kCtLr2Tdzoow%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/CALDaNm1NeyTrCDizXHvUqhbOdH2%3D%2Bf%3DR8aX3r0AbDr7rRJeQAA%40mail.gmail.com

Best Regards,
Hou zj

Attachment Content-Type Size
ddl-replication-2023_04_19.tar.gz application/x-gzip 158.6 KB

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Erik Wienhold 2023-04-19 14:51:33 Re: psql:t_mstr.sql:994: ERROR: function to_char(numeric) does not exist
Previous Message gzh 2023-04-19 13:24:09 psql:t_mstr.sql:994: ERROR: function to_char(numeric) does not exist

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-04-19 13:58:07 Re: Remove references to pre-11 versions
Previous Message Robert Haas 2023-04-19 13:24:14 Re: Request for comment on setting binary format output per session