RE: Support logical replication of DDLs

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Zheng Li <zhengli10(at)gmail(dot)com>
Subject: RE: Support logical replication of DDLs
Date: 2022-07-05 08:28:05
Message-ID: OS0PR01MB5716D3B79A607B547F2AB90D94819@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Wednesday, June 29, 2022 6:48 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Jun 29, 2022 at 3:24 PM houzj(dot)fnst(at)fujitsu(dot)com
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Wednesday, June 29, 2022 11:07 AM Amit Kapila
> <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Jun 28, 2022 at 5:43 PM Amit Kapila
> > > <amit(dot)kapila16(at)gmail(dot)com>
> > > wrote:
> > > >
> > > > 5.
> > > > +static ObjTree *
> > > > +deparse_CreateStmt(Oid objectId, Node *parsetree)
> > > > {
> > > > ...
> > > > + tmp = new_objtree_VA("TABLESPACE %{tablespace}I", 0); if
> > > > + (node->tablespacename) append_string_object(tmp, "tablespace",
> > > > + node->tablespacename); else { append_null_object(tmp,
> > > > + node->"tablespace");
> > > > + append_bool_object(tmp, "present", false); }
> > > > + append_object_object(createStmt, "tablespace", tmp);
> > > > ...
> > > > }
> > > >
> > > > Why do we need to append the objects (tablespace, with clause,
> > > > etc.) when they are not present in the actual CREATE TABLE
> > > > statement? The reason to ask this is that this makes the string
> > > > that we want to send downstream much longer than the actual
> > > > statement given by the user on the publisher.
> > > >
> > >
> > > After thinking some more on this, it seems the user may want to
> > > optionally change some of these attributes, for example, on the
> > > subscriber, it may want to associate the table with a different
> > > tablespace. I think to address that we can append these additional
> > > attributes optionally, say via an additional parameter
> > > (append_all_options/append_all_attributes or something like that) in
> exposed APIs like deparse_utility_command().
> >
> > I agree and will research this part.
> >
>
> Okay, note that similar handling would be required at other places like
> deparse_ColumnDef. Few other comments on
> v9-0001-Functions-to-deparse-DDL-commands.
>
> 1.
> +static ObjElem *new_bool_object(bool value); static ObjElem
> +*new_string_object(char *value); static ObjElem
> +*new_object_object(ObjTree *value); static ObjElem
> +*new_array_object(List *array); static ObjElem
> +*new_integer_object(int64 value); static ObjElem
> +*new_float_object(float8 value);
>
> Here, new_object_object() seems to be declared out-of-order (not in sync with
> its order of definition). Similarly, see all other append_* functions.

Changed.

> 2. The function printTypmod in ddl_deparse.c appears to be the same as the
> function with the same name in format_type.c. If so, isn't it better to have a
> single copy of that function?

Changed.

> 3. As I pointed out yesterday, there is some similarity between
> format_type_extended and format_type_detailed. Can we try to extract
> common functionality? If this is feasible, it is better to do this as a separate
> patch. Also, this can obviate the need to expose printTypmod from
> format_type.c. I am not really sure if this will be better than the current one or
> not but it seems worth trying.

It seems the main logic in the two functions is a bit different. For
format_type_detailed, we always try to schema qualify both the user-defined and
built-in type except for some special type(time ..) which we cannot schema
qualify. But in format_type_extended, we don't try to schema qualify the
built-in type. But I will research a bit more about this.

> 4.
> new_objtree_VA()
> {
> ...
> switch (type)
> + {
> + case ObjTypeBool:
> + elem = new_bool_object(va_arg(args, int)); break; case ObjTypeString:
> + elem = new_string_object(va_arg(args, char *)); break; case
> + ObjTypeObject:
> + elem = new_object_object(va_arg(args, ObjTree *)); break; case
> + ObjTypeArray:
> + elem = new_array_object(va_arg(args, List *)); break; case
> + ObjTypeInteger:
> + elem = new_integer_object(va_arg(args, int64)); break; case
> + ObjTypeFloat:
> + elem = new_float_object(va_arg(args, double)); break; case
> + ObjTypeNull:
> + /* Null params don't have a value (obviously) */ elem =
> + new_null_object();
> ...
>
> I feel here ObjType's should be handled in the same order as they are defined in
> ObjType.

Changed.

> 5. There appears to be common code among node_*_object() functions.
> Can we just have one function instead new_object(ObjType, void *val)?
> In the function based on type, we can typecast the value. Is there a reason why
> that won't be better than current one?

I tried to extract some common code into a new function new_object(ObjType,
char *name). I didn't use 'void *' as user could pass a wrong type of value which
we cannot detect which seems a little unsafe.

> 6.
> deparse_ColumnDef()
> {
> ...
> /* Composite types use a slightly simpler format string */
> + if (composite)
> + column = new_objtree_VA("%{name}I %{coltype}T %{collation}s", 3,
> + "type", ObjTypeString, "column", "name", ObjTypeString,
> + coldef->colname, "coltype", ObjTypeObject, new_objtree_for_type(typid,
> + typmod)); else column = new_objtree_VA("%{name}I %{coltype}T
> + %{default}s
> %{not_null}s %{collation}s",
> + 3,
> + "type", ObjTypeString, "column",
> + "name", ObjTypeString, coldef->colname, "coltype", ObjTypeObject,
> + new_objtree_for_type(typid, typmod));
> ...
> }
>
> To avoid using the same arguments, we can define fmtstr for composite and
> non-composite types as the patch is doing in deparse_CreateStmt().

Changed.

> 7. It is not clear from comments or otherwise what should be considered for
> default format string in functions like
> deparse_ColumnDef() or deparse_CreateStmt().

I think it was intended to put most of the modifiable part in default format
string so that user can change what they want using json function. But I think
we could improve, maybe by passing a parameter which decide whether to keep
these keywords in default format string. I will research this.

> 8.
> + * FIXME --- actually, what about default values?
> + */
> +static ObjTree *
> +deparse_ColumnDef_typed(Relation relation, List *dpcontext, ColumnDef
> +*coldef)
>
> I think we need to handle default values for this FIXME.

Changed.

Attach the new version patch which address the above
comments and comments from [1][2].

I haven't addressed the latest comments from Vignesh[3],
I will investigate it.

[1] https://www.postgresql.org/message-id/CAA4eK1K88SMoBq%3DDRA4XU-F3FG6qyzCjGMMKsPpcRBPRcrELrw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CALDaNm3rEA_zmnDMOCT7NqK4aAffhAgooLf8rXjUN%3DYwA8ASFw%40mail.gmail.com
[3] https://www.postgresql.org/message-id/CALDaNm2nFPMxUo%3D0zRUUA-v3_eRwRY%2Bii5nnG_PU%2B6jT7ta9dA%40mail.gmail.com

Best regards,
Hou zj

Attachment Content-Type Size
v11-0004-Test-cases-for-DDL-replication.patch application/octet-stream 22.5 KB
v11-0001-Functions-to-deparse-DDL-commands.patch application/octet-stream 131.9 KB
v11-0003-Support-CREATE-TABLE-AS-SELECT-INTO.patch application/octet-stream 14.8 KB
v11-0002-Support-DDL-replication.patch application/octet-stream 125.6 KB

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Laurenz Albe 2022-07-05 08:40:40 Re: lifetime of the old CTID
Previous Message Rajesh S 2022-07-05 08:22:32 - operator overloading not giving expected result

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2022-07-05 08:29:28 Re: Handle infinite recursion in logical replication setup
Previous Message Andres Freund 2022-07-05 08:11:26 Re: [PoC] Improve dead tuple storage for lazy vacuum