RE: Allow logical replication to copy tables in binary format

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Melih Mutlu' <m(dot)melihmutlu(at)gmail(dot)com>
Cc: "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Subject: RE: Allow logical replication to copy tables in binary format
Date: 2023-02-22 07:01:28
Message-ID: TYAPR01MB5866968CF42FBAB73E8EEDF8F5AA9@TYAPR01MB5866.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Melih,

Thank you for updating the patch! Followings are my comments.

01. catalogs.sgml

```
If true, the subscription will request that the publisher send data
- in binary format
```

I'm not clear here. subbinary does not directly mean that whether the worker
requests to send data or not. How about:

If true, the subscription will request that the publisher send data in binary
format, except initial data synchronization

02. create_subscription.sgml

```
+ the binary format is very data type specific, it will not allow copying
+ between different column types as opposed to text format. Note that
```

The name of formats are not specified as <literal>, whereas in previous sentence
they are. We can use format either of them.

03. parse_subscription_options()

I'm not sure the combination of "copy_format = binary" and "copy_data = false"
should be accepted or not. How do you think?

04. parse_subscription_options()

```
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "binary = false", "copy_format = binary")));
+ }
```

A comment for translator seemed to be missed.

05. CreateSubscription()

```
values[Anum_pg_subscription_suborigin - 1] =
CStringGetTextDatum(opts.origin);
+ values[Anum_pg_subscription_subcopyformat - 1] = CharGetDatum(opts.copy_format);
```

I think this should be done the same ordering with FormData_pg_subscription.
Maybe after the line?

```
values[Anum_pg_subscription_subdisableonerr - 1] = BoolGetDatum(opts.disableonerr);
```

06. AlterSubscription()

If we decided not to accept "copy_format = binary" and "copy_data = false", here
should be also fixed.

07. AlterSubscription()

```
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for a subscription with %s",
+ "binary = false", "copy_format = binary")));
...
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for a subscription with %s",
+ "copy_format = binary", "binary = false")));
```

Comments for translator seemed to be missed.

08. defGetCopyFormat()

```
+ /*
+ * If no parameter value given, set it to text format.
+ */
+ if (!def->arg)
+ return LOGICALREP_COPY_AS_TEXT;
```

I think the case no parameter is given should be rejected. It should be accepted
only when the parameter has boolean data type. Note that defGetStreamingMode()
is accepted no-parameter style due to the compatibility. At first streaming is
boolean, and then "parallel" is added.

09. describeSubscriptions

```
+ /* Copy format is only supported in v16 and higher */
```

I think this comment should be atop of if statement, and it can mention about Origin too.

10. pg_subscription.h

```
+ char subcopyformat BKI_DEFAULT(LOGICALREP_COPY_AS_TEXT); /* Copy format *
```

I'm not sure whether BKI_DEFAULT() is needed or not. Other options like twophase
does not have the default value as catalog level. The default is set in
parse_subscription_options() and then the value will be set to catalog.

11. typedef struct Subscription

In catalog entry the subcopyformat is aligned just after subdisableonerr, but in
struct Subscription, copyformat is added at the end. Can we place just after disableonerr?

12. Reply

> Since binary copy relies on COPY command, we may have problems across
> different server versions in cases where COPY is not portable.
> What I understand from this [1], COPY works across server versions later
> than 7.4. This shouldn't be a problem for logical replication.
> Currently the patch also does not allow binary copy if the publisher
> version is older than 16.

If in future version the general data type is added, the copy command in binary
format will not work well, right? It is because the inferior version does not have
recv/send functions for added type. It means that there is a possibility that
replication between different versions may be failed if binary format is specified.
Therefore, I think we should accept copy_format = binary only when the major
version of servers are the same.

Note that this comments is not the request to the patch. Maybe the modification
should be done not only for copy_format but also binary, and it may be out of scope
the patch.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nikolay Samokhvalov 2023-02-22 07:05:38 Re: [PATCH] Add pretty-printed XML output option
Previous Message Andrey Chudnovsky 2023-02-22 07:00:46 Re: [PoC] Federated Authn/z with OAUTHBEARER