Re: Logical Replication and Character encoding

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: peter(dot)eisentraut(at)2ndquadrant(dot)com
Cc: petr(dot)jelinek(at)2ndquadrant(dot)com, noriyoshi(dot)shinoda(at)hpe(dot)com, pgsql-hackers(at)postgresql(dot)org, craig(at)2ndquadrant(dot)com
Subject: Re: Logical Replication and Character encoding
Date: 2017-04-06 01:32:27
Message-ID: 20170406.103227.75439937.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

At Wed, 5 Apr 2017 11:33:51 -0400, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote in <5401fef6-c0c0-7e8a-d8b1-169e30cbd854(at)2ndquadrant(dot)com>
> After further thinking, I prefer the alternative approach of using
> pq_sendcountedtext() as is and sticking the trailing zero byte on the
> receiving side. This is a more localized change, and keeps the logical
> replication protocol consistent with the main FE/BE protocol. (Also, we
> don't need to send a useless byte around.)

I'm not sure about the significance of the trailing zero in the
logical replication protocol. Anyway, the patch works.
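
To illustrate the receiving-side approach, here is a minimal sketch, not
the patch itself. It assumes a field laid out as a 4-byte big-endian
count followed by exactly that many bytes, with the count covering only
the payload. The name read_counted_text is hypothetical.

```python
import struct


def read_counted_text(buf: bytes, offset: int = 0):
    """Read one length-prefixed field; return (value_with_nul, next_offset).

    Assumed layout (illustrative only): int32 count in network byte order,
    followed by `count` payload bytes and no terminator on the wire.
    """
    (length,) = struct.unpack_from("!i", buf, offset)
    start = offset + 4
    end = start + length
    if length < 0 or end > len(buf):
        raise ValueError("truncated or invalid counted text")
    # The sender transmits only the payload; the terminator is added locally.
    return buf[start:end] + b"\x00", end


# Two fields back to back, as a receiver might see them.
wire = struct.pack("!i", 2) + b"ab" + struct.pack("!i", 3) + b"xyz"
first, pos = read_counted_text(wire)
second, pos = read_counted_text(wire, pos)
```

The terminator never travels over the wire, so the sender's count and the
converted payload always agree.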

> Patch attached, and also a test case.

The problem shows up when encoding conversion shortens a string.
The test covers that case.
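
As a rough illustration of the shortening, the sketch below uses Python's
own codecs, not PostgreSQL's conversion routines, and the sample string is
my own choice. Each of these kanji takes three bytes in UTF-8 and two in
EUC_JP.

```python
# Two kanji, written as escapes so the snippet stays ASCII-only.
text = "\u6f22\u5b57"

utf8_bytes = text.encode("utf-8")
eucjp_bytes = text.encode("euc_jp")

# A byte count taken before conversion no longer matches the converted data.
shrink = len(utf8_bytes) - len(eucjp_bytes)
print(len(utf8_bytes), len(eucjp_bytes), shrink)
```

Any length computed on the publisher's encoding therefore has to be
recomputed after conversion.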

- The patches apply cleanly on master.
- The patch works for the UTF-8 => EUC_JP case.
- The test seems proper.

By the way, an untranslatable character in a publisher table
stops the walsender with the following error.

> ERROR: character with byte sequence 0xe6 0xbc 0xa2 in encoding "UTF8" has no equivalent in encoding "LATIN1"
> STATEMENT: COPY public.t TO STDOUT
> LOG: could not send data to client: Broken pipe
> FATAL: connection to client lost

walreceiver stops on the opposite side with the following
complaint.

> ERROR: could not receive data from WAL stream: ERROR: character with byte sequence 0xe6 0xbc 0xa2 in encoding "UTF8" has no equivalent in encoding "LATIN1"
> CONTEXT: COPY t, line 1: ""
> LOG: worker process: logical replication worker for subscription 16391 sync 16384 (PID 26915) exited with exit code 1
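
The failing conversion can be reproduced outside the server. This is only
a sketch using Python's codecs as a stand-in for the server-side
conversion; the helper name is_convertible is hypothetical. The byte
sequence is the one from the error message above.

```python
def is_convertible(text: str, target_encoding: str) -> bool:
    """Return True if every character of text exists in target_encoding."""
    try:
        text.encode(target_encoding)
        return True
    except UnicodeEncodeError:
        return False


# Byte sequence reported in the error message, decoded as UTF-8.
ch = bytes([0xE6, 0xBC, 0xA2]).decode("utf-8")

to_latin1 = is_convertible(ch, "latin-1")  # no equivalent in LATIN1
to_eucjp = is_convertible(ch, "euc_jp")    # representable in EUC_JP
```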

After this, walreceiver keeps reconnecting to the master with no
wait. Maybe walreceiver should refrain from reconnecting after
certain kinds of failure, but this is not an urgent issue.
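
Just to make the idea concrete, a retry policy could look roughly like the
sketch below. This is not how walreceiver behaves today, and the error
class names, the function next_delay, and the delay values are all
hypothetical.

```python
# Hypothetical error classes for which retrying cannot succeed on its own.
PERMANENT_ERRORS = {"untranslatable_character"}


def next_delay(attempt: int, error_kind: str, base: float = 1.0, cap: float = 60.0):
    """Return seconds to wait before reconnecting, or None to stop retrying.

    attempt is the number of consecutive failures so far, starting at 0.
    """
    if error_kind in PERMANENT_ERRORS:
        return None  # needs operator action, so do not reconnect
    # Exponential backoff, capped so the wait never grows without bound.
    return min(cap, base * (2 ** attempt))


delays = [next_delay(n, "connection_lost") for n in range(4)]
stop = next_delay(0, "untranslatable_character")
```

With these defaults a transient failure waits 1, 2, 4, 8 seconds and so on
up to the cap, while a conversion failure stops the loop at once.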

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
