Re: Logical Replication and Character encoding

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: noriyoshi(dot)shinoda(at)hpe(dot)com
Cc: peter(dot)eisentraut(at)2ndquadrant(dot)com, petr(dot)jelinek(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, craig(at)2ndquadrant(dot)com
Subject: Re: Logical Replication and Character encoding
Date: 2017-02-27 05:23:12
Message-ID: 20170227.142312.56921714.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Sorry for the absence.

At Fri, 24 Feb 2017 02:43:14 +0000, "Shinoda, Noriyoshi" <noriyoshi(dot)shinoda(at)hpe(dot)com> wrote in <AT5PR84MB00847ABEA48EAE9A97D51157EE520(at)AT5PR84MB0084(dot)NAMPRD84(dot)PROD(dot)OUTLOOK(dot)COM>
> >From: Peter Eisentraut [mailto:peter(dot)eisentraut(at)2ndquadrant(dot)com]
> >Sent: Friday, February 24, 2017 1:32 AM
> >To: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>; Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> >Cc: craig(at)2ndquadrant(dot)com; Shinoda, Noriyoshi <noriyoshi(dot)shinoda(at)hpe(dot)com>; pgsql-hackers(at)postgresql(dot)org
> >Subject: Re: [HACKERS] Logical Replication and Character encoding
> >
> >On 2/17/17 10:14, Peter Eisentraut wrote:
> >> Well, it is sort of a libpq connection, and a proper libpq client
> >> should set the client encoding, and a proper libpq server should do
> >> encoding conversion accordingly. If we just play along with this, it
> >> all works correctly.
> >>
> >> Other output plugins are free to ignore the encoding settings (just
> >> like libpq can send binary data in some cases).
> >>
> >> The attached patch puts it all together.
> >
> >committed
..
> However, in the case of a PUBLICATION (UTF-8) and SUBSCRIPTION (EUC_JP) environment, the following error was output and the process went down.
...
> LOG: starting logical replication worker for subscription "sub1"
> LOG: logical replication apply for subscription "sub1" has started
> ERROR: insufficient data left in message
> LOG: worker process: logical replication worker for subscription 16439 (PID 22583) exited with exit code 1

Yeah, the patch sends the converted string together with the
length of the original string. Encoding conversion usually changes
the length of a string. I doubt that the reverse case was working
correctly.
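
To illustrate the mismatch, here is a minimal Python sketch. It is not the PostgreSQL code: the 4-byte length prefix, the trailing NUL, and the helper names frame_buggy and read_field are assumptions made only for this illustration.

```python
import struct

def frame_buggy(text, server_enc, client_enc):
    """Hypothetical framing: the length is measured on the server-encoded
    bytes, but the payload that follows is the client-encoded bytes."""
    original = text.encode(server_enc) + b"\0"
    converted = text.encode(client_enc) + b"\0"
    return struct.pack("!I", len(original)) + converted

def read_field(msg):
    """Read one length-prefixed field; fail if the message is too short."""
    (length,) = struct.unpack_from("!I", msg, 0)
    payload = msg[4:4 + length]
    if len(payload) < length:
        raise ValueError("insufficient data left in message")
    return payload

text = "日本語"
utf8_len = len(text.encode("utf-8"))    # 3 bytes per character here
eucjp_len = len(text.encode("euc_jp"))  # 2 bytes per character here
```

With UTF-8 on the sending side and EUC_JP on the receiving side, the declared length is larger than the payload, so read_field raises. In the reverse direction the declared length is smaller than the payload, so the reader gets a truncated field without any error.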

As a result, pq_sendstring is not usable in this case, since we
don't have the true length of the string to be sent. So my first
patch did the same thing using pg_server_to_client() explicitly.
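
The idea of converting first and measuring afterwards can be sketched in Python as follows. This is only an illustration under assumptions, not the attached patch: server_to_client here is a hypothetical stand-in for the conversion step, and the 4-byte length prefix with a trailing NUL is an assumed framing.

```python
import struct

def server_to_client(data, server_enc, client_enc):
    """Hypothetical stand-in for the server-to-client encoding conversion."""
    return data.decode(server_enc).encode(client_enc)

def frame_fixed(data, server_enc, client_enc):
    """Convert explicitly first, then take the length of the converted bytes."""
    converted = server_to_client(data, server_enc, client_enc) + b"\0"
    return struct.pack("!I", len(converted)) + converted

def read_field(msg):
    """Read one length-prefixed field; fail if the message is too short."""
    (length,) = struct.unpack_from("!I", msg, 0)
    payload = msg[4:4 + length]
    if len(payload) < length:
        raise ValueError("insufficient data left in message")
    return payload

# Round trip in both directions: the declared length always matches the payload.
utf8_bytes = "日本語".encode("utf-8")
eucjp_bytes = "日本語".encode("euc_jp")
to_eucjp = read_field(frame_fixed(utf8_bytes, "utf-8", "euc_jp"))
to_utf8 = read_field(frame_fixed(eucjp_bytes, "euc_jp", "utf-8"))
```

Because the length is computed from the bytes that are actually sent, the reader consumes exactly one field regardless of which direction the conversion goes.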

That being said, I think the more important point is that the
consensus on the policy for logical replication between databases
with different encodings is to refuse the connection. The reason
is that conversion surely breaks BDR usage for some combinations
of encodings.

Anyway, the attached patch fixes the current encoding bug in
logical replication.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment                   Content-Type  Size
fix_logrep_conversion.patch  text/x-patch  1022 bytes
