RE: BUG #15230: "Logical decoding" is not sensitive to client encoding setting

From: Hillel Eilat <Hillel(dot)Eilat(at)attunity(dot)com>
To: Euler Taveira <euler(at)timbira(dot)com(dot)br>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: RE: BUG #15230: "Logical decoding" is not sensitive to client encoding setting
Date: 2018-06-17 09:54:50
Message-ID: DB5PR07MB14317A037DBE74B3A147F5E7F8720@DB5PR07MB1431.eurprd07.prod.outlook.com
Lists: pgsql-bugs

Thanks.

1. To answer your question: the default (textual) decoding mode is used.

2. client_encoding is in fact set on the replication connection.
The problem is that it has no effect.
The streamed data arrives in the server_encoding (Japanese EUC_JP in this case), whereas we expect UTF8, which was set as client_encoding.

To be more specific, below is the essence of the C code that establishes the connection via PQconnectdbParams(keywords, values, true);
This is the REPLICATION connection on which "START_REPLICATION SLOT "XXXXXXX" LOGICAL LLL/SSS" is executed later.
One would expect the data subsequently fetched via PQgetCopyData(...) to arrive in the client_encoding representation.
But this is not the case...
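
One way to observe the symptom on the consumer side is to check whether a received payload is well-formed UTF-8 at all. The following is a minimal sketch, not part of the original code: the function name looks_like_utf8 is hypothetical, and the check is only a heuristic, since some EUC_JP byte pairs happen to form valid UTF-8 sequences and would pass it.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns true if buf[0..len) is well-formed UTF-8.
 * Rejects stray continuation bytes, truncated sequences, overlong forms,
 * surrogates and code points above U+10FFFF.
 */
bool looks_like_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        unsigned char c = buf[i];
        size_t        need;     /* number of continuation bytes expected */
        unsigned long cp;       /* decoded code point */
        unsigned long min;      /* smallest code point allowed for this length */

        if (c < 0x80)
        {
            i++;                /* plain ASCII */
            continue;
        }
        else if ((c & 0xE0) == 0xC0) { need = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; min = 0x10000; }
        else
            return false;       /* continuation byte or invalid lead byte */

        if (len - i - 1 < need)
            return false;       /* sequence is cut off at the end of the buffer */

        for (size_t k = 1; k <= need; k++)
        {
            unsigned char cc = buf[i + k];

            if ((cc & 0xC0) != 0x80)
                return false;   /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;       /* overlong, out of range, or surrogate */

        i += need + 1;
    }
    return true;
}
```

For example, the hiragana letter "あ" is the bytes E3 81 82 in UTF8 and A4 A2 in EUC_JP. The first sequence passes the check, while the second fails because A4 cannot start a UTF-8 sequence.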

Your clarifications will be appreciated.

Thanks
Hillel.

char *pszClientEncoding = "UTF8"; // Set client encoding

i = 0; // Initial Array index

keywords[i] = "dbname";
values[i] = pszDbName == NULL ? "replication" : pszDbName;
i++;
keywords[i] = "replication";
values[i] = pszDbName == NULL ? "true" : "database";
i++;
keywords[i] = "fallback_application_name";
values[i] = pszProgName;
i++;

if (pszDbHost)
{
    keywords[i] = "host";
    values[i] = pszDbHost;
    i++;
}
if (pszDbUser)
{
    keywords[i] = "user";
    values[i] = pszDbUser;
    i++;
}
if (pszDbPort)
{
    keywords[i] = "port";
    values[i] = pszDbPort;
    i++;
}

if (pszClientEncoding) // Set client encoding
{
    keywords[i] = "client_encoding";
    values[i] = pszClientEncoding;
    i++;
}

/* Prompting for password here is not a matter of interest (the "-W" command option) */
//need_password = (dbgetpassword == 1 && dbpassword == NULL);

need_password = 0; // No point in this mechanism here

//do
{
    if (pszDbPassword)
    {
        keywords[i] = "password";
        values[i] = pszDecryptedPassword;
        i++;
    }

    /* PQconnectdbParams requires both arrays to be NULL-terminated,
       with or without a password entry */
    keywords[i] = NULL;
    values[i] = NULL;

    tmpconn = PQconnectdbParams(keywords, values, true);

    if (!tmpconn)
    {
        pSetup->config.logger_error((char *)pszLoggingOrg,__LINE__,kPG_LOGGER_SEVERITY_ERROR,"PQconnectdbParams(...) - Could not connect to the server.");
        return NULL;
    }

    if (PQstatus(tmpconn) == CONNECTION_BAD && PQconnectionNeedsPassword(tmpconn) && dbgetpassword != -1)
    {
        AT_STR->snprintf(szMsg, sizeof(szMsg), "Could not connect to server. Missing or improper password: %s",ar_PQerrorMessage(tmpconn));
        pSetup->config.logger_error((char *)pszLoggingOrg,__LINE__,kPG_LOGGER_SEVERITY_ERROR,szMsg);
        ar_PQfinish(tmpconn);
        return NULL;
    }
}
//while (need_password);
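
The keyword/value arrays handed to PQconnectdbParams must be parallel and NULL-terminated. A minimal sketch of one way to keep that invariant automatically is shown here. The names ConnParams, conn_params_init, conn_params_add and MAX_PARAMS are hypothetical and not part of libpq or of the code above.

```c
#include <stddef.h>

#define MAX_PARAMS 16   /* hypothetical capacity, including the NULL terminator */

/* Hypothetical container for the two parallel arrays. */
typedef struct
{
    const char *keywords[MAX_PARAMS];
    const char *values[MAX_PARAMS];
    size_t      count;  /* number of real entries, terminator excluded */
} ConnParams;

/* Start with an empty, already terminated list. */
void conn_params_init(ConnParams *p)
{
    p->count = 0;
    p->keywords[0] = NULL;
    p->values[0] = NULL;
}

/*
 * Append one keyword/value pair and re-terminate both arrays.
 * A NULL value is skipped, which covers optional settings such as
 * host, user, port or password. Returns 0 on success, -1 if full.
 */
int conn_params_add(ConnParams *p, const char *keyword, const char *value)
{
    if (value == NULL)
        return 0;                       /* optional parameter not supplied */
    if (p->count + 1 >= MAX_PARAMS)
        return -1;                      /* no room left for entry plus terminator */

    p->keywords[p->count] = keyword;
    p->values[p->count] = value;
    p->count++;

    p->keywords[p->count] = NULL;       /* arrays stay terminated after every add */
    p->values[p->count] = NULL;
    return 0;
}
```

With such a helper, the connection call would look like PQconnectdbParams(p.keywords, p.values, 1), and no branch can leave the arrays unterminated.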

-----Original Message-----
From: Euler Taveira [mailto:euler(at)timbira(dot)com(dot)br]
Sent: Thursday, June 14, 2018 5:28 PM
To: Hillel Eilat <Hillel(dot)Eilat(at)attunity(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15230: "Logical decoding" is not sensitive to client encoding setting

2018-06-05 5:29 GMT-03:00 PG Bug reporting form <noreply(at)postgresql(dot)org>:
> The plugin used is the common "test_decoding", which is shipped
> together with the kit.
>
What is the test_decoding output mode? By default, it uses textual mode. Did you set binary mode (force-binary=1)?

> There is a Japanese database for which encoding is defined as "EUC_JP".
> Ordinarily - we process the streamed data in UTF8 client encoding -
> thus maintaining a common general "consumer" functions.
> Consequently, prior to issuing PQconnectdbParams(keywords, values,
> true) - a {"client_encoding","UTF8"} couple is introduced.
> To be on the safe side - a couple of PQclientEncoding(pConn) /
> pg_encoding_to_char(iClientEncoding) is issued thereafter, for
> approving that UTF8 was properly set.
>
client_encoding should be set in the replication connection because if you set it later it won't be passed down to libpqwalreceiver.

[1] https://www.postgresql.org/docs/9.4/static/logicaldecoding-output-plugin.html#LOGICALDECODING-OUTPUT-MODE

--
Euler Taveira                                   Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
