| From: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> | 
|---|---|
| To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | parallel workers and client encoding | 
| Date: | 2016-06-07 01:45:04 | 
| Message-ID: | 1739a900-30ab-f48e-aec4-2b35475ecf02@2ndquadrant.com | 
| Lists: | pgsql-hackers | 
There appears to be a problem with how client encoding is handled in the 
communication from parallel workers.  In a parallel worker, the client 
encoding setting is inherited from its creating process as part of the 
GUC setup.  So any plain-text stuff the parallel worker sends to its 
leader is actually converted to the client encoding.  Since most data is 
sent in binary format, the plain-text provision applies mainly to notice 
and error messages.  At the other end, error messages are parsed using 
pq_parse_errornotice(), which internally uses routines that were meant 
for communication from the client, and therefore will convert everything 
back from the client encoding to the server encoding.  So this whole 
thing actually happens to work as long as round tripping is possible 
between the involved encodings.
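The round trip described above can be modeled outside PostgreSQL. The following is only a minimal sketch in Python, not the backend code: the names worker_send, leader_receive, and round_trip are hypothetical, Python's built-in "utf-8" and "koi8_r" codecs stand in for the server and client encodings, and a Python str stands in for server-encoded text.

```python
# Toy model of the worker -> leader message path; not PostgreSQL code.
# Assumption: server encoding is UTF-8 and client encoding is KOI8-R.
SERVER_ENCODING = "utf-8"
CLIENT_ENCODING = "koi8_r"


def worker_send(message: str) -> bytes:
    """Worker side: text leaves the worker converted to the client encoding."""
    return message.encode(CLIENT_ENCODING)


def leader_receive(payload: bytes) -> str:
    """Leader side: the payload is treated as client input and converted back."""
    return payload.decode(CLIENT_ENCODING)


def round_trip(message: str) -> str:
    """Full path of a notice or error message from worker to leader."""
    return leader_receive(worker_send(message))


if __name__ == "__main__":
    # Representable in KOI8-R, so the round trip is lossless.
    print(round_trip("division by zero"))

    # "ö" has no KOI8-R equivalent, so the conversion fails on the worker
    # side before the leader ever sees the original message.
    try:
        round_trip("Motörhead")
    except UnicodeEncodeError as exc:
        print("conversion failed in worker:", exc.reason)
```

In this model, as in the description above, the failure replaces the original message with a conversion error, so whatever the leader wanted to do with the original error never happens.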
In cases where it isn't, the difference is still hard to notice, 
because whether or not you get a parallel plan, one of the 
following happens:
not parallel: conversion error happens between server and client, client 
sees an error message about that
parallel: conversion error happens between worker and leader, worker 
generates an error message about that, sends it to leader, leader 
forwards it to client
The client sees the same error message in both cases.
To construct a case where this makes a difference, the leader has to be 
set up to catch certain errors.  Here is an example:
"""
create table test1 (a int, b text);
truncate test1;
insert into test1 values (1, 'a');
create or replace function test1() returns text language plpgsql
as $$
declare
   res text;
begin
   perform from test1 where a = test2();
   return res;
exception when division_by_zero then
   return 'boom';
end;
$$;
create or replace function test2() returns int language plpgsql
parallel safe
as $$
begin
   raise division_by_zero using message = 'Motörhead';
   return 1;
end
$$;
set force_parallel_mode to on;
select test1();
"""
With client_encoding = server_encoding, this will return a single row 
'boom'.  But with, say, database encoding UTF8 and 
PGCLIENTENCODING=KOI8R, it will error:
ERROR:  22P05: character with byte sequence 0xef 0xbe 0x83 in encoding 
"UTF8" has no equivalent in encoding "KOI8R"
CONTEXT:  parallel worker
(Note that changing force_parallel_mode does not force replanning in 
plpgsql, so if you run test1() first before setting force_parallel_mode, 
then you won't get the error.)
Attached is a patch that illustrates how this could be fixed.  There might 
be similar issues elsewhere.  The notification propagation in particular 
could be affected.
-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
| Attachment | Content-Type | Size | 
|---|---|---|
| parallel-client-encoding.patch | text/plain | 1.3 KB | 