Re: parallel workers and client encoding

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel workers and client encoding
Date: 2016-06-29 16:41:36
Message-ID: CA+Tgmoa7nD_xUhC-k28akV9K==+uEomLmihgfBhpNF-brkRFEg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 28, 2016 at 10:10 PM, Peter Eisentraut
<peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> On 6/27/16 5:37 PM, Robert Haas wrote:
>> Please find attached an a patch for a proposed alternative approach.
>> This does the following:
>>
>> 1. When the client_encoding GUC is changed in the worker,
>> SetClientEncoding() is not called.
>
> I think this could be a problem, because then the client encoding in the
> background worker process is inherited from the postmaster, which could in
> theory be anything. I think you need to set it at least once to the correct
> value.

Fixed in the attached version.

>> Thus, GetClientEncoding() in the
>> worker always returns the database encoding, regardless of how the
>> client encoding is set. This is better than your approach of calling
>> SetClientEncoding() during worker startup, I think, because the worker
>> could call a parallel-safe function with SET clause, and one of the
>> GUCs temporarily set could be client_encoding. That would be stupid,
>> but somebody could do it.
>
> I think if we're worried about this, then this should be an error, but
> that's a minor concern.

We can't actually throw an error at that point, because we really do
want the GUC to have the same value in the worker as it does in the
leader. Otherwise, anything that can observe the value of an
arbitrary GUC suddenly becomes parallel-restricted, which is certainly
unwanted.

> I realize that we don't have a good mechanism in the GUC code to distinguish
> these two situations.
>
> Then again, this shouldn't be so much different in concept from the
> restoring of GUC variables in the EXEC_BACKEND case. There is special code
> in set_config_option() to bypass some of the rules in that case.
> RestoreGUCState() should be able to get the same sort of pass.

I think we can use the existing flag InitializingParallelWorker to
handle this case. See attached.

> Also, set_config_option() knows something about what settings are allowed in
> a parallel worker, so I wonder if setting client_encoding would even work in
> spite of that?

I think that with this change, a SET client_encoding on a supposedly
parallel-safe function will just fail to have any effect when the
function runs inside a worker. The attached patch will make that kind
of thing fail outright when attempted inside a parallel worker.

>> 2. A new function pq_getmsgrawstring() is added. This is like
>> pq_getmsgstring() but it does no encoding conversion.
>> pq_parse_errornotice() is changed to use pq_getmsgrawstring() instead
>> of pq_getmsgstring(). Because of (1), when the leader receives an
>> ErrorResponse or NoticeResponse from the worker, it will not have been
>> subject to encoding conversion; because of this item, the leader will
>> not try to convert it either when initially parsing it. So the extra
>> encoding round-trip is avoided.
>
> I like that.
>
>> 3. The changes for NotifyResponse which you proposed are included
>> here, but with the modification that pq_getmsgrawstring() is used.
>
> and that.

Thanks for the review.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
parallel-encoding-fun-v2.patch invalid/octet-stream 6.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-06-29 16:58:47 Re: A couple of cosmetic changes around shared memory code
Previous Message Fabien COELHO 2016-06-29 16:27:53 Re: pgbench unable to scale beyond 100 concurrent connections