Re: GUC_REPORT for protocol tunables was: Re: Optimize binary serialization format of arrays with fixed size elements

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "A(dot)M(dot)" <agentm(at)themactionfaction(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: GUC_REPORT for protocol tunables was: Re: Optimize binary serialization format of arrays with fixed size elements
Date: 2012-01-25 01:13:48
Message-ID: 24418.1327454028@sss.pgh.pa.us
Lists: pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> On Tue, Jan 24, 2012 at 11:55 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I do wonder whether we are making a mountain out of a mole-hill here,
>> though. If I properly understand the proposal on the table, which
>> it's possible that I don't, but if I do, the new format is
>> self-identifying: when the optimization is in use, it sets a bit that
>> previously would always have been clear. So if we just go ahead and
>> change this, clients that have been updated to understand the new
>> format will work just fine. The server uses the proposed optimization
>> only for arrays that meet certain criteria, so any properly updated
>> client must still be able to handle the case where that bit isn't set.
>> On the flip side, clients that aren't expecting the new optimization
>> might break. But that's, again, no different than what happened when
>> we changed the default bytea output format. If you get bit, you
>> either update your client or shut off the optimization and deal with
>> the performance consequences of so doing.

> Well, the bytea experience was IMNSHO a complete disaster (It was
> earlier mentioned that jdbc clients were silently corrupting bytea
> datums) and should be held up as an example of how *not* to do things;

Yeah. In both cases, the (proposed) new output format is
self-identifying *to clients that know what to look for*. Unfortunately
it would only be the most anally-written pre-existing client code that
would be likely to spit up on the unexpected variations. What's much
more likely to happen, and did happen in the bytea case, is silent data
corruption. The lack of redundancy in binary data makes this even more
likely, and the documentation situation makes it even worse. If we had
had a clear binary-data format spec from day one that told people that
they must check for unexpected contents of the flag field and fail, then
maybe we could get away with considering not doing so to be a
client-side bug ... but I really don't think we have much of a leg to
stand on given the poor documentation we've provided.
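
A minimal sketch, in Python, of the kind of strict client-side check described above. The header layout (int32 dimension count, int32 flags, element type OID, then a length and lower bound per dimension, all in network byte order) is the existing binary array format. The function and exception names are hypothetical, and treating only bit 0 (has-nulls) as known is an assumption that reflects the format before the proposed optimization.

```python
import struct

MAX_DIMS = 6        # PostgreSQL's MAXDIM
KNOWN_FLAGS = 0x1   # assumption: bit 0 (array contains NULLs) is the only defined flag


class UnsupportedArrayFormat(ValueError):
    """Raised when the flags field carries bits this client does not understand."""


def parse_array_header(buf):
    """Parse a binary array header and fail loudly on unknown flag bits.

    Returns (ndim, flags, elem_oid, dims, offset), where dims is a list of
    (length, lower_bound) pairs and offset is where the element data starts.
    """
    if len(buf) < 12:
        raise ValueError("truncated array header")
    ndim, flags, elem_oid = struct.unpack_from("!iiI", buf, 0)

    # The point of the check: refuse anything we were not written to decode,
    # instead of misreading the payload and silently corrupting data.
    if flags & ~KNOWN_FLAGS:
        raise UnsupportedArrayFormat("unknown array flags: 0x%x" % (flags & 0xFFFFFFFF))
    if ndim < 0 or ndim > MAX_DIMS:
        raise ValueError("invalid number of dimensions: %d" % ndim)
    if len(buf) < 12 + 8 * ndim:
        raise ValueError("truncated dimension list")

    dims = []
    offset = 12
    for _ in range(ndim):
        length, lower_bound = struct.unpack_from("!ii", buf, offset)
        dims.append((length, lower_bound))
        offset += 8
    return ndim, flags, elem_oid, dims, offset
```

A client written this way would reject a header whose flags field has a new bit set, rather than decoding the elements with the old layout.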

> In regards to the array optimization, I think it's great -- but if you
> truly want to avoid blowing up user applications, it needs to be
> disabled automatically.

Right. We need to fix things so that this format will not be sent to
clients unless the client code has indicated ability to accept it.
A GUC is a really poor proxy for that.
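
As a purely illustrative sketch of "not sent unless the client has indicated ability to accept it": the decision below keys off a set of features the client code itself declared, and falls back to the old format otherwise. The feature name, the function, and the idea of a per-connection feature set are all hypothetical; nothing like this exists in the protocol discussed here.

```python
# Hypothetical feature name; not part of any real protocol.
COMPACT_ARRAYS = "compact_fixed_arrays"


def choose_array_format(client_features, array_is_eligible):
    """Pick the wire format for one array value.

    client_features: features the client explicitly declared it can decode
                     (empty for any client that predates the new format).
    array_is_eligible: whether this array meets the optimization's criteria.
    """
    if array_is_eligible and COMPACT_ARRAYS in client_features:
        return "compact"
    # Default for everyone else, so old clients keep getting bytes they understand.
    return "legacy"
```

The design point is that the default is the old format: a client that says nothing is never sent data it might misparse.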

regards, tom lane
