Re: GUC_REPORT for protocol tunables was: Re: Optimize binary serialization format of arrays with fixed size elements

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Mikko Tiihonen <mikko(dot)tiihonen(at)nitorcreations(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: GUC_REPORT for protocol tunables was: Re: Optimize binary serialization format of arrays with fixed size elements
Date: 2012-01-23 17:38:50
Message-ID: 20120123173850.GA13695@gmail.com
Lists: pgsql-hackers

On Mon, Jan 23, 2012 at 11:20:52AM -0500, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On Mon, Jan 23, 2012 at 9:59 AM, Marko Kreen <markokr(at)gmail(dot)com> wrote:
> >> Now that I think about it, same applies to bytea_output?
>
> > Probably so. But I think we need not introduce quite so many new
> > threads on this patch. This is, I think, at least thread #4, and
> > that's making the discussion hard to follow.
>
> Well, this is independent of the proposed patch, so I think a separate
> thread is okay. The question is "shouldn't bytea_output be marked
> GUC_REPORT"? I think that probably it should be, though I wonder
> whether we're not too late. Clients relying on it to be transmitted are
> not going to work with existing 9.0 or 9.1 releases; so maybe changing
> it to be reported going forward would just make things worse.

Well, in a complex setup it can change under you at will,
but as clients can process the data without knowing the
server state, maybe it's not a big problem. (Unless there
are old clients in the mix...)

Perhaps we can leave it as-is?

But this leaves the question of future policy for
data format changes in the protocol. Note that I'm talking
about both text and binary formats together here,
although we could have a different policy for each.

Also note that any kind of per-session flag is basically a GUC.

Question 1 - how does the client know which format the data is in?

1) new format is detectable from lossy GUC
2) new format is detectable from GUC_REPORT
3) new format is detectable from Postgres version
4) new format was requested in query (V4 proto)
5) new format is detectable from data (\x in bytea)

1. obviously does not work.
2. works, but requires changes across all infrastructure.
3. works and is simple, but painful.
4. is good, but in the future
5. is good, now
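
A minimal sketch of option 5 for the bytea text format, assuming the client sees the value as an ASCII string. The function name decode_bytea_text is made up for illustration. It relies on the fact that escape output never begins with \x, because there a backslash is always followed by another backslash or by three octal digits:

```python
def decode_bytea_text(text: str) -> bytes:
    """Decode a bytea text value without knowing the server's bytea_output."""
    if text.startswith("\\x"):
        # Hex format: "\x" followed by pairs of hex digits.
        return bytes.fromhex(text[2:])
    # Escape format: "\\" is a literal backslash, "\ooo" is one byte in
    # octal, and every other character stands for itself.
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out.append(ord(ch))
            i += 1
        elif text.startswith("\\\\", i):
            out.append(0x5C)  # a single backslash byte
            i += 2
        else:
            out.append(int(text[i + 1:i + 4], 8))
            i += 4
    return bytes(out)
```

With this the client gets the same bytes whichever setting the server (or a pooler in between) happened to use, so it does not need to track the GUC at all.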

Question 2 - how does the client request the new format?

1) Postgres new version forces it.
2) GUC_REPORT + non-detectable data
3) Lossy GUC + autodetectable data
4) GUC_REPORT + autodetectable data
5) Per-request data (V4 proto)

1. is painful.
2. is painful - all infra components need to know about the GUC.
3&4. are both ugly and non-maintainable in the long term. The only
difference is that with 4) the infrastructure can give slight
guarantees that it does not change under the client.
5. seems good...

Btw, it does not seem that a per-request metainfo change requires
a "major version". The client can just send an extra metainfo packet
before bind+execute if it knows the server version is good enough.
For older servers it can simply skip the extra info. [Oh yeah,
that requires that the data format is always autodetectable.]
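
Roughly what I mean, as a sketch only. The "FormatRequest" message and the version cutoff are hypothetical; nothing like this exists in the v3 protocol, and messages are simplified to (name, payload) pairs:

```python
from typing import List, Tuple

# Hypothetical cutoff: first server version (as an integer such as 90300)
# assumed to understand the extra packet. Not a real value.
MIN_VERSION_FOR_FORMAT_REQUEST = 90300

Message = Tuple[str, str]  # (message name, payload), simplified


def build_request(server_version_num: int, query: str,
                  wanted_format: str) -> List[Message]:
    """Return the messages a client would send for one query."""
    messages: List[Message] = [("Parse", query)]
    if server_version_num >= MIN_VERSION_FOR_FORMAT_REQUEST:
        # Hypothetical extra metainfo packet, sent before bind+execute.
        messages.append(("FormatRequest", wanted_format))
    # Older servers simply never see the extra packet.
    messages += [("Bind", ""), ("Execute", ""), ("Sync", "")]
    return messages
```

Either way the client still has to autodetect the format of what comes back, since an old server will answer in the old format without saying so.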

My conclusions:

1. Any change in data format should be compatible with old data.
IOW - if the client requested the new data format, it should always
accept the old format too.

2. Can we postpone minor data format changes on the wire until there
is a proper way for clients to request on-the-wire formats?

--
marko
