Re: Request for comment on setting binary format output per session

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Dave Cramer <davecramer(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Request for comment on setting binary format output per session
Date: 2023-10-04 14:17:28
Message-ID: 9bba81fa-210d-9dab-b37c-2fa26d2fe641@eisentraut.org
Lists: pgsql-hackers

On 31.07.23 18:27, Dave Cramer wrote:
> On Mon, 10 Jul 2023 at 03:56, Daniel Gustafsson
> <daniel(at)yesql(dot)se> wrote:
>
> > > On 25 Apr 2023, at 16:47, Dave Cramer
> > > <davecramer(at)gmail(dot)com> wrote:
> > >
> > > Patch attached with comments removed
> >
> > This patch no longer applies, please submit a rebased version on top
> > of HEAD.
>
> Rebased see attached

I have studied this thread now. It seems it has gone through the same
progression with the same (non-)result as my original patch on the subject.

I have a few intermediate conclusions:

- Doing it with a GUC is challenging. It's questionable layering to
have the GUC system control protocol behavior. It would allow weird
behavior where a GUC setting, maybe one attached to a user or a
database, would confuse clients such as psql or pg_dump. We probably
should make some of those clients more robust in any case. Also,
handling of GUCs through connection poolers is a challenge. It does
work, but it's more like opt-in, and so can't be fully relied on for
protocol correctness.

- Doing it with a session-level protocol-level setting is challenging.
We currently don't have that kind of thing. It's not clear how
connection poolers would/should handle it. Someone would have to work
all this out before this could be used.

- In either case, there are issues like what happens if there is a
connection pooler and types have different OIDs in different databases.
(Or, similarly, an extension is upgraded during the lifetime of a
session and a type's OID changes.) Also, possibly, what happens if a
type lives in different schemas in different databases.

- We could avoid some of the session-state issues by doing this per
request, like extending the Bind message somehow by appending the list
of types to be sent in binary. But the JDBC driver currently lists 24
types for which it supports binary, so that would require adding 24*4=96
bytes per request, which seems untenable.
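
To make the per-request arithmetic concrete, here is a minimal sketch of
what such a type list would cost on the wire. It assumes a hypothetical
encoding in which each type is sent as its 4-byte OID in network byte
order; no such Bind extension exists, and the OIDs below are
placeholders, not the JDBC driver's actual list.

```python
import struct

# PostgreSQL type OIDs are 4-byte unsigned integers.
OID_SIZE = 4


def encode_binary_type_list(type_oids):
    """Pack OIDs as big-endian 4-byte integers.

    Hypothetical layout for illustration only; the real Bind message
    has no such trailer.
    """
    return struct.pack(f"!{len(type_oids)}I", *type_oids)


# Placeholder OIDs only; this is NOT the JDBC driver's actual list.
placeholder_oids = list(range(16, 40))  # 24 values

payload = encode_binary_type_list(placeholder_oids)
print(len(payload))  # 24 * 4 = 96
```

A length prefix or message framing would add a few more bytes on top of
the 96, and the cost is paid again on every Bind.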

I think intuitively, this facility ought to work like client_encoding.
There, the client declares its capabilities, and the server has to
format the output according to the client's capabilities. That works,
and it also works through connection poolers. (It is a GUC.) If we can
model it like that as closely as possible, then we have a chance of
getting it working reliably. Notably, the value space for
client_encoding is a globally known fixed list of strings. We need to
figure out the right way to identify types globally: by fully-qualified
name, by base name, or by some combination of the two. We also need to
work out how that interacts with extensions, or whether we need a new
mechanism like UUIDs. I think that is something we need to work out, no
matter which protocol mechanism we end up using.
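
The identification problem can be illustrated with a small sketch. It
models two databases behind a pooler as plain dictionaries mapping
(schema, type name) to OID. The catalogs, the helper functions, and the
hstore OIDs are all made up for illustration; only the int4 OID (23) is
the real built-in value.

```python
# Hypothetical catalogs: (schema, type name) -> OID in each database.
# Extension types get whatever OID is free at install time, so the
# hstore OIDs here are invented and deliberately differ.
CATALOG_DB_A = {("pg_catalog", "int4"): 23, ("public", "hstore"): 16385}
CATALOG_DB_B = {("pg_catalog", "int4"): 23, ("ext", "hstore"): 24601}


def resolve_by_qualified_name(catalog, wanted):
    """Map each (schema, name) pair found in this catalog to its OID."""
    return {name: catalog[name] for name in wanted if name in catalog}


def resolve_by_base_name(catalog, wanted_base_names):
    """Map each base name to every OID carrying that name, in any schema."""
    result = {}
    for (schema, name), oid in catalog.items():
        if name in wanted_base_names:
            result.setdefault(name, []).append(oid)
    return result


wanted = [("public", "hstore")]
print(resolve_by_qualified_name(CATALOG_DB_A, wanted))  # found in A
print(resolve_by_qualified_name(CATALOG_DB_B, wanted))  # {} in B
print(resolve_by_base_name(CATALOG_DB_A, {"hstore"}))
print(resolve_by_base_name(CATALOG_DB_B, {"hstore"}))
```

In this toy setup a fully-qualified name fails in the second database
because the schema differs, while a base name resolves in both but to
different OIDs, and could be ambiguous if two schemas held a type of the
same name.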
