Re: Transform groups (more FE/BE protocol issues)

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Development <pgsql-hackers(at)postgreSQL(dot)org>, <pgsql-interfaces(at)postgreSQL(dot)org>
Subject: Re: Transform groups (more FE/BE protocol issues)
Date: 2003-05-06 00:31:39
Message-ID: Pine.LNX.4.44.0305052201430.1785-100000@peter.localdomain
Lists: pgsql-hackers pgsql-interfaces

Tom Lane writes:

> I've been thinking about this a little more; it seems to open up some
> questions about the current design of the new FE/BE protocol.

A transform group is a named object attached to a type. A transform group
converts a user-defined type to a standard SQL type or vice versa. (The
rationale being that clients know how to handle standard SQL types.) A
client selects a transform group as session state, either individually for
each type (SET TRANSFORM GROUP FOR TYPE mytype 'mygroup') or using a
default mechanism (SET DEFAULT TRANSFORM GROUP 'mygroup'), which means
that all types that have a transform group with the given name activate
that group.
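
To make the selection rules concrete, here is a rough sketch of how the session state could resolve to an active group for a given type. The catalog contents, the class and method names, and the rule that a per-type setting takes precedence over the default are all assumptions made for illustration; nothing here is an existing interface.

```python
# Hypothetical catalog: the transform groups each type defines.
TYPE_GROUPS = {
    "mytype": {"mygroup", "othergroup"},
    "point3d": {"mygroup"},
    "money2": {"othergroup"},
}


class Session:
    """Session state for transform group selection (illustrative only)."""

    def __init__(self):
        self.per_type = {}         # set by SET TRANSFORM GROUP FOR TYPE ...
        self.default_group = None  # set by SET DEFAULT TRANSFORM GROUP ...

    def set_group_for_type(self, type_name, group):
        if group not in TYPE_GROUPS.get(type_name, set()):
            raise ValueError(
                f"type {type_name!r} has no transform group {group!r}")
        self.per_type[type_name] = group

    def set_default_group(self, group):
        self.default_group = group

    def active_group(self, type_name):
        # Assumed precedence: an explicit per-type choice wins.
        if type_name in self.per_type:
            return self.per_type[type_name]
        # The default only activates for types that have a group of that name.
        if self.default_group in TYPE_GROUPS.get(type_name, set()):
            return self.default_group
        return None  # no transform group is active for this type
```

With the default set to 'mygroup', point3d would activate it and money2 would not, because money2 has no group of that name.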

> * There are two places presently in the protocol where the client can
> specify text/binary via a boolean (represented as int8). To move to a
> transform-group world, we could redefine that field as a group selector:
> 0 = standard text representation, 1 = standard binary representation,
> other values = reserved for future implementation.

I don't think we need any more representations than those two, and the
transform group feature can be independent from this.

Here's an example: a set of transform groups (for various data types)
that convert SQL data types to more C-like types, for example timestamp to
struct tm. Would you want to pass struct tm-data over the wire as a blob
of 36 bytes? I would rather get a standard binary representation with a
length word, so a middle layer can still copy this data around without
having to be data type-aware.
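
A small sketch of what I mean by the length word: a middle layer can split and forward values without knowing anything about what is inside them. The 4-byte big-endian length and the function names are assumptions for the example, and the struct tm-like value is simply modeled as nine 32-bit integers (36 bytes).

```python
import struct

LENGTH_WORD = struct.Struct("!I")  # assumed: 4-byte big-endian length


def frame(payload):
    """Prefix an opaque value with its length word."""
    return LENGTH_WORD.pack(len(payload)) + payload


def split_frames(buf):
    """Split a buffer into values using only the length words.

    No knowledge of the data types inside the payloads is needed.
    """
    values = []
    pos = 0
    while pos < len(buf):
        (n,) = LENGTH_WORD.unpack_from(buf, pos)
        pos += LENGTH_WORD.size
        if pos + n > len(buf):
            raise ValueError("truncated value")
        values.append(buf[pos:pos + n])
        pos += n
    return values


# A struct tm-like value: nine 32-bit integers, 36 bytes in total.
tm_blob = struct.pack("!9i", 39, 31, 0, 6, 4, 103, 2, 125, 0)
wire = frame(tm_blob) + frame(b"hello")
```

Here split_frames(wire) gives back the two payloads unchanged, although the code never interprets the 36 bytes.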

In fact, this suggests that the format field should not be represented at
all in the protocol and only be handled as session state. Users can just
say "I want these types in text and these types in binary" and that is
respected for the rest of the session. Obviously there are some backward
compatibility problems with this.

> * The DataRow/BinaryRow distinction obviously doesn't scale to multiple
> transform groups. I propose dropping the BinaryRow message type in
> protocol 3.0, and instead carrying the format code (group selector)
> somewhere else. A straightforward conversion would be to add it to the
> body of DataRow, but I'm not convinced that's the best place; again,
> read on.

We would need a format code for each returned column.

It would be the decision of the transform group about which format to
return. If you want a transform group that provides your client
application with binary data, then you write your transform functions to
return bytea. If you want a text format, write them to return cstring.
(Other, less subtle flagging mechanisms could be invented.)
DataRow/BinaryRow basically only tells the client which method is used to
signal the end of data. Both methods have some use, but there aren't a
lot of other methods that will come into use soon.

> * At what granularity do you wish to select the transform group type for
> data being transferred in or out? Right now we've essentially assumed
> that you only need to specify it once for an entire command result, but
> it's fairly easy to imagine scenarios where this isn't what you want.

As mentioned above, it would be per data type and session. I think that
makes sense. As you mentioned, certainly you would want to have different
choices for different data types. Also, session makes sense, because if
an application is set up to handle a given type in a given format in one
place, then it probably wants to handle it like that every time. This
would move the format decision from the command level to the session
level. Most configuration parameters are session level, so that makes
sense.

> * That leaves us with two issues: where does the client say what it
> wants,

SET-like commands; see the top of this message.

> and where does the backend report the actual transform group used for
> each column?

If you want to be pedantic, you don't report it at all, because the client
selected it, so it got it. The client will know how to read the data
pieces, because it knows whether they are text (terminated by a zero byte)
or binary (preceded by a length word). What exactly is inside doesn't
necessarily need to be reported. (We don't report the date style in each
result either.) If you want to report it, the RowDescription with one OID
for each column would seem the best place.
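
As an illustration of why the client needs nothing more than the text/binary choice per column, here is a sketch of a row reader. The format codes, the function names, and the 4-byte big-endian length are assumptions for the example and are not the actual message layout.

```python
import struct

TEXT, BINARY = 0, 1  # hypothetical per-column format codes


def read_field(buf, pos, fmt):
    """Return (value, next_pos) for the field starting at buf[pos]."""
    if fmt == TEXT:
        # Text: the value runs up to the terminating zero byte.
        end = buf.index(b"\x00", pos)
        return buf[pos:end], end + 1
    if fmt == BINARY:
        # Binary: a length comes first, then that many bytes of data.
        (n,) = struct.unpack_from("!I", buf, pos)
        start = pos + 4
        return buf[start:start + n], start + n
    raise ValueError(f"unknown format code {fmt}")


def read_row(buf, formats):
    """Read one value per entry in formats, in order."""
    fields = []
    pos = 0
    for fmt in formats:
        value, pos = read_field(buf, pos, fmt)
        fields.append(value)
    return fields
```

The reader only finds the boundaries of each value. What the bytes mean is left to whatever the client selected for that type.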

> * More or less the same considerations apply for parameter values being
> sent in a Bind message. Here I'd opt for always sending a transform
> group for each parameter value being sent.

Same here; you don't need it. If you set the format to X, then you send
the data in format X, and things work. If the data does not conform to
format X, the transform function will tell you soon enough.

> * The client can hardly be expected to select per-column transforms in
> Bind if it doesn't know the result column datatypes yet. In the
> protocol document as it stands today, there's no way to find out the
> result datatypes except a portal Describe --- which requires that you've
> already done Bind.

This is not a problem if transform groups are per-type, not per-column.

The consequence for the protocol: Keep the text/binary distinction, but
make it per-column in the result. For backward compatibility, the client
can still choose between text and binary on a command-level basis, but we
should move this to a session parameter, and if command and session
settings are incompatible, one prevails or we signal an error.
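
A sketch of how the per-column result could be derived from the two settings. Which side prevails on a conflict is left open above, so it is a parameter here; the function name, the text fallback when neither setting applies, and the type names are assumptions for illustration only.

```python
TEXT, BINARY = "text", "binary"


def resolve_column_formats(column_types, session_formats,
                           command_format=None, on_conflict="error"):
    """Pick a format for each result column.

    session_formats maps a type name to TEXT or BINARY (session state).
    command_format is the legacy command-level choice, or None if unset.
    on_conflict is "error", "session", or "command".
    """
    if on_conflict not in ("error", "session", "command"):
        raise ValueError(f"bad on_conflict value {on_conflict!r}")
    result = []
    for type_name in column_types:
        session_choice = session_formats.get(type_name)
        if (session_choice is None or command_format is None
                or session_choice == command_format):
            # No conflict: use whichever is set, falling back to text.
            chosen = session_choice or command_format or TEXT
        elif on_conflict == "session":
            chosen = session_choice
        elif on_conflict == "command":
            chosen = command_format
        else:
            raise ValueError(
                f"session wants {session_choice} for {type_name}, "
                f"command wants {command_format}")
        result.append(chosen)
    return result
```

An old client that only sets the command-level flag gets the same format for every column, as it does today, while a client that sets per-type session formats gets a mixed result.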

--
Peter Eisentraut peter_e(at)gmx(dot)net
