Re: FE/BE Protocol 3.0 Draft - Some Comments

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ljb <lbayuk(at)mindspring(dot)com>
Cc: pgsql-interfaces(at)postgresql(dot)org
Subject: Re: FE/BE Protocol 3.0 Draft - Some Comments
Date: 2003-04-21 01:28:33
Message-ID: 21549.1050888513@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-interfaces

ljb <lbayuk(at)mindspring(dot)com> writes:
> 1) I think the ErrorResponse message is way too complicated.

Agreed, it's much richer than it was before, but I disagree with your
opinion that empty complexity is being added. There are good examples
in the existing error message repertoire for each of the concepts I
propose adding. For instance, here are two examples of existing
messages that are offering hints:

elog(ERROR, "Unable to identify an operator '%s' for types '%s' and '%s'"
"\n\tYou will have to retype this query using an explicit cast",
NameListToString(op),
format_type_be(arg1), format_type_be(arg2));

fprintf(stderr, "IpcMemoryCreate: shmget(key=%d, size=%u, 0%o) failed: %s\n",
(int) memKey, size, (IPC_CREAT | IPC_EXCL | IPCProtection),
strerror(errno));

if (errno == EINVAL)
fprintf(stderr,
"\nThis error usually means that PostgreSQL's request for a shared memory\n"
"segment exceeded your kernel's SHMMAX parameter. You can either\n"
"reduce the request size or reconfigure the kernel with larger SHMMAX.\n"
"To reduce the request size (currently %u bytes), reduce\n"
"PostgreSQL's shared_buffers parameter (currently %d) and/or\n"
"its max_connections parameter (currently %d).\n"
"\n"
"If the request size is already small, it's possible that it is less than\n"
"your kernel's SHMMIN parameter, in which case raising the request size or\n"
"reconfiguring SHMMIN is called for.\n"
"\n"
"The PostgreSQL documentation contains more information about shared\n"
"memory configuration.\n\n",
size, NBuffers, MaxBackends);

Continuing to classify all that verbiage as primary error message isn't
going to make anyone's life simpler. On the other hand, if we split it
up into message/detail/hint as I was proposing, I think life *will* get
simpler for GUI clients (pgAdmin and suchlike) which won't have to
be prepared to cope with many-line primary error messages. Admittedly,
it will take them some work to make the extra info accessible via popup
windows or whatever --- but I think they'll be glad to do it. And if
they don't do it, well, not that much is being lost if the extra
verbiage doesn't get displayed.

Please note that I'm *not* proposing that all or even most error
messages should carry detail or hint fields. I'm just planning to split
existing messages into those parts where it seems appropriate, which is
probably only a few dozen places. The hints involved aren't silly, for
the most part anyway --- they got there through bitter experience.

> But the current v3 message spec calls for parsing out
> an indefinite number of up to 10 different sub-types (3 required, 7
> optional), and I don't see the need for this complexity.

You're welcome to ignore fields you don't care about. I do see a spec
deficiency here, which is that I intended at most one of any particular
field type per message, whereas you're evidently worrying what to do
with multiple instances. Will improve the document.

> To me, the other fields are not useful to most of us (file, linenumber,
> call-stack traceback), overkill (detail), or silly (hint).

Call-stack tracebacks are already there, in a limited form, and are
*very* useful to the people who get them:

regression=# create function foo(float8) returns float8 as '
regression'# begin
regression'# return 1.0 / $1;
regression'# end' language plpgsql;
CREATE FUNCTION

regression=# select foo(0);
WARNING: Error occurred while executing PL/pgSQL function foo
WARNING: line 2 at return
ERROR: division by zero
regression=#

Again, I'm not proposing adding new functionality, just relabeling
information that is already being issued so that receiving clients
can actually understand what it is and do something more reasonable
than they do now. Those "warning" lines are not obviously related
to the ERROR as far as existing client code can make out, and too
many clients just drop them on the floor.

As for file and linenumber, agreed those are of use to hackers only
--- but those of us who have been puzzling over trouble reports for
any length of time know how badly they are needed when they're needed.
I would rather put them into a separate field of the error report,
in which client software has the option to ignore them if it's not
showing error details, than put them right in the main error text.
(That was the only alternative we had up to now, thus it never got
done because the clutter factor was too high.)

Now we could go down a different path wherein we try to deal these
things on the server side --- for example, have some GUC variables
that control whether hints and location and such get embedded into
a single error string that's transmitted to the client. But it seems
to me that that's catering to dumb clients rather than giving the
clients the tools they need to be smart. I'd rather design a protocol
that lets user-interface decisions be made on the client side.

> 2) Please don't deprecate FunctionCall, nor say that a prepared "SELECT
> function($1)" can replace it. What about sending binary data to the backend
> (without having to escape it)? Is there some other way than FunctionCall?

The Bind message can do it.

> How would the large object interface work without FunctionCall?

You could build it out of spare parts using a set of prepared
statements, one for each underlying LO function. Execution would send
Bind and Execute in place of the FunctionCall message. I'm not currently
planning to actually rewrite libpq's LO code in this round, but in the
long run it would make sense to do so.

> 3) There is still no way to pass NULL as a FunctionCall argument; now is
> your chance to fix it if you want to. Not that I've seen a need for it.

I was originally planning to do that, but since I now see FunctionCall
as vestigial, there's not much point in adding more incompatibility to
it than we absolutely must.

> 4) Is there a good reason to continue having DataRow's field size value
> include itself, but BinaryRow's field size value not include itself?

No, this bothers me too. I would like to rationalize the representation
of binary data more, but exactly what to do is still an open question.
We have some conflicting precedents (FunctionCall vs. COPY BINARY file
formats), and the wild card in the whole thing is the possibility of
interposing format-translation routines (the old send/receive functions
again) to achieve some semblance of platform independence.

> For the record, I prefer to have lengths not
> include themselves. This applies to the new overall message length.

I would have done it that way, but that would require making
StartupMessage a special case (or more special than it is already),
since the precedent is already established that its length word includes
itself. We can't change that if we want to continue accepting
connections from old clients.

Also, if we don't introduce send/receive functions then I think we
really need to migrate BinaryRow towards length-includes-itself, because
that's what the underlying backend convention for varlena is. The
existing behavior of the protocol and libpq is actively broken on
machines with MAXALIGN > 4 because the representation exposed by libpq
doesn't align the contents of a varlena field correctly. (But adding
send/receive functions would allow us to avoid that issue, since the
on-the-wire representation wouldn't have to be the same as the backend's
anymore.)

> 5) Backward compatibility: Realizing that the primary focus will be
> V2 clients talking to V2/V3 servers ("upgrade your servers before your
> clients"), it would still be nice if V2/V3 clients could talk to a V2 server.

Yeah, I know, but I'm not personally willing to expend the effort needed
to make a dual-protocol V2/V3 libpq. If someone else wants to look at
that after the dust settles, they're surely welcome to. (We did get
away without a backwards-compatible libpq in the last protocol revision,
back at release 6.4, and there were relatively few complaints. So I'm
assuming we can do it again.)

> I see a problem here. The V2/V3 client writes a V3 startup packet to the
> server, and the V2 server responds with an error "FATAL: unsupported
> frontend protocol" and closes the connection. OK, the client can then
> retry connecting with V2 protocol. The problem is that the error message
> from the V2 server is in V2 format: byte1('E'), String Message.

Yeah, I've been wondering about that myself. One fairly straightforward
hack that could be used is to have a small limit (a few hundred) on the
expected response-message length to the startup packet. Whatever your
endianness, the V2 packet would look to have a huge length, and that
could trigger fallback code that parses it as V2 format. Or just punt
and assume the message must be complaining about protocol mismatch.
If you've got a better idea I'm all ears ...

regards, tom lane

In response to

Responses

Browse pgsql-interfaces by date

  From Date Subject
Next Message Glenn R Williams 2003-04-21 18:10:08 pl/python exceptions.ImportError: No module named _sre
Previous Message ljb 2003-04-21 00:20:26 FE/BE Protocol 3.0 Draft - Some Comments