Re: [INTERFACES] Revised proposal for libpq and FE/BE protocol changes

From: watts(at)humbug(dot)antnet(dot)com
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, pgsql-interfaces(at)postgresql(dot)org, watts(at)humbug(dot)antnet(dot)com
Subject: Re: [INTERFACES] Revised proposal for libpq and FE/BE protocol changes
Date: 1998-04-28 20:53:24
Message-ID: 199804282053.PAA21962@humbug.antnet.com
Lists: pgsql-hackers pgsql-interfaces

I suggest that an application can already use fork or fork/exec to
implement an asynchronous design. Would that also keep the
socket out of the application's domain?

Bob
watts(at)humbug(dot)antnet(dot)com

To: pgsql-hackers(at)postgreSQL(dot)org, pgsql-interfaces(at)postgreSQL(dot)org
Subject: [INTERFACES] Revised proposal for libpq and FE/BE protocol changes
Date: Tue, 28 Apr 1998 12:21:55 -0400
Message-ID: <7040(dot)893780515(at)sss(dot)pgh(dot)pa(dot)us>
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Sender: owner-pgsql-interfaces(at)hub(dot)org
Precedence: bulk

Here is a revised proposal that takes into account the discussions
of the last few days. Any comments?

I propose to revise libpq and modify the frontend/backend protocol
to provide the following benefits:
* Provide a clean way of reading multiple results from a single query
string. Among other things, this solves the problem of allowing a
single query to return several result sets with different descriptors.
* Allow a frontend to perform other work while awaiting the result of
a query.
* Add the ability to cancel queries in progress.
* Eliminate the need for frontends to issue dummy queries in order
to detect NOTIFY responses.
* Eliminate the need for libpq to issue dummy queries internally
to determine when a query is complete.

We can't break existing code for this, so the behavior of PQexec()
can't change. Instead, I propose adding new functions to the API.
Internally, PQexec will be reimplemented in terms of these new
functions, but old applications shouldn't notice any difference.

The new functions are:

bool PQsendQuery (PGconn *conn, const char *query);

Submits a query without waiting for the result. Returns TRUE if the
query has been successfully dispatched, otherwise FALSE (in the FALSE
case, an error message is left in conn->errorMessage).

PGresult* PQgetResult (PGconn *conn);

Waits for input from the backend, and consumes input until (a) a result is
available, (b) the current query is over, or (c) a copy in/out operation
is detected. NULL is returned if the query is over; in all other cases a
suitable PGresult is returned (which the caller must eventually free).
Note that no actual "wait" will occur if the necessary input has already
been consumed; see below.

bool PQisBusy (PGconn *conn);

Returns TRUE if a query operation is busy (that is, a call to PQgetResult
would block waiting for more input). Returns FALSE if PQgetResult would
return immediately.

void PQconsumeInput (PGconn *conn);

This can be called at any time to check for and process new input from
the backend. It returns no status indication, but after calling it
the application can use PQisBusy() and/or PQnotifies() to see if a query
was completed or a NOTIFY message arrived. This function will never wait
for more input to arrive.

int PQsocket (PGconn *conn);

Returns the Unix file descriptor for the socket connection to the backend,
or -1 if there is no open connection. This is a violation of modularity,
of course, but there is no alternative: an application that needs
asynchronous execution needs to be able to use select() to wait for input
from either the backend or any other input streams it may have. To use
select() the underlying socket must be made visible.

PGnotify *PQnotifies (PGconn *conn);

This function doesn't change; we just observe that notifications may
become available as a side effect of executing either PQgetResult() or
PQconsumeInput(), not just PQexec().

void PQrequestCancel (PGconn *conn);

Issues a cancel request if possible. There is no direct way to tell whether
this has any effect ... see discussion below.

Discussion:

An application can continue to use PQexec() as before, and notice
very little difference in behavior.

Applications that want to be able to handle multiple results from a
single query should replace PQexec calls with logic like this:

    // Submit the query
    if (! PQsendQuery(conn, query))
        reportTheError();
    // Wait for and process result(s)
    while ((result = PQgetResult(conn)) != NULL) {
        switch (PQresultStatus(result)) {
            // ... process result, for example:
            case PGRES_COPY_IN:
                // ... copy data here ...
                if (PQendcopy(conn))
                    reportTheError();
                break;
            // ...
        }
        PQclear(result);
    }
    // When we fall out of the loop, we're done and ready for a new query

Note that PQgetResult will always report errors by returning a PGresult
with status PGRES_NONFATAL_ERROR or PGRES_FATAL_ERROR, not by returning
NULL (since NULL implies non-error termination of the processing loop).

PQexec() will be implemented as follows:

    if (! PQsendQuery(conn, query))
        return makeEmptyPGresult(conn, PGRES_FATAL_ERROR);
    lastResult = NULL;
    while ((result = PQgetResult(conn)) != NULL) {
        PQclear(lastResult);
        lastResult = result;
    }
    return lastResult;

This maintains the current behavior that the last result of a series
of commands is returned by PQexec. (The old implementation is only
capable of doing that correctly in a limited set of cases, but in the
cases where it behaves usefully at all, that's how it behaves.)

There is a small difference in behavior, which is that PQexec will now
return a PGresult with status PGRES_FATAL_ERROR in cases where the old
implementation would just have returned NULL (and set conn->errorMessage).
However, any correctly coded application should handle this the same way.

In the above examples, the frontend application is still synchronous: it
blocks while waiting for the backend to reply to a query. This is often
undesirable, since the application may have other work to do, such as
responding to user input. Applications can now handle that by using
PQisBusy and PQconsumeInput along with PQsendQuery and PQgetResult.

The general idea is that the application's main loop will use select()
to wait for input (from either the backend or its other input sources).
When select() indicates that input is pending from the backend, the app
will call PQconsumeInput, followed by checking PQisBusy and/or PQnotifies
to see what has happened. If PQisBusy returns FALSE then PQgetResult
can safely be called to obtain and process a result without blocking.
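
As a rough illustration of the select() step in such a main loop, here is
a minimal sketch. It is not part of libpq: the function name
wait_for_input and the INPUT_* flags are made up for this example, and
backend_fd is assumed to be the descriptor returned by PQsocket(conn).
The other descriptor stands for whatever other input source the
application has.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical flags for this sketch; not libpq names. */
enum { INPUT_BACKEND = 1, INPUT_OTHER = 2 };

/*
 * Wait up to timeout_ms milliseconds for input on either descriptor.
 * Returns a bitmask of INPUT_* flags (0 on timeout), or -1 if select() fails.
 */
int wait_for_input(int backend_fd, int other_fd, long timeout_ms)
{
    fd_set readable;
    struct timeval tv;
    int maxfd = backend_fd > other_fd ? backend_fd : other_fd;
    int ready = 0;

    FD_ZERO(&readable);
    FD_SET(backend_fd, &readable);
    FD_SET(other_fd, &readable);

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    if (select(maxfd + 1, &readable, NULL, NULL, &tv) < 0)
        return -1;

    if (FD_ISSET(backend_fd, &readable))
        ready |= INPUT_BACKEND;   /* time to call PQconsumeInput */
    if (FD_ISSET(other_fd, &readable))
        ready |= INPUT_OTHER;     /* service the other input source */
    return ready;
}
```

When INPUT_BACKEND is set, the application would call PQconsumeInput and
then check PQisBusy and/or PQnotifies as described above.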

Note also that NOTIFY messages can arrive asynchronously from the backend.
They can be detected *without issuing a query* by calling PQconsumeInput
followed by PQnotifies. I expect a lot of people will build "partially
async" applications that detect notifies this way but still do all their
queries through PQexec (or better, PQsendQuery followed by a synchronous
PQgetResult loop). This compromise allows notifies to be detected without
wasting time by issuing null queries, yet the basic logic of issuing a
series of queries remains simple.

Finally, since the application can retain control while waiting for a
query response, it becomes meaningful to try to cancel a query in progress.
This is done by calling PQrequestCancel(). Note that PQrequestCancel()
may not have any effect --- if there is no query in progress, or if the
backend has already finished the query, then it *will* have no effect.
The application must continue to follow the result-reading protocol after
issuing a cancel request. If the cancel is successful, its effect will be
to cause the current query to fail and return an error message.

PROTOCOL CHANGES:

We should change the protocol version number to 2.0.
It would be possible for the backend to continue to support 1.0 clients,
if you think it's worth the trouble to do so.

1. New message type:

    Command Done
        Byte1('Z')

The backend will emit this message at completion of processing of every
command string, just before it resumes waiting for frontend input.
This change eliminates libpq's current hack of issuing empty queries to
see whether the backend is done. Note that 'Z' must be emitted after
*every* query or function invocation, no matter how it terminated.

2. The RowDescription ('T') message is extended by adding a new value
for each field. Just after the type-size value, there will now be
an int16 "atttypmod" value. (Would someone provide text specifying
exactly what this value means?) libpq will store this value in
a new "adtmod" field of PGresAttDesc structs.
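
To make the extended layout concrete, here is a sketch of decoding one
field entry of a RowDescription message. Only the position and width of
atttypmod come from the text above. The rest is an assumption made for
the example: a null-terminated field name, then an int32 type OID, then
the int16 type size, all in network byte order. The FieldDesc struct and
parse_field_desc are hypothetical names, not libpq code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-field descriptor; "adtmod" is the new value. */
typedef struct {
    const char *name;     /* points into the message buffer */
    uint32_t    adtid;    /* type OID (assumed int32) */
    int16_t     adtsize;  /* type size (int16) */
    int16_t     adtmod;   /* atttypmod, sent right after the type size */
} FieldDesc;

static uint16_t get_u16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t get_u32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Decode one field entry starting at buf.
 * Returns the number of bytes consumed, or -1 if the entry is incomplete.
 */
long parse_field_desc(const unsigned char *buf, size_t len, FieldDesc *out)
{
    const unsigned char *nul = memchr(buf, '\0', len);
    const unsigned char *p;
    size_t name_len;

    if (nul == NULL)
        return -1;                      /* name not terminated yet */
    name_len = (size_t)(nul - buf) + 1; /* include the terminator */
    if (len - name_len < 4 + 2 + 2)
        return -1;                      /* OID, size, or atttypmod missing */

    p = buf + name_len;
    out->name    = (const char *)buf;
    out->adtid   = get_u32(p);
    out->adtsize = (int16_t)get_u16(p + 4);
    out->adtmod  = (int16_t)get_u16(p + 6);
    return (long)(name_len + 8);
}
```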

3. The "Start Copy In" response message is changed from 'D' to 'G',
and the "Start Copy Out" response message is changed from 'B' to 'H'.
These changes eliminate potential confusion with the data row messages,
which also have message codes 'D' and 'B'.
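
With these codes, a frontend can classify a response by its first byte
alone. The following sketch shows one way to do that for the codes
mentioned in this proposal. The MsgKind enum and classify_message are
invented for illustration; all other message types are lumped into
MSG_OTHER.

```c
/* Hypothetical classification of the message codes discussed here. */
typedef enum {
    MSG_ROW_DESCRIPTION,  /* 'T' */
    MSG_DATA_ROW,         /* 'D' or 'B' */
    MSG_START_COPY_IN,    /* 'G' (formerly 'D') */
    MSG_START_COPY_OUT,   /* 'H' (formerly 'B') */
    MSG_COMMAND_DONE,     /* 'Z' (new) */
    MSG_OTHER             /* anything not covered by this sketch */
} MsgKind;

/* Map a message's leading code byte to its kind. */
MsgKind classify_message(char code)
{
    switch (code) {
    case 'T':
        return MSG_ROW_DESCRIPTION;
    case 'D':
    case 'B':
        return MSG_DATA_ROW;      /* no longer ambiguous with copy start */
    case 'G':
        return MSG_START_COPY_IN;
    case 'H':
        return MSG_START_COPY_OUT;
    case 'Z':
        return MSG_COMMAND_DONE;  /* backend is ready for the next query */
    default:
        return MSG_OTHER;
    }
}
```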

4. The frontend may request cancellation of the current query by sending
a single byte of OOB (out-of-band) data. The contents of the data byte
are irrelevant, since the cancellation will be triggered by the associated
signal and not by the data itself. (But we should probably specify that
the byte be zero, in case we later think of a reason to have different
kinds of OOB messages.) There is no specific reply to this message.
If the backend does cancel a query, the query terminates with an ordinary
error message indicating that the query was cancelled.
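
On the frontend side, sending the cancel request could be as small as the
sketch below. The name send_cancel_oob is hypothetical, and the sketch
assumes sock is a connected TCP socket (for example, the descriptor that
PQsocket would return). It sends a zero byte, as suggested above.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Send one zero byte of out-of-band data on sock.
 * Returns 0 on success, -1 on failure (errno is set by send()).
 */
int send_cancel_oob(int sock)
{
    const char zero = 0;

    return send(sock, &zero, 1, MSG_OOB) == 1 ? 0 : -1;
}
```

Since there is no specific reply, the caller learns the outcome only from
the result of the query itself.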

regards, tom lane
