Re: Speed dblink using alternate libpq tuple storage

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-24 00:22:24
Message-ID: 20120324002224.GA19635@gmail.com
Lists: pgsql-hackers

I saw that Kyotaro already answered, but I'll give my view as well.

On Thu, Mar 22, 2012 at 06:07:16PM -0400, Tom Lane wrote:
> AFAICT it breaks async processing entirely, because many changes have been
> made that fail to distinguish "insufficient data available as yet" from
> "hard error". As an example, this code at the beginning of
> getAnotherTuple:
>
> /* Get the field count and make sure it's what we expect */
> if (pqGetInt(&tupnfields, 2, conn))
> ! return EOF;
>
> is considering three cases: it got a 2-byte integer (and can continue on),
> or there aren't yet 2 more bytes available in the buffer, in which case it
> should return EOF without doing anything, or pqGetInt detected a hard
> error and updated the connection error state accordingly, in which case
> again there is nothing to do except return EOF. In the patched code we
> have:
>
> /* Get the field count and make sure it's what we expect */
> if (pqGetInt(&tupnfields, 2, conn))
> ! {
> ! /* Whole the message must be loaded on the buffer here */
> ! errmsg = libpq_gettext("protocol error\n");
> ! goto error_and_forward;
> ! }
>
> which handles neither the second nor third case correctly: it thinks that
> "data not here yet" is a hard error, and then makes sure it is an error by
> destroying the parsing state :-(. And if in fact pqGetInt did log an
> error, that possibly-useful error message is overwritten with an entirely
> useless "protocol error" text.

No, "protocol error" really is the only error case here:

- pqGetInt() does not set errors.

- The V3 getAnotherTuple() is called only if the packet is fully in the buffer.

> I don't think the error return cases for the row processor have been
> thought out too well either. The row processor is not in charge of what
> happens to the PGresult, and it certainly has no business telling libpq to
> just "exit immediately from the topmost libpq function". If we do that
> we'll probably lose sync with the data stream and be unable to recover use
> of the connection at all. Also, do we need to consider any error cases
> for the row processor other than out-of-memory?

No, the rule is *not* "exit to topmost", but "exit PQisBusy()".

This is exactly so that any code that does not expect row-processor
behaviour continues to work.

Also, from the programmer's POV, this means the row-processor callback
causes minimal changes to existing APIs.

> If so it might be a good
> idea for it to have some ability to store a custom error message into the
> PGconn, which it cannot do given the current function API.

There already was such a function, but it was a row-processor-specific hack
that could leak out and create confusion, so I rejected it. Instead there
should be a generic error-setting function, equivalent to the current
libpq-internal error setting.

But such a generic error-setting function would require reviewing all libpq
error states, as it allows an error state to appear in new situations. Also,
we need well-defined behaviour for client-side errors vs. incoming
server errors.

Considering that even the current cut-down patch is troubling committers,
I would definitely suggest postponing such a generic error setter to 9.3.

Especially as it does not change anything coding-style-wise.

> In the same vein, I am fairly uncomfortable with the blithe assertion that
> a row processor can safely longjmp out of libpq. This was never foreseen
> in the original library coding and there are any number of places that
> that might break, now or in the future. Do we really need to allow it?
> If we do, it would be a good idea to decorate the libpq functions that are
> now expected to possibly longjmp with comments saying so. Tracing all the
> potential call paths that might be aborted by a longjmp is an essential
> activity anyway.

I think we *should* allow exceptions, but only in a limited number of APIs.

Basically, the usefulness for users vs. the effort from our side
is clearly on the side of providing it.

But it's up to us to define what *limited* means (what needs the
least effort from us), so that later, when users want to use exceptions
in a callback, they can pick the right API.

Currently it seems only PQexec() + multiple SELECTs can give trouble,
as the previous PGresult is kept on the stack. Should we unsupport
PQexec or multiple SELECTs?

But such a case is broken even without exceptions - or at least
very confusing. Not sure what to do with it.

In any case, "decorating" libpq functions is the wrong approach. It
suggests that the caller of e.g. PQexec() needs to handle any possible
behaviour of an unknown callback. That will not work. Instead, the allowed
functions should simply be listed in the row-processor documentation.

Basically, a custom callback should always be matched by a caller that
knows about it and knows how to handle it. Not sure how to put
such a suggestion into the documentation, though.

> Another design deficiency is PQskipResult(). This is badly designed for
> async operation because once you call it, it will absolutely not give back
> control until it's read the whole query result. (It'd be better to have
> it set some kind of flag that makes future library calls discard data
> until the query result end is reached.) Something else that's not very
> well-designed about it is that it might throw away more than just incoming
> tuples. As an example, suppose that the incoming data at the time you
> call it consists of half a dozen rows followed by an ErrorResponse. The
> ErrorResponse will be eaten and discarded, leaving the application no clue
> why its transaction has been aborted, or perhaps even the entire session
> cancelled. What we probably want here is just to transiently install a
> row processor that discards all incoming data, but the application is
> still expected to eventually fetch a PGresult that will tell it whether
> the server side of the query completed or not.

I guess it's designed for rolling the connection forward in an exception
handler... And it's blocking-only indeed. Considering that the better
approach is to drop the connection instead of trying to save it,
its usefulness is questionable.

I'm OK with dropping it.

> Lastly, as to the pqgetrow patch, the lack of any demonstrated test case
> for these functions makes me uncomfortable as to whether they are well
> designed. Again, I'm unconvinced that the error handling is good or that
> they work sanely in async mode. I'm inclined to just drop these for this
> go-round, and to stick to providing the features that we can test via the
> dblink patch.

I simplified the github test cases and attached as patch.

Could you please give a more concrete critique of the API?

The main idea behind PQgetRow is that it does not replace any existing
API function; instead it acts as an addition:

* Sync case: PQsendQuery() + PQgetResult(). PQgetRow() should be called
before PQgetResult() until it returns NULL, then proceed with PQgetResult()
to get the final state.

* Async case: PQsendQuery() + PQconsumeInput() + PQisBusy() + PQgetResult().
Here PQgetRow() should be called before PQisBusy() until it returns
NULL, then proceed with PQisBusy() as usual.

It only returns rows, never any error state PGresults.

The main advantage of including PQgetRow() together with the low-level
rowproc API is that it allows de-emphasizing the more complex parts of
the rowproc API (exceptions, early exit, skipresult, custom error messages),
and then dropping/undocumenting them, or simply calling them
postgres-internal-only.

--
marko

Attachment Content-Type Size
rowproc-demos.diff text/x-diff 10.8 KB
