Re: PQputCopyEnd doesn't adhere to its API contract

From: David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PQputCopyEnd doesn't adhere to its API contract
Date: 2014-09-08 22:20:25
Message-ID: CAKFQuwaO7b2PVoaVYO8NeOtN3k_JNFsu8C3EqXz68A_Eq8Bang@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 8, 2014 at 11:45 AM, Robert Haas [via PostgreSQL] <
ml-node+s1045698n5818200h24(at)n5(dot)nabble(dot)com> wrote:

> On Thu, Sep 4, 2014 at 6:38 PM, David Johnston
> <[hidden email] <http://user/SendEmail.jtp?type=node&node=5818200&i=0>>
> wrote:
>
> > One of the trade-offs I mentioned...its more style than anything but
> > removing the parenthetical (if there is not error in the command) and
> > writing it more directly seemed preferable in an overview such as this.
> >
> > Better: The function will either throw an error or return a PGresult
> > object[...]
>
> Nope. This is not C++, nor is it the backend. It will not throw
> anything.
>
>
​The current sentence reads:
"​The response to this (if there is no error in the command) will be a
PGresult object bearing a status code of PGRES_COPY_OUT or PGRES_COPY_IN
(depending on the specified copy direction)."

Maybe this is taken for granted by others, and thus can be excluded here,
but I'm trying to specify what happens if there is an error in the
command... Apparently one does not get back a PGresult object...

Would simply saying: "A successful response to this will be a PGresult
object..." be accurate (enough)?

> > I appreciate the time you have taken on this and will look at my
> thoughts
> > with the new understanding you have given me.
> >
> > Thank You!
>
> Thanks for getting involved. I apologize if I was brusque here or at
> other times in the past (or future). Unfortunately I've been insanely
> busy and it doesn't bring out the best in me, but I really do value
> your (and everyone's efforts) to move PostgreSQL forward.
>
>
​The tone of all your responses, including the first one pointing out my
lack of supporting context and initially over-ambitious patch, have been
very professional. A lot of what you've said resonates since I think
subconsciously I understood that I needed to make fundamental changes to my
approach and style but it definitely helps to have someone point out
specific items and concerns to move those thoughts to the conscious part of
the mind. Thank you again for doing that for me.

​Ignoring style and such did anything I wrote help to clarify your
understanding of why pgPutCopyEnd does what it does? As I say this and
start to try and read the C code I think that it is a good source for
understanding novice assumptions but there is a gap between the docs and
the code - though I'm not sure you've identified the only (or even proper)
location.

Unlike PQputCopyData there is no explicit logic (pqIsnonblocking(conn)" in
PQputCopyEnd for non-blocking mode (there is an implicit one via pqFlush).
This does seem like an oversight - if a minor one since the likihood of
not being able to add the EOF to the connection's buffer seems highly
unlikely - but I would expect on the basis of symmetry alone - for both of
them to have buffer filled testing logic. And, depending on how large
*errormsg is, the risk of being unable to place the data in the buffer
isn't as small and the expected EOF case.

I'm getting way beyond my knowledge level here but my assumption from the
documentation was that the async mode behavior of returning zero revolved
around retrying in order to place the data into the buffer - not retrying
to make sure the buffer is empty. The caller deals with that on their own
based upon their blocking mode decision. Though we would want to call
pqFlush for blocking mode callers here since this function should block
until the buffer is clear.

​So, I thought I agreed with your suggestion that if the final pqFlush
returns a 1 that pqPutCopyEnd should return 0. Additionally, if in
non-blocking mode, and especially if *errormsg is not NULL, the available
connection buffer length logic in pqPutCopyData should be evaluated here as
well.

However, the 0 being returned due to the pqFlush really isn't anything
special for a non-blocking mode user (they have already been told to call
pqFlush themselves after calling pqPutCopyEnd) and doesn't apply for
blocking mode. Admittedly they could skip their call if they get a return
value of 1 from pqPutCopyEnd but I'm not sure recommending that
optimization has much going for it. And, again as you said (I am just
discovering this for myself as much as possible), if the pqFlush caused the
0 you would not want to retry whereas in the filled buffer version you
would. So we do have to pick one or the other situation and adjust the
documentation accordingly.

The most useful and compatible solution is to make pqPutCopyEnd synchronous
regardless of the user selected blocking mode - which it mostly is now but
the final pqFlush should be in a loop - and adjust the documentation in the
areas noted in my patch-email to accommodate that fact.

Regardless, the logic surrounding placing the *errormsg string into the
buffer needs affirmation and a note whether it will block waiting to be put
on a full connection buffer.

Note that non-blocking users seeing a 1 on the pqPutCopyEnd probably still
end up doing their own pqFlush looping calls but that is not something that
we can be certain of. What happens if the buffer isn't empty and the
non-blocking caller invokes pqGetResult(...) on the connection because we
returned a 1 from pqPutCopyEnd? Two situations, the buffer would be
expected to eventually clear without error and the buffer would eventually
clear with an error (i.e., the pqPutCopyEnd should have returned -1)? Does
it even matter what the end result of the copy should have been?

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/PQputCopyEnd-doesn-t-adhere-to-its-API-contract-tp5803240p5818253.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G Johnston 2014-09-08 22:24:58 Re: PQputCopyEnd doesn't adhere to its API contract
Previous Message Tomas Vondra 2014-09-08 21:53:40 Re: bad estimation together with large work_mem generates terrible slow hash joins