Re: OOM in libpq and infinite loop with getCopyStart()

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru>, David Steele <david(at)pgmasters(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: OOM in libpq and infinite loop with getCopyStart()
Date: 2016-04-01 07:05:44
Message-ID: CAA4eK1+_gHzRNkBZNAQTAD6C5a03Kfi5bg7GaLum5p5Ken-NEw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 1, 2016 at 10:34 AM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:
> On Fri, Apr 1, 2016 at 12:48 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > I thought about this patch a bit more...
> >
> > When I first looked at the patch, I didn't believe that it worked at
> > all: it failed to return a PGRES_COPY_XXX result to the application,
> > and yet nonetheless switched libpq's asyncStatus to PGASYNC_COPY_XXX?
> > Wouldn't things be hopelessly confused? I tried it out and saw that
> > indeed it seemed to work in psql, and after tracing through that found
> > that psql has no idea what's going on, but when psql issues its next
> > command PQexecStart manages to get us out of the copy state (barring
> > more OOM failures, anyway). That seems a bit accidental, though,
> > and for sure it wasn't documented in the patch.
>
> From the patch, that's mentioned:
> + * Stop if we are in copy mode and error has occurred, the pending
results
> + * will be discarded during next execution in PQexecStart.
>

Yeah, and same is mentioned in PQexecStart function as well which indicates
that above assumption is right.

>
> > Anyway, the short of my review is that we need more clarity of thought
> > about what state libpq is in after a failure like this, and what that
> > state looks like to the application, and how that should affect how
> > libpq reacts to application calls.
>

Very valid point. So, if we see with patch, I think libpq will be
in PGASYNC_COPY_XXX state after such a failure and next time when it tries
to again execute statement, it will end copy mode and then allow to proceed
from there. This design is required for other purposes like silently
discarding left over result. Now, I think the other option(if possible)
seems to be that we can put libpq in PGASYNC_IDLE, if we get such an error
as we do at some other places in case of error as below and return OOM
failure as we do in patch.

PQgetResult()

{

..

if (flushResult ||

pqWait(TRUE, FALSE, conn) ||

pqReadData(conn) < 0)

{

/*

* conn->errorMessage has been set by pqWait or pqReadData. We

* want to append it to any already-received error message.

*/

pqSaveErrorResult(conn);

conn->asyncStatus = PGASYNC_IDLE;

return pqPrepareAsyncResult(conn);

}
..
}

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2016-04-01 07:18:29 Re: Incorrect format in error message
Previous Message Michael Paquier 2016-04-01 07:04:32 Re: Speedup twophase transactions