Re: PQexec() hangs on OOM

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Oleksandr Shulgin <oleksandr(dot)shulgin(at)zalando(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: PQexec() hangs on OOM
Date: 2015-09-14 01:36:32
Message-ID: CAB7nPqQYOf-MYtgJ5UkvpNKOqOOG-vnXsTs=PJd6gsMDM8LS3Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-bugs

On Sat, Sep 12, 2015 at 6:11 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Mon, Sep 7, 2015 at 1:40 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
> wrote:
>>
>> On Sat, Sep 5, 2015 at 9:45 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
>> wrote:
>>
>> Now, to move into the serious things...
>>
>> + /*
>> + * Advance inStart to show that the copy related message has been
>> + * processed.
>> + */
>> + conn->inStart = conn->inCursor;
>> This change...
>>
>> + /* getCopyStart() moves
>> inStart itself */
>> conn->asyncStatus =
>> PGASYNC_COPY_IN;
>> - break;
>> + continue;
>> ... And this change are risky for a backpatch. And they actually
>> break the existing replication protocol

First, many thanks for providing your input! I am really looking
forward into fixing those problems appropriately.

> Can you please explain how will it break replication protocol?
> I have done the required handling for Copy Both mode as well in attached
> patch similar to what was done for other Copy modes in previous patch.
> Check if you still find it as broken for replication?

When not enough data has been received from the server, it seems that
we should PQclear the existing result, and leave immediately
pqParseInput3 with the status PGASYNC_BUSY to be able to wait for more
data. Your patch, as-is, is breaking that promise (and this comes from
the first versions of my patches). It seems also that we should not
write an error message on connection in this case to be consistent in
the behavior of back-branches. For the OOM case, we definitely need to
take the error code path though, as your patch is correctly doing, and
what mine legendary failed to do.

+ if (pqGetInt(&result->numAttributes, 2, conn))
+ {
+ errmsg = libpq_gettext("extraneous data in COPY start message");
+ goto advance_and_error;
+ }
Here an error message is not needed, and this message should have been
formulated as "insufficient data in blah" either way.

> I have only kept the changes for COPY modes, so that once we settle on
> those, I think similar changes could be done for getParamDescriptions()

Yeah, I looked into this one as well, resulting in patch 0002
attached. In this case we have the advantage to only receive data from
the server, so pqPrepareAsyncResult is handling the failure just fine.
Attached as well is the test case I used (similar to previous one,
just extended a bit to report the error message), after getting the
result from PQsendDescribePrepared, PQresultStatus is set correctly on
error and reports "out of memory" to the user. What do you think about
it?
Regards,
--
Michael

Attachment Content-Type Size
0001-Fix-OOM-error-handling-in-COPY-protocol-of-libpq.patch text/x-patch 7.1 KB
0002-Fix-OOM-error-handling-in-BIND-protocol-of-libpq.patch text/x-patch 3.6 KB
0003-Remove-dead-code-of-libpq-protocol-3.patch text/x-patch 4.8 KB
extended.c text/x-csrc 1.4 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tatsuo Ishii 2015-09-14 04:15:34 Re: [BUGS] BUG #13611: test_postmaster_connection failed (Windows, listen_addresses = '0.0.0.0' or '::')
Previous Message Michael Paquier 2015-09-13 22:57:11 Re: PQexec() hangs on OOM