Re: OOM in libpq and infinite loop with getCopyStart()

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: OOM in libpq and infinite loop with getCopyStart()
Date: 2016-04-19 07:19:37
Message-ID: CAB7nPqQMe1vnHZX5wr_Oi57ORQ=f7_BC9Eegughf+GrSBMDHeg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 18, 2016 at 4:48 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> Another, perhaps, better idea is to add some more extra logic to
> pg_conn as follows:
> bool is_emergency;
> PGresult *emergency_result;
> bool is_emergency_consumed;
> So as when an OOM shows up, is_emergency is set to true. Then a first
> call of PQgetResult returns the PGresult with the OOM in
> emergency_result, setting is_emergency_consumed to true and switching
> is_emergency back to false. Then a second call to PQgetResult enforces
> the consumption of any results remaining without waiting for them, at
> the end returning NULL to the caller, resetting is_emergency_consumed
> to false. That's different compared to the extra failure
> PGASYNC_COPY_FAILED in the fact that it does not break PQisBusy().

So, in terms of code, it gives more or less the attached. I have
coupled the emergency_result with an enum flag emergency_status that
has 3 states:
- NONE, which is the default
- ENABLE, which is when an OOM or other error has been found in libpq.
Making the next call of PQgetResult return the emergency_result
- CONSUMED, set after PQgetResult is consuming emergency_result, to
return to caller NULL so as it can get out of any PQgetResult loop
expecting a result at the end of. Once NULL is returned, the flag is
switched back to NONE.
This works in the case of getCopyStart because the OOM failures are
happening before any messages are sent to server.

The checks for the emergency result are done in PQgetResult, so as any
upper-level routine gets the message. Tom, others, what do you think?
--
Michael

Attachment Content-Type Size
libpq-oom-copy-v2.patch text/x-diff 3.8 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2016-04-19 08:03:41 Re: [COMMITTERS] pgsql: Add trigonometric functions that work in degrees.
Previous Message Michael Paquier 2016-04-19 05:47:15 Re: VS 2015 support in src/tools/msvc