Re: psql leaks memory on query cancellation

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: psql leaks memory on query cancellation
Date: 2018-04-12 21:10:15
Message-ID: 18075.1523567415@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> I imagine that this indicates that control-C processing allocates some
> memory it doesn't free, resulting in an "island" up at the end of memory
> that prevents glibc from releasing all the free memory it's got. Whether
> that's an actual leak, or just memory we're holding onto in hopes of
> reusing it, isn't clear. (valgrind might be useful.)

So I poked into this a little, and realized that the "island" is in
fact the error PGresult we get back from the server when the query is
cancelled. We used to print that and throw it away ... but since the
addition of \errverbose, psql just sits on it, at least till the next
error. Since that's up against the end of memory, it prevents freeing
all the space belonging to the huge query result we cancelled midway
through collection of.

The problem, therefore, is that libpq reads the error message and builds a
PGresult for it before it throws away the partially-collected query result
it was working on. This is dumb. Quite aside from the question of when
memory can be released back to the OS, we are putting ourselves at
unnecessary risk of OOM.

Therefore, I propose the attached patch, which simply sees to it that
we discard any partial query result at the start of error message
collection not the end. This makes the behavior very much better,
at least on Linux.

I think this is a back-patchable bug fix; certainly so at least back
to 9.6 where \errverbose was added. Versions before that do not show
the persistent memory bloat the OP is complaining of, so that what
we have here is arguably a performance regression. Comments?

> BTW, I also notice that either psql or libpq seems to take a darn
> long time to release a several-GB-sized query result. That might
> be improvable as well.

I looked into that too, and found that what I was seeing was that
if I exit my pager with "q", psql continues to run, formatting the
enormous query result and writing it to nowhere. Since we've got
SIGPIPE disabled, nothing happens to stop it; although if you have
the presence of mind to press control-C, you do get control back
pretty quickly. I wonder whether fe_utils/print.c ought to be
adjusted to stop on output errors, along the lines of

- if (cancel_pressed)
+ if (cancel_pressed || ferror(fout))
break;

However, the implications of that aren't entirely clear, so I merely
suggest that as a topic for research not an immediately proposed patch.

regards, tom lane

Attachment Content-Type Size
smarter-libpq-error-handling.patch text/x-diff 1.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2018-04-12 21:58:59 Re: crash with sql language partition support function
Previous Message Christoph Berg 2018-04-12 20:27:18 Re: submake-errcodes