Re: Is there a memory leak in commit 8561e48?

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Postgres hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, "jian(dot)long(at)i-soft(dot)com(dot)cn" <jian(dot)long(at)i-soft(dot)com(dot)cn>
Subject: Re: Is there a memory leak in commit 8561e48?
Date: 2018-04-20 00:49:05
Message-ID: 20180420004905.GE2024@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 19, 2018 at 03:10:42PM +0900, Michael Paquier wrote:
> You are right. I can easily see the leak if I use for example a
> background worker which connects to a database, and launches many
> transactions in a row. The laziest reproducer I have is to patch one of
> my bgworkers to launch millions of transactions in a tight loop and the
> leak is plain (this counts relations automatically, does not matter):
> https://github.com/michaelpq/pg_plugins/tree/master/count_relations
>
> TopMemoryContext is associated to a session, so the comment in
> AtEOXact_SPI() is wrong.

I have been looking at this one this morning, and I can see at least two
problems:
1) When SPI_connect_ext is used in an atomic context, then the
allocation of _SPI_stack should happen in TopTransactionContext instead
of TopMemoryContext. This way, any callers of SPI_connect would not be
impacted by any memory leaks.
2) Error stacks with non-atomic calls leak memorya anyway. It is easy
to find leaks of _SPI_stack in the regression tests when an ERROR
happens in a PL call which lead to AtEOXact_SPI being called in
AbortTransaction. See that as an example:
@@ -283,6 +285,12 @@ AtEOXact_SPI(bool isCommit)
errmsg("transaction left non-empty SPI stack"),
errhint("Check for missing \"SPI_finish\" calls.")));

+ if (_SPI_current != NULL && !_SPI_current->atomic && _SPI_stack != NULL)
+ ereport(WARNING,
+ (errcode(ERRCODE_WARNING),
+ errmsg("non-atomic transaction left non-empty SPI stack"),
+ errhint("Check after non-atomic \"SPI_connect_ext\" calls.")));

The cleanest approach I can think about is to have SPI use its own
memory context which gets cleaned up in AtEOXact_SPI, but I would like
to hear more from Peter Eisentraut and Andrew Dunstand first as
author/committer and reviewer as it is their stuff.

I am attaching a preliminary patch, which fixes partially the leak, but
that does not take care of the leaks caused by the error stacks.
--
Michael

Attachment Content-Type Size
spi_patch.patch text/x-diff 1.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2018-04-20 00:52:02 Re: Corrupted btree index on HEAD because of covering indexes
Previous Message Amit Langote 2018-04-20 00:48:50 Re: Oddity in tuple routing for foreign partitions