Re: [report] memory leaks in COPY FROM on partitioned table

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Kohei KaiGai <kaigai(at)heterodb(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [report] memory leaks in COPY FROM on partitioned table
Date: 2018-07-24 08:50:09
Message-ID: 9deb7dc5-4b50-f1d7-3e73-2ff69a4f5122@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Kaigai-san,

Thanks for the report and the patch.

On 2018/07/24 11:43, Kohei KaiGai wrote:
> Further investigation I did:
>
> CopyFrom() calls ExecFindPartition() to identify the destination child
> table of partitioned table.
> Then, it internally calls get_partition_for_tuple() to get partition
> index according to the key value.
> This invocation is not under the per-tuple context.
>
> In case of hash-partitioning, get_partition_for_tuple() calls
> hash-function of key data type; which is hash_numeric in my case.
> The hash_numeric fetches the key value using PG_GETARG_NUMERIC(0). It
> internally calls pg_detoast_datum() which may allocate new memory if
> varlena datum is not uncompressed long (32bit) format.
>
> Once this patch attached, PostgreSQL backend process has been working
> with about 130MB memory consumption for 20min right now (not finished
> yet...)
> Before the patch applied, its memory consumption grows up about
> 10BM/sec, then terminated a few hours later.
>
> P.S,
> I think do_convert_tuple() in ExecFindPartition() and
> ConvertPartitionTupleSlot() may also allocate memory out of the
> per-tuple context, however, I could not confirmed yet, because my test
> case has TupleConversionMap == NULL.

Your patch takes care of allocation happening inside
get_partition_for_tuple, but as you mention there might be others in its
caller ExecFindPartition. So, I think we should switch to the per-tuple
context in ExecFindPartition.

When I tried to do that, I discovered that we have to be careful about
releasing some of the memory that's allocated in ExecFindPartition
ourselves instead of relying on the reset of per-tuple context to take
care of it. That's because some of the structures that ExecFindPartition
assigns the allocated memory to are cleaned up at the end of the query, by
when it's too late to try to release per-tuple memory. So, the patch I
ended up with is slightly bigger than simply adding a
MemoryContextSwitchTo() call at the beginning of ExecFindPartition.
Please find it attached.

Thanks,
Amit

Attachment Content-Type Size
v1-0001-Use-per-tuple-memory-in-ExecFindPartition.patch text/plain 3.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2018-07-24 09:02:00 Re: Indicate anti-wraparound autovacuum in log_autovacuum_min_duration
Previous Message Daniel Gustafsson 2018-07-24 08:33:48 Re: Finding database for pg_upgrade missing library