Re: Should pg 11 use a lot more memory building an spgist index?

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Alexander Korotkov <akorotkov(at)postgresql(dot)org>, Bruno Wolff III <bruno(at)wolff(dot)to>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Should pg 11 use a lot more memory building an spgist index?
Date: 2018-10-30 07:30:22
Message-ID: 023391a7-55db-6948-71ff-328218b540de@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On 2018/10/30 4:48, Tom Lane wrote:
> I was confused about why the memory leak in Bruno's example is so much
> larger in HEAD than v11; spgbeginscan does allocate more stuff than
> before, but only if numberOfOrderBys > 0, which surely doesn't apply for
> the exclusion-check code path. Eventually I realized that the difference
> is because commit 2a6368343 made SpGistScanOpaqueData a good deal larger
> than it had been (10080 vs 6264 bytes, on my x86_64 box), so it was just
> the failure to pfree that struct that accounted for the bigger leak.
>
> There's another issue with 2a6368343, though: it added a couple of
> fmgr_info_copy calls to spgbeginscan. I don't understand why it'd be a
> good idea to do that rather than using the FmgrInfos in the index's
> relcache entry directly. Making a copy every time risks a memory leak,
> because spgendscan has no way to free any fn_extra data that gets
> allocated by the called function; and by the same token it's inefficient,
> because the function has no way to cache fn_extra data across queries.
>
> If we consider only the leak aspect, this coding presents a reason why
> we should try to change things as Alvaro suggests. However, because of
> the poor-caching aspect, I think this code is pretty broken anyway,
> and once we fix it the issue is much less clear-cut.
>
> (Looking around, it seems like we're very schizophrenic about whether
> to copy index relcache support function FmgrInfos or just use them
> directly. But nbtree and hash seem to always do the latter, so I think
> it's probably reasonable to standardize on that.)

Agreed about trying to avoid fmgr_info_copy(), at least in this case.

By the way, do I get it right that the reason to want to avoid copying in
this instance is that the currently active context could be a long-lived
one, as in the case of create index, alter table add constraint, etc.
calling the copying code for every tuple? There are other instances of
fmgr_info_copy throughout index AM implementations, including the helper
function ScanKeyEntryInitializeWithInfo() called from them, but the copies
made in those cases don't appear very leak-prone.

> Anyway, the attached proposed patch for HEAD makes Bruno's problem go
> away, and it also fixes an unrelated leak introduced by 2a6368343
> because it reallocates so->orderByTypes each time through spgrescan.
> I think we should apply this, and suitable subsets in the back branches,
> to fix the immediate problem. Then we can work at leisure on a more
> invasive HEAD-only patch, if anyone is excited about that.

Makes sense to fix it like this for back-patching.

Thanks,
Amit

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Devart 2018-10-30 09:12:45 Connectivity Support for PostgreSQL 11 in dbForge Data Compare for PostgreSQL
Previous Message Thomas Kellerer 2018-10-30 07:10:34 Re: Different memory allocation strategy in Postgres 11?

Browse pgsql-hackers by date

  From Date Subject
Next Message Laurenz Albe 2018-10-30 07:42:41 Re: Getting fancy errors when accessing information_schema on 10.5
Previous Message Amit Langote 2018-10-30 06:51:46 Re: ToDo: show size of partitioned table