Re: Should pg 11 use a lot more memory building an spgist index?

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Alexander Korotkov <akorotkov(at)postgresql(dot)org>, Bruno Wolff III <bruno(at)wolff(dot)to>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Should pg 11 use a lot more memory building an spgist index?
Date: 2018-10-30 10:10:31
Message-ID: f7da983e-e489-3f05-5446-48812c8e35fc@lab.ntt.co.jp
Lists: pgsql-general pgsql-hackers

On 2018/10/30 4:48, Tom Lane wrote:
> I wrote:
>> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>>> How about modifying SysScanDescData to have a memory context member,
>>> which is created by systable_beginscan and destroyed by endscan?
>
>> I think it would still have problems, in that it would affect in which
>> context index_getnext delivers its output. We could reasonably make
>> these sorts of changes in places where the entire index_beginscan/
>> index_getnext/index_endscan call structure is in one place, but that
>> doesn't apply for the systable functions.
>
> Actually, after looking more closely, index_getnext doesn't return a
> palloc'd object anyway, or at least not one that the caller is responsible
> for managing. So maybe that could work.

Instead of SysScanDescData, could we add one to IndexScanDescData? It's
somewhat clear that catalog scans don't leak much, but user index scans
can, as seen in this case.

In this particular case, I see that it's IndexCheckExclusion() that
invokes check_exclusion_or_unique_constraint() repeatedly for each heap
tuple after it finishes building the index on the heap. The latter
performs index_beginscan()/index_endscan() for every heap tuple, but
doesn't bother to release the memory allocated for the IndexScanDesc and
its members.

I've tried to fix that with the attached patches.

0001 adds the ability for callers of index_beginscan() to specify a
memory context. index_beginscan_internal() switches to that context before
calling ambeginscan and stores it into a new field xs_scan_cxt of
IndexScanDescData, though maybe storing it is unnecessary.

0002 builds on that and adds the ability for callers of
check_exclusion_or_unique_constraint() to specify a memory context. Using
that, it fixes the leak by teaching IndexCheckExclusion() and
ExecInsertIndexTuples() to pass a memory context that's reset before
calling check_exclusion_or_unique_constraint() for the next tuple.

The following example shows that the patches work.

With HEAD:

create table foo (a int4range);
alter table foo add exclude using spgist (a with &&);
-- this consumes about 1GB
insert into foo select int4range(i,i+1) from generate_series(1, 100000) i;
alter table foo drop constraint foo_a_excl;
-- this too
alter table foo add exclude using spgist (a with &&);

Patched:

create table foo (a int4range);
alter table foo add exclude using spgist (a with &&);
-- memory consumption doesn't grow above 37MB
insert into foo select int4range(i,i+1) from generate_series(1, 100000) i;
alter table foo drop constraint foo_a_excl;
-- ditto
alter table foo add exclude using spgist (a with &&);

Thoughts?

Thanks,
Amit

Attachment Content-Type Size
v1-0001-Allow-IndexScanDesc-allocation-in-caller-specifie.patch text/plain 9.4 KB
v1-0002-Add-a-MemoryContext-arg-to-check_exclusion_or_uni.patch text/plain 6.1 KB
