Re: brininsert optimization opportunity

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>, Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ashwin Agrawal <ashwinstar(at)gmail(dot)com>
Subject: Re: brininsert optimization opportunity
Date: 2023-12-12 11:25:23
Message-ID: d4694006-5578-61a0-24e9-d3015dce4974@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12/11/23 16:41, Tomas Vondra wrote:
> On 12/11/23 16:00, Alexander Lakhin wrote:
>> Hello Tomas and Soumyadeep,
>>
>> 25.11.2023 23:06, Tomas Vondra wrote:
>>> I've done a bit more cleanup on the last version of the patch (renamed
>>> the fields to start with bis_ as agreed, rephrased the comments / docs /
>>> commit message a bit) and pushed.
>>
>> Please look at a query, which produces warnings similar to the ones
>> observed upthread:
>> CREATE TABLE t(a INT);
>> INSERT INTO t SELECT x FROM generate_series(1,10000) x;
>> CREATE INDEX idx ON t USING brin (a);
>> REINDEX index CONCURRENTLY idx;
>>
>> WARNING:  resource was not closed: [1863] (rel=base/16384/16389,
>> blockNum=1, flags=0x93800000, refcount=1 1)
>> WARNING:  resource was not closed: [1862] (rel=base/16384/16389,
>> blockNum=0, flags=0x93800000, refcount=1 1)
>>
>> The first bad commit for this anomaly is c1ec02be1.
>>
>
> Thanks for the report. I haven't investigated what the issue is, but it
> seems we fail to release the buffers in some situations - I'd bet we
> fail to call the cleanup callback in some place, or something like that.
> I'll take a look tomorrow.
>

Yeah, just as I expected this seems to be a case of failing to call the
index_insert_cleanup after doing some inserts - in this case the inserts
happen in table_index_validate_scan, but validate_index has no idea it
needs to do the cleanup.

The attached 0001 patch fixes this by adding the call to validate_index,
which seems like the proper place as it's where the indexInfo is
allocated and destroyed.

But this makes me think - are there any other places that might call
index_insert without then also doing the cleanup? I did grep for the
index_insert() calls, and it seems OK except for this one.

There's a call in toast_internals.c, but that seems OK because that only
deals with btree indexes (and those don't need any cleanup). The same
logic applies to unique_key_recheck(). The rest goes through
execIndexing.c, which should do the cleanup in ExecCloseIndices().

Note: We should probably call the cleanup even in the btree cases, even
if only to make it clear it needs to be called after index_insert().

I was thinking maybe we should have some explicit call to destroy the
IndexInfo, but that seems rather inconvenient - it'd force everyone to
carefully track lifetimes of the IndexInfo instead of just relying on
memory context reset/destruction. That seems quite error-prone.

I propose we do a much simpler thing instead - allow the cache may be
initialized / cleaned up repeatedly, and make sure it gets reset at
convenient place (typically after index_insert calls that don't go
through execIndexing). That'd mean the cleanup does not need to happen
very far from the index_insert(), which makes the reasoning much easier.
0002 does this.

But maybe there's a better way to ensure the cleanup gets called even
when not using execIndexing.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0001-call-cleanup-in-validate_index.patch text/x-patch 839 bytes
0002-call-index_insert_cleanup-repeatedly.patch text/x-patch 2.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nazir Bilal Yavuz 2023-12-12 11:29:03 Re: Show WAL write and fsync stats in pg_stat_io
Previous Message Jehan-Guillaume de Rorthais 2023-12-12 10:52:46 Re: Sorting regression of text function result since commit 586b98fdf1aae