Re: Ignoring BRIN for HOT udpates seems broken

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Ignoring BRIN for HOT udpates seems broken
Date: 2022-05-28 20:50:59
Message-ID: 00babc97-7fa9-8b1d-fce2-066de9b8a25d@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 5/28/22 21:24, Matthias van de Meent wrote:
> On Sat, 28 May 2022 at 16:51, Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> Hi,
>>
>> while working on some BRIN stuff, I realized (my) commit 5753d4ee320b
>> ignoring BRIN indexes for HOT is likely broken. Consider this example:
>>
>> ----------------------------------------------------------------------
>> CREATE TABLE t (a INT) WITH (fillfactor = 10);
>>
>> INSERT INTO t SELECT i
>> FROM generate_series(0,100000) s(i);
>>
>> CREATE INDEX ON t USING BRIN (a);
>>
>> UPDATE t SET a = 0 WHERE random() < 0.01;
>>
>> SET enable_seqscan = off;
>> EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) SELECT * FROM t WHERE a = 0;
>>
>> SET enable_seqscan = on;
>> EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) SELECT * FROM t WHERE a = 0;
>> ----------------------------------------------------------------------
>>
>> which unfortunately produces this:
>>
>> QUERY PLAN
>> ---------------------------------------------------------------
>> Bitmap Heap Scan on t (actual rows=23 loops=1)
>> Recheck Cond: (a = 0)
>> Rows Removed by Index Recheck: 2793
>> Heap Blocks: lossy=128
>> -> Bitmap Index Scan on t_a_idx (actual rows=1280 loops=1)
>> Index Cond: (a = 0)
>> Planning Time: 0.049 ms
>> Execution Time: 0.424 ms
>> (8 rows)
>>
>> SET
>> QUERY PLAN
>> -----------------------------------------
>> Seq Scan on t (actual rows=995 loops=1)
>> Filter: (a = 0)
>> Rows Removed by Filter: 99006
>> Planning Time: 0.027 ms
>> Execution Time: 7.670 ms
>> (5 rows)
>>
>> That is, the index fails to return some of the rows :-(
>>
>> I don't remember the exact reasoning behind the commit, but the commit
>> message justifies the change like this:
>>
>> There are no index pointers to individual tuples in BRIN, and the
>> page range summary will be updated anyway as it relies on visibility
>> info.
>>
>> AFAICS that's a misunderstanding of how BRIN uses visibility map, or
>> rather does not use. In particular, bringetbitmap() does not look at the
>> vm at all, so it'll produce incomplete bitmap.
>>
>> So it seems I made a boo boo here. Luckily, this is a PG15 commit, not a
>> live issue. I don't quite see if this can be salvaged - I'll think about
>> this a bit more, but it'll probably end with a revert.
>
> The principle of 'amhotblocking' for only blocking HOT updates seems
> correct, except for the fact that the HOT flag bit is also used as a
> way to block the propagation of new values to existing indexes.
>
> A better abstraction would be "amSummarizes[Block]', in which updates
> that only modify columns that are only included in summarizing indexes
> still allow HOT, but still will see an update call to all (relevant?)
> summarizing indexes. That should still improve performance
> significantly for the relevant cases.
>

Yeah, I think that might/should work. We could still create the HOT
chain, but we'd have to update the BRIN indexes. But that seems like a
fairly complicated change to be done this late for PG15.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message osumi.takamichi@fujitsu.com 2022-05-29 12:25:12 RE: Perform streaming logical transactions by background workers and parallel apply
Previous Message Andres Freund 2022-05-28 20:22:45 Re: fix stats_fetch_consistency value in postgresql.conf.sample