Re: Ignoring BRIN for HOT udpates seems broken

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Ignoring BRIN for HOT udpates seems broken
Date: 2022-05-28 19:24:59
Message-ID: CAEze2Wjq6iiFGCiVCdT4AZCMByxVdMmGdhBJdgYMVPH2sdu=XQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 28 May 2022 at 16:51, Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> Hi,
>
> while working on some BRIN stuff, I realized (my) commit 5753d4ee320b
> ignoring BRIN indexes for HOT is likely broken. Consider this example:
>
> ----------------------------------------------------------------------
> CREATE TABLE t (a INT) WITH (fillfactor = 10);
>
> INSERT INTO t SELECT i
> FROM generate_series(0,100000) s(i);
>
> CREATE INDEX ON t USING BRIN (a);
>
> UPDATE t SET a = 0 WHERE random() < 0.01;
>
> SET enable_seqscan = off;
> EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) SELECT * FROM t WHERE a = 0;
>
> SET enable_seqscan = on;
> EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) SELECT * FROM t WHERE a = 0;
> ----------------------------------------------------------------------
>
> which unfortunately produces this:
>
> QUERY PLAN
> ---------------------------------------------------------------
> Bitmap Heap Scan on t (actual rows=23 loops=1)
> Recheck Cond: (a = 0)
> Rows Removed by Index Recheck: 2793
> Heap Blocks: lossy=128
> -> Bitmap Index Scan on t_a_idx (actual rows=1280 loops=1)
> Index Cond: (a = 0)
> Planning Time: 0.049 ms
> Execution Time: 0.424 ms
> (8 rows)
>
> SET
> QUERY PLAN
> -----------------------------------------
> Seq Scan on t (actual rows=995 loops=1)
> Filter: (a = 0)
> Rows Removed by Filter: 99006
> Planning Time: 0.027 ms
> Execution Time: 7.670 ms
> (5 rows)
>
> That is, the index fails to return some of the rows :-(
>
> I don't remember the exact reasoning behind the commit, but the commit
> message justifies the change like this:
>
> There are no index pointers to individual tuples in BRIN, and the
> page range summary will be updated anyway as it relies on visibility
> info.
>
> AFAICS that's a misunderstanding of how BRIN uses visibility map, or
> rather does not use. In particular, bringetbitmap() does not look at the
> vm at all, so it'll produce incomplete bitmap.
>
> So it seems I made a boo boo here. Luckily, this is a PG15 commit, not a
> live issue. I don't quite see if this can be salvaged - I'll think about
> this a bit more, but it'll probably end with a revert.

The principle of 'amhotblocking' for only blocking HOT updates seems
correct, except for the fact that the HOT flag bit is also used as a
way to block the propagation of new values to existing indexes.

A better abstraction would be "amSummarizes[Block]', in which updates
that only modify columns that are only included in summarizing indexes
still allow HOT, but still will see an update call to all (relevant?)
summarizing indexes. That should still improve performance
significantly for the relevant cases.

-Matthias

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey Lepikhov 2022-05-28 19:53:20 Re: Compare variables of composite type with slightly different column types
Previous Message Andres Freund 2022-05-28 19:14:42 Re: [RFC] building postgres with meson