Re: Use of additional index columns in rows filtering

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Maxim Ivanov <hi(at)yamlcoder(dot)me>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, markus(dot)winand(at)winand(dot)at
Subject: Re: Use of additional index columns in rows filtering
Date: 2023-08-03 18:17:05
Message-ID: afd0fdad-7860-dcb3-10a1-caf0a76c3fe0@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 8/3/23 18:47, Peter Geoghegan wrote:
> On Thu, Aug 3, 2023 at 4:20 AM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>> Which is just the 7 buffers ...
>>
>> Did I do something wrong?
>
> I think that it might have something to do with your autovacuum
> settings. Note that the plan that you've shown for the master branch
> isn't the same one that appears in
> src/test/regress/expected/create_index.out for the master branch --
> that plan (the BitmapOr plan) was my baseline case for master.
>
> That said, I am a little surprised that you could ever get the plan
> that you showed for master (without somehow unnaturally forcing it).
> It's almost the same plan that your patch gets, but much worse. Your
> patch can use an index filter, but master uses a table filter instead.
>

Well I did force it - I thought we're talking about regular index scans,
so I disabled bitmap scans. Without doing that I get the BitmapOr plan
like you.

However, with the patch I get this behavior (starting from a "fresh"
state right after "make installcheck")

QUERY PLAN
------------------------------------------------------------------------
Index Scan using tenk1_thous_tenthous on tenk1
(cost=0.29..8.38 rows=1 width=244)
(actual time=0.033..0.036 rows=1 loops=1)
Index Cond: (thousand = 42)
Index Filter: ((tenthous = 1) OR (tenthous = 3) OR (tenthous = 42))
Rows Removed by Index Recheck: 9
Filter: ((tenthous = 1) OR (tenthous = 3) OR (tenthous = 42))
Buffers: shared read=4
Planning:
Buffers: shared hit=119 read=32
Planning Time: 0.673 ms
Execution Time: 0.116 ms
(10 rows)

insert into tenk1 (thousand, tenthous) select 42, i from
generate_series(43, 1000) i;

QUERY PLAN
------------------------------------------------------------------------
Index Scan using tenk1_thous_tenthous on tenk1
(cost=0.29..8.38 rows=1 width=244)
(actual time=0.038..0.605 rows=1 loops=1)
Index Cond: (thousand = 42)
Index Filter: ((tenthous = 1) OR (tenthous = 3) OR (tenthous = 42))
Rows Removed by Index Recheck: 967
Filter: ((tenthous = 1) OR (tenthous = 3) OR (tenthous = 42))
Buffers: shared hit=336
Planning Time: 0.114 ms
Execution Time: 0.632 ms
(8 rows)

analyze tenk1;

QUERY PLAN
------------------------------------------------------------------------
Bitmap Heap Scan on tenk1 (cost=12.89..16.91 rows=1 width=244)
(actual time=0.016..0.019 rows=1 loops=1)
Recheck Cond: (((thousand = 42) AND (tenthous = 1)) OR
((thousand = 42) AND (tenthous = 3)) OR
((thousand = 42) AND (tenthous = 42)))
Heap Blocks: exact=1
Buffers: shared hit=7
-> BitmapOr ...
Buffers: shared hit=6
-> Bitmap Index Scan on tenk1_thous_tenthous ...
Index Cond: ((thousand = 42) AND (tenthous = 1))
Buffers: shared hit=2
-> Bitmap Index Scan on tenk1_thous_tenthous ...
Index Cond: ((thousand = 42) AND (tenthous = 3))
Buffers: shared hit=2
-> Bitmap Index Scan on tenk1_thous_tenthous ...
Index Cond: ((thousand = 42) AND (tenthous = 42))
Buffers: shared hit=2
Planning Time: 0.344 ms
Execution Time: 0.044 ms
(19 rows)

vacuum analyze tenk1;

QUERY PLAN
------------------------------------------------------------------------
Bitmap Heap Scan on tenk1 (cost=12.89..16.91 rows=1 width=244)
(actual time=0.017..0.019 rows=1 loops=1)
Recheck Cond: (((thousand = 42) AND (tenthous = 1)) OR
((thousand = 42) AND (tenthous = 3)) OR
((thousand = 42) AND (tenthous = 42)))
Heap Blocks: exact=1
Buffers: shared hit=7
-> BitmapOr ...
Buffers: shared hit=6
-> Bitmap Index Scan on tenk1_thous_tenthous ...
Index Cond: ((thousand = 42) AND (tenthous = 1))
Buffers: shared hit=2
-> Bitmap Index Scan on tenk1_thous_tenthous ...
Index Cond: ((thousand = 42) AND (tenthous = 3))
Buffers: shared hit=2
-> Bitmap Index Scan on tenk1_thous_tenthous ...
Index Cond: ((thousand = 42) AND (tenthous = 42))
Buffers: shared hit=2
Planning Time: 0.277 ms
Execution Time: 0.046 ms
(19 rows)

set enable_bitmapscan = off;

QUERY PLAN
------------------------------------------------------------------------
Index Scan using tenk1_thous_tenthous on tenk1
(cost=0.29..23.57 rows=1 width=244)
(actual time=0.042..0.235 rows=1 loops=1)
Index Cond: (thousand = 42)
Index Filter: ((tenthous = 1) OR (tenthous = 3) OR (tenthous = 42))
Rows Removed by Index Recheck: 967
Filter: ((tenthous = 1) OR (tenthous = 3) OR (tenthous = 42))
Buffers: shared hit=7
Planning Time: 0.119 ms
Execution Time: 0.261 ms
(8 rows)

So yeah, it gets

Buffers: shared hit=336

right after the insert, but it seems to be mostly about visibility map
(and having to fetch heap tuples), as it disappears after vacuum.

There seems to be some increase in cost, so we switch back to the bitmap
plan. I haven't looked into that, but I guess there's either some thinko
in the costing change, or maybe it's due to correlation.

> While the plan used by the patch is risky in the way that I described,
> the plan you saw for master is just horrible. I mean, it's not even
> risky -- it seems almost certain to lose. Whereas at least the plan
> from the patch really is cheaper than the BitmapOr plan (the master
> branch plan from create_index.out) on average.
>

Not sure. I'm a bit confused about what exactly is so risky on the plan
produced with the patch.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2023-08-03 18:50:41 Re: Use of additional index columns in rows filtering
Previous Message Tim Palmer 2023-08-03 18:08:28 Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)