avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint
Date: 2020-07-04 00:45:03
Message-ID: 20200704004503.GA23805@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(resending to the list)

Hi All

I started looking into Konstantin's 30 month old thread/patch:
|Re: [HACKERS] Secondary index access optimizations
https://www.postgresql.org/message-id/27516421-5afa-203c-e22a-8407e9187327%40postgrespro.ru

..to which David directed me 12 months ago:
|Subject: Re: scans on table fail to be excluded by partition bounds
https://www.postgresql.org/message-id/CAKJS1f_iOmCW11dFzifpDGUgSLAoSTDOjw2tcec%3D7Cgq%2BsR80Q%40mail.gmail.com

My complaint at the time was for a query plan like:

CREATE TABLE p (i int, j int) PARTITION BY RANGE(i);
SELECT format('CREATE TABLE p%s PARTITION OF p FOR VALUES FROM(%s)TO(%s)', i, 10*(i-1), 10*i) FROM generate_series(1,10)i; \gexec
INSERT INTO p SELECT i%99, i%9 FROM generate_series(1,99999)i;
VACUUM ANALYZE p;
CREATE INDEX ON p(i);
CREATE INDEX ON p(j);

postgres=# explain analyze SELECT * FROM p WHERE (i=10 OR i=20 OR i=30) AND j<2;
Append (cost=28.51..283.25 rows=546 width=12) (actual time=0.100..1.364 rows=546 loops=1)
-> Bitmap Heap Scan on p2 (cost=28.51..93.51 rows=182 width=12) (actual time=0.099..0.452 rows=182 loops=1)
Recheck Cond: ((i = 10) OR (i = 20) OR (i = 30))
Filter: (j < 2)
Rows Removed by Filter: 818
Heap Blocks: exact=45
-> BitmapOr (cost=28.51..28.51 rows=1000 width=0) (actual time=0.083..0.083 rows=0 loops=1)
-> Bitmap Index Scan on p2_i_idx (cost=0.00..19.79 rows=1000 width=0) (actual time=0.074..0.074 rows=1000 loops=1)
Index Cond: (i = 10)
-> Bitmap Index Scan on p2_i_idx (cost=0.00..4.29 rows=1 width=0) (actual time=0.008..0.008 rows=0 loops=1)
Index Cond: (i = 20)
-> Bitmap Index Scan on p2_i_idx (cost=0.00..4.29 rows=1 width=0) (actual time=0.001..0.001 rows=0 loops=1)
Index Cond: (i = 30)
...

This 2nd and 3rd index scan on p2_i_idx are useless, and benign here, but
harmful if we have a large OR list.

I tried rebasing Konstantin's patch, but it didn't handle the case of
"refuting" inconsistent arms of an "OR" list, so I came up with this. This
currently depends on the earlier patch only to call RelationGetPartitionQual,
so appears to be mostly a separate issue.

I believe the current behavior of "OR" lists is also causing another issue I
reported, which a customer hit again last week:
https://www.postgresql.org/message-id/20191216184906.GA2082@telsasoft.com
|ERROR: could not resize shared memory segment...No space left on device

When I looked into it, their explain(format text) was 50MB, due to a list of
~500 "OR" conditions, *each* of which was causing an index scan for each of
~500 partitions, where only one index scan per partition was needed or useful,
all the others being inconsistent with the partition constraint. Thus the
query ultimately errors when it exceeds a resource limit (maybe no surprise
with 8500 index scans).

Here, I was trying to create a test case reproducing that error to see if this
resolves it, but so far hasn't failed. Tomas has a reproducer with a different
(much simpler) plan, though.

CREATE TABLE p (i int, j int) PARTITION BY RANGE(i);
\pset pager off
SELECT format('CREATE TABLE p%s PARTITION OF p FOR VALUES FROM(%s)TO(%s)', i, 10*(i-1), 10*i) FROM generate_series(1,500)i;
\timing off
\set quiet
\set echo always
\gexec
\timing on
INSERT INTO p SELECT i%5000, i%500 FROM generate_series(1,9999999)i;
VACUUM ANALYZE p;
CREATE INDEX ON p(i);
CREATE INDEX ON p(j);
SELECT format('explain analyze SELECT * FROM p WHERE %s', array_to_string(array_agg('i='||(i*10)::text),' OR ')) FROM generate_series(1,500)i;

--
Justin

Attachment Content-Type Size
0001-Secondary-index-access-optimizations.patch text/x-diff 87.2 KB
0002-Avoid-index-scan-inconsistent-with-partition-constra.patch text/x-diff 3.6 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-07-04 03:08:16 Ideas about a better API for postgres_fdw remote estimates
Previous Message Tom Lane 2020-07-03 21:50:43 Re: estimation problems for DISTINCT ON with FDW