BUG #16533: Planner optimisation : range predicate not propagating to joined tables

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: barney(at)c8software(dot)com(dot)au
Subject: BUG #16533: Planner optimisation : range predicate not propagating to joined tables
Date: 2020-07-10 02:42:54
Message-ID: 16533-a10051f0f1d8d72f@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 16533
Logged by: Andrew Barnham
Email address: barney(at)c8software(dot)com(dot)au
PostgreSQL version: 12.3
Operating system: Linux
Description:

Hi

Included repeatable SQL of issue below. I am trying to paginate data using
limit and value range offset. Ordinarily a query like this:

select * from
tmp_parent
left outer join tmp_child on (tmp_parent.id=tmp_child.id)
where tmp_parent.id=8000000;

Would copy the predicate (id=8000000) to the child. So resulting plan uses a
merge and in the merge like so:

-> Index Scan using tmp_a_idx on tmp_parent (cost=0.42..8.44 rows=1
width=10) (actual time=0.023..0.023 rows=0 loops=1)
Index Cond: (id = 8000000)
-> Index Scan using tmp_b_idx on tmp_child (cost=0.29..8.31 rows=1
width=10) (never executed)
Index Cond: (id = 8000000) **** PREDICATE COPIED TO CHILD TABLE
ALL OK

But if I have query like this:

select * from
tmp_parent
left outer join tmp_child on (tmp_parent.id=tmp_child.id)
where tmp_parent.id>=8000000 limit 100;

Resulting plan does not apply a index cond to the child table and it merge
scans, from start of childs index needlessly reading 80k records

-> Index Scan using tmp_a_idx on tmp_parent (cost=0.42..7008.03
rows=206606 width=10) (actual time=0.049..0.077 rows=100 loops=1)
Index Cond: (id >= 8000000)
-> Index Scan using tmp_b_idx on tmp_child (cost=0.29..3159.89
rows=100373 width=10) (actual time=0.013..28.105 rows=80346 loops=1)

If I constrain ID range it is better, it reverts to a nested join
although.
-> Index Scan using tmp_a_idx on tmp_parent (cost=0.42..10.49 rows=103
width=10) (actual time=0.022..0.065 rows=100 loops=1)
Index Cond: ((id >= 8000000) AND (id <= 8001000))
-> Index Scan using tmp_b_idx on tmp_child (cost=0.29..7.38 rows=1
width=10) (actual time=0.002..0.002 rows=0 loops=100)
Index Cond: (tmp_parent.id = id) **** PRETTY GOOD BUT NEEDS TO
WALK INTO THE INDEX 100 TIMES

I can get a great result by moving the predicate into the join. But this
isn't really desirable because in my real world query I am trying to
abstract a very complex query behind a view, but has performance issue that
boil down to this repeatable example, of propagation of the predicate

select * from
tmp_parent
left outer join tmp_child on (tmp_parent.id=tmp_child.id and
tmp_child.id>=8000000)
where tmp_parent.id>=8000000
order by tmp_parent.id limit 100;

Merge Cond: (tmp_parent.id = tmp_child.id)
-> Index Scan using tmp_a_idx on tmp_parent (cost=0.42..7008.03
rows=206606 width=10) (actual time=0.019..0.079 rows=100 loops=1)
Index Cond: (id >= 8000000)
-> Index Scan using tmp_b_idx on tmp_child (cost=0.29..686.24
rows=19997 width=10) (actual time=0.018..0.022 rows=14 loops=1)
Index Cond: (id >= 8000000)
**** IDEAL QUERY PLAN

With thanks
Andrew
=======

create temporary table tmp_parent as
select generate_series(1,1000000)*10+floor(random()*10)::int as
"id",'Text'||((random()*100)::integer) as "description";

create temporary table tmp_child as
select * from tmp_parent where random()<0.1;

create unique index tmp_a_idx on tmp_parent (id);
create unique index tmp_b_idx on tmp_child (id);

analyze tmp_parent;
analyze tmp_child;

explain analyze
select * from
tmp_parent
left outer join tmp_child on (tmp_parent.id=tmp_child.id)
where tmp_parent.id>=8000000
order by tmp_parent.id limit 100;

explain analyze
select * from
tmp_parent
left outer join tmp_child on (tmp_parent.id=tmp_child.id and
tmp_child.id>=8000000)
where tmp_parent.id>=8000000
order by tmp_parent.id limit 100;

Browse pgsql-bugs by date

  From Date Subject
Next Message Amit Langote 2020-07-10 06:23:28 Re: posgres 12 bug (partitioned table)
Previous Message Michael Paquier 2020-07-10 01:53:54 Re: BUG #16526: pg_test_fsync in v12 doesn't run in Windows