Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dmitry Astapov <dastapov(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?
Date: 2022-02-01 15:07:41
Message-ID: CAKU4AWpo9z0hMHDWUKuce4Z-NpcybV0J2UVu5+DVwyP-CrHCQg@mail.gmail.com

On Fri, May 14, 2021 at 12:22 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> On Fri, 14 May 2021 at 11:22, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > I recall somebody (David Rowley, maybe? Too lazy to check archives.)
> > working on this idea awhile ago, but he didn't get to the point of
> > a committable patch.
>
> Yeah. Me. The discussion is in [1].
>
> David
>
> [1]
> https://www.postgresql.org/message-id/flat/CAKJS1f9FK_X_5HKcPcSeimy16Owe3EmPmmGsGWLcKkj_rW9s6A%40mail.gmail.com

Hi:

I read through that thread and, IIUC, the pending issues can be summarized as
below.
a) The most challenging one: the push-down misleads the planner's row
estimation, which can be worse than not pushing the clause down at all.
b) The newly generated quals may increase qual execution cost.
c) Planning time also increases, and in some cases we gain nothing from it.
I tried to address these issues as below, based on the patch David finished
a long time ago.
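
To be concrete about what gets pushed down here: via the equivalence class for
a.thousand = b.thousand, the patch derives b.thousand < 100 from
a.thousand < 100, so the effect is roughly the hand-written query below (on the
regression database's tenk1, the same table as in the plans later in this
mail). Since the planner treats the two range quals as independent, one would
expect the hand-written form to be estimated well below the actual 10000 join
rows, which is exactly issue a).

explain analyze
select * from tenk1 a, tenk1 b
 where a.thousand = b.thousand
   and a.thousand < 100
   and b.thousand < 100;  -- hand-written stand-in for the derived clause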

To address the row estimation issue, the most straightforward fix is to ignore
the derived clauses when figuring out RelOptInfo->rows for a base relation. To
mark which clauses are derived by this patch, I added a new field
"EquivalenceClass *derived" to RestrictInfo, and then added an include_derived
option to clauselist_selectivity_ext. The set_xxx_rel_size functions pass
include_derived=false, and the same strategy should be used in
get_parameterized_baserel_size; in all other cases include_derived=true is
used. This is done in commit 2 (commit 1 is David's patch, which I just
rebased).

set enable_hashjoin to off;
set enable_mergejoin to off;
set enable_seqscan to on;
regression=# explain analyze select * from tenk1 a, tenk1 b where a.thousand = b.thousand and a.thousand < 100;
                                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=27.14..1090.67 rows=10740 width=488) (actual time=0.404..15.006 rows=10000 loops=1)
   ->  Bitmap Heap Scan on tenk1 b  (cost=26.84..385.26 rows=10000 width=244) (actual time=0.350..1.419 rows=1000 loops=1)
         Recheck Cond: (thousand < 100)
         Heap Blocks: exact=324
         ->  Bitmap Index Scan on tenk1_thous_tenthous  (cost=0.00..24.34 rows=1074 width=0) (actual time=0.238..0.240 rows=1000 loops=1)
               Index Cond: (thousand < 100)
   ->  Memoize  (cost=0.30..0.47 rows=1 width=244) (actual time=0.002..0.006 rows=10 loops=1000)
         Cache Key: b.thousand
         Cache Mode: logical
         Hits: 900  Misses: 100  Evictions: 0  Overflows: 0  Memory Usage: 277kB
         ->  Index Scan using tenk1_thous_tenthous on tenk1 a  (cost=0.29..0.46 rows=1 width=244) (actual time=0.010..0.032 rows=10 loops=100)
               Index Cond: ((thousand = b.thousand) AND (thousand < 100))
 Planning Time: 2.459 ms
 Execution Time: 15.964 ms
(14 rows)

As shown above, with commit 2 the join rel's row estimate is now correct, but
the plan can still mislead a DBA reading it: see "Bitmap Heap Scan on tenk1 b
(... rows=10000 ..) (... rows=1000 loops=1)". This is because RelOptInfo->rows
is not only used to calculate joinrel->rows, it is also used to set Path.rows
in many places. I can't think of a better way than adding a new filtered_rows
field to RelOptInfo whose only purpose is to supply Path.rows. That is what
commit 3 does.
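
(As a sanity check on stock PostgreSQL -- assuming the standard regression
database -- the two numbers involved can be seen directly; this is just to show
which estimate is which:)

explain select * from tenk1 b;
-- rows=10000: the estimate commit 2 keeps in RelOptInfo->rows for b,
-- since b's only qual in the example above is the derived one
explain select * from tenk1 b where thousand < 100;
-- rows=1074: the user-visible estimate that filtered_rows / Path.rows should show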

After commit 3, we can see:

regression=# explain analyze select * from tenk1 a, tenk1 b where a.thousand = b.thousand and a.thousand < 100;
                                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=24.90..459.16 rows=10740 width=488) (actual time=0.440..16.966 rows=10000 loops=1)
   ->  Bitmap Heap Scan on tenk1 b  (cost=24.61..383.03 rows=1074 width=244) (actual time=0.383..1.546 rows=1000 loops=1)
         Recheck Cond: (thousand < 100)
         Heap Blocks: exact=324
         ->  Bitmap Index Scan on tenk1_thous_tenthous  (cost=0.00..24.34 rows=1074 width=0) (actual time=0.270..0.272 rows=1000 loops=1)
               Index Cond: (thousand < 100)
   ->  Memoize  (cost=0.30..0.47 rows=1 width=244) (actual time=0.002..0.008 rows=10 loops=1000)
         Cache Key: b.thousand
         Cache Mode: logical
         Hits: 900  Misses: 100  Evictions: 0  Overflows: 0  Memory Usage: 277kB
         ->  Index Scan using tenk1_thous_tenthous on tenk1 a  (cost=0.29..0.46 rows=1 width=244) (actual time=0.012..0.050 rows=10 loops=100)
               Index Cond: ((thousand = b.thousand) AND (thousand < 100))
 Planning Time: 2.578 ms
 Execution Time: 17.929 ms
(14 rows)

"Bitmap Heap Scan on tenk1 b (... rows=1074 ..) (.. rows=1000 loops=1)"
shows the issue fixed. but
There is something wrong below.

Index Scan using tenk1_thous_tenthous on tenk1 a (cost=0.29..0.46 rows=1
width=244) (actual time=0.012..0.050 rows=10 loops=100)
Index Cond: ((thousand = b.thousand) AND (thousand < 100))

Here the " (thousand < 100)" is from the user, not from this patch. and
(thousand = b.thousand) AND (thousand < 100)
has some correlation. I can't think of a solution for this. and fixing
this issue is beyond the scope of this patch.

So at this stage, I think the row estimation issue is gone.

As for the concern that the newly generated quals increase execution cost: I
think it is hard for the planner to distinguish which quals are worth adding
and which are not. Instead, I removed the obviously duplicated qual executions
at create_plan time. I only handle the case where a derived qual is executed at
the same node as the RestrictInfo whose parent_ec was used to generate it. If I
understand RestrictInfo.parent_ec correctly, the cost of finding such
correlated quals in this patch is pretty low; see is_correlated_derived_clause.
This is what commit 4 does. After applying it, the demo becomes:

regression=# explain analyze select * from tenk1 a join d_tenk2 b on a.thousand = b.thousand and a.thousand < 100;
                                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=10000000000.30..10000002799.78 rows=20020 width=488) (actual time=0.051..26.080 rows=20000 loops=1)
   ->  Seq Scan on tenk1 a  (cost=10000000000.00..10000000470.00 rows=1001 width=244) (actual time=0.018..3.902 rows=1000 loops=1)
         Filter: (thousand < 100)
         Rows Removed by Filter: 9000
   ->  Memoize  (cost=0.30..3.18 rows=20 width=244) (actual time=0.002..0.008 rows=20 loops=1000)
         Cache Key: a.thousand
         Cache Mode: logical
         Hits: 900  Misses: 100  Evictions: 0  Overflows: 0  Memory Usage: 546kB
         ->  Index Scan using d_tenk2_thousand_idx on d_tenk2 b  (cost=0.29..3.17 rows=20 width=244) (actual time=0.008..0.037 rows=20 loops=100)
               Index Cond: (thousand = a.thousand)
 Planning Time: 0.596 ms
 Execution Time: 27.502 ms
(12 rows)

The "thousand < 100" for b is removed during execution.

Commit 5 relaxes the requirements for this push-down to apply: it now supports
ScalarArrayOpExpr and any pseudoconstant filter, which covers the user case I
met. Commit 6 adds some test cases, but they are only meant to aid review,
since the output contains too many run-time statistics and I can't think of a
way to stabilize it.
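
For example, with commit 5 an IN list (a ScalarArrayOpExpr) gets derived too;
on tenk1 the effect should be roughly equivalent to hand-writing:

explain
select * from tenk1 a, tenk1 b
 where a.thousand = b.thousand
   and a.thousand in (1, 2, 3)
   and b.thousand in (1, 2, 3);  -- the clause commit 5 would derive for b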

I also studied David's commit 1; the semantics of ec_filters are very well
thought out, and I'm excited to see it.

This patch series is still at the PoC stage, so some things are not handled at
all. For commit 2, I didn't handle the extended-statistics related paths, and I
only handled plain relations (subqueries, foreign tables and so on are
missing). I think that is OK for a PoC.

Finally, here are some performance results for this patch, based on the real
user case I met.

create table p (a int, b int) partition by range(a);
select 'create table p_' || i || ' partition of p for values from (' || (i-1) * 100000 || ') to (' || i * 100000 || ');' from generate_series(1, 50) i; \gexec
insert into p select i, i from generate_series(1, 50 * 100000 - 1) i;
create index on p(a);

create table q (a int, b int) partition by range(a);
select 'create table q_' || i || ' partition of q for values from (' || (i-1) * 100000 || ') to (' || i * 100000 || ');' from generate_series(1, 50) i; \gexec
insert into q select * from p;
create index on q(a);

select * from p, q where p.a = q.a and p.a in (3, 200000);

Running the above query in both the prepared and non-prepared cases, I get the
following results:

| workload     | with this feature | w/o this feature |
|--------------+-------------------+------------------|
| Prepared     | 0.25 ms           | 0.8 ms           |
| Non-prepared | 0.890 ms          | 4.207 ms         |
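
(For reference, the prepared-statement numbers can be reproduced along these
lines -- just a sketch of the measurement, not the exact script I used:)

\timing on
prepare ptest as select * from p, q where p.a = q.a and p.a in (3, 200000);
execute ptest;  -- repeat a few times to measure the prepared-statement case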

Any thoughts?

--
Best Regards
Andy Fan

Attachment Content-Type Size
v1-0001-Rebaee-David-s-patch-against-the-latest-code.patch application/x-patch 23.3 KB
v1-0002-support-set_xxx_size-with-derived_clauses-ignored.patch application/x-patch 11.9 KB
v1-0003-introduce-RelOptInfo.filtered_rows.patch application/x-patch 3.1 KB
v1-0004-remove-duplicated-qual-executing.patch application/x-patch 4.6 KB
v1-0005-Support-ScalarArrayOpExpr-and-perudoconstant-on-e.patch application/x-patch 8.6 KB
v1-0006-adding-some-test-cases-for-this-feature-and-fix-t.patch application/x-patch 29.2 KB
