Make Append Cost aware of some run time partition prune case

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Make Append Cost aware of some run time partition prune case
Date: 2020-11-10 00:43:59
Message-ID: CAKU4AWp6Z087YCQj4uN69Ct61arCtcABYNgU_7Mv1Gu-grVjHw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Currently the cost model of append path sums the cost/rows for all the
subpaths, it usually works well until we run into the run-time partition
prune
case. The first result is that generic plans will rarely be used for some
cases.
For instance, SELECT * FROM p WHERE pkey = $1; The custom plan will only
count the cost of one partition, however generic plan will count the cost
for all the
partitions even we are sure that only 1 partition will survive. Another
impact
is that planners may choose a wrong plan. for example, SELECT * FROM t1,
p
WHERE t1.a = p.pkey; The cost/rows of t1 nest loop p is estimated highly
improperly. This patch wants to help this case to some extent.

The basic idea is we need to estimate how many partitions will survive after
considering the runtime partition prune. After that, we will adjust the
cost/rows accordingly. IIUC, this follows Robert's idea at [1].
However there are too many special cases which is hard
to handle, but luckily the most common case would be sth like partykey =
$1,
which we can estimate well, this patch is aimed to handle that part
specially.
I supposed 75% partitions will survive for other cases arbitrarily,
actually I
want to use 100% to be the same as the current situation. If we decide to
handle their special case differently, PartitionedRelQual has to be
redefined
a bit (see comments around that.) which is the main data struct in the
implementation.

The attached is the minimum runnable patch. There are some obvious issue
here:

1. MergeAppend path is not handled on purpose, it will be done at last.
2. The cost for Parallel Append Path is not adjusted, I'm not sure about
what is
the best way to do it. So there are 2 test cases in
sql/partition_prune.out
failed due to this, I just echo the 'results' to expected' so that you
can
know which one is impacted.

Here are some simple test results.

create table a1 partition of a for values in (1) partition by range (b, c);
create table a1_1 partition of a1 for values from (1, 0) to (1, 10);
create table a1_2 partition of a1 for values from (2, 0) to (2, 10);

create table a2 partition of a for values in (2) partition by range (b, c);
create table a2_1 partition of a2 for values from (1, 0) to (1, 10);
create table a2_2 partition of a2 for values from (2, 0) to (2, 10);

insert into a select 1, i%2 + 1, i % 10 from generate_series(1, 10000) i;
insert into a select 2, i%2 + 1, i % 10 from generate_series(1, 10000) i;

analyze a;

set plan_cache_mode to force_generic_plan;

prepare s as select * from a where a = $1;
PREPARE
explain execute s(1);
QUERY PLAN
-------------------------------------------------------------------
Append (cost=0.00..231.00 rows=10000 width=12) *<< both rows/cost is
adjusted.*
Subplans Removed: 2
-> Seq Scan on a1_1 a_1 (cost=0.00..90.50 rows=5000 width=12)
Filter: (a = $1)
-> Seq Scan on a1_2 a_2 (cost=0.00..90.50 rows=5000 width=12)
Filter: (a = $1)
(6 rows)

prepare s4 as select * from a where a = $1 and b = $2 and c = $3;
PREPARE
explain execute s4(1, 1, 2);
QUERY PLAN
--------------------------------------------------------------------
Append (cost=0.00..120.50 rows=1000 width=12) *<< Here*
Subplans Removed: 3
-> Seq Scan on a1_1 a_1 (cost=0.00..115.50 rows=1000 width=12)
Filter: ((a = $1) AND (b = $2) AND (c = $3))
(4 rows)

prepare s2 as select * from a where a = $1 union all select * from a where
a = $2;
PREPARE
explain execute s2(1, 1);
QUERY PLAN
-------------------------------------------------------------------------
Append (cost=0.00..762.00 rows=20000 width=12)
-> Append (cost=0.00..231.00 rows=10000 width=12) *<< Here*
Subplans Removed: 2
-> Seq Scan on a1_1 a_1 (cost=0.00..90.50 rows=5000 width=12)
Filter: (a = $1)
-> Seq Scan on a1_2 a_2 (cost=0.00..90.50 rows=5000 width=12)
Filter: (a = $1)
-> Append (cost=0.00..231.00 rows=10000 width=12) << Here
Subplans Removed: 2
-> Seq Scan on a1_1 a_4 (cost=0.00..90.50 rows=5000 width=12)
Filter: (a = $2)
-> Seq Scan on a1_2 a_5 (cost=0.00..90.50 rows=5000 width=12)
Filter: (a = $2)
(13 rows)

prepare s3 as select * from a where a = $1 union select * from a where a =
$2;
PREPARE
explain execute s3(1, 1);
QUERY PLAN
-------------------------------------------------------------------------------
HashAggregate (cost=912.00..1112.00 rows=20000 width=12)
Group Key: a.a, a.b, a.c
-> Append (cost=0.00..762.00 rows=20000 width=12)
-> Append (cost=0.00..231.00 rows=10000 width=12) << Here
Subplans Removed: 2
-> Seq Scan on a1_1 a_1 (cost=0.00..90.50 rows=5000
width=12)
Filter: (a = $1)
-> Seq Scan on a1_2 a_2 (cost=0.00..90.50 rows=5000
width=12)
Filter: (a = $1)
-> Append (cost=0.00..231.00 rows=10000 width=12) << Here
Subplans Removed: 2
-> Seq Scan on a1_1 a_4 (cost=0.00..90.50 rows=5000
width=12)
Filter: (a = $2)
-> Seq Scan on a1_2 a_5 (cost=0.00..90.50 rows=5000
width=12)
Filter: (a = $2)
(15 rows)

-- add a limit to make sure the runtime partitions prune.
explain select * from generate_series(1, 10) i(i) join lateral (select *
from a
where a.a = (i.i % 2 + 1) and a.b = (i.i % 2 + 1) and a.c = (i.i % 10)
limit 10000000) m on true;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2030.10 rows=10000 width=16)
-> Function Scan on generate_series i (cost=0.00..0.10 rows=10 width=4)
-> Limit (cost=0.00..183.00 rows=1000 width=12)
-> Append (cost=0.00..183.00 rows=1000 width=12) << Here
-> Seq Scan on a1_1 a_1 (cost=0.00..178.00 rows=1000
width=12)
Filter: ((c = (i.i % 10)) AND (a = ((i.i % 2) + 1))
AND (b = ((i.i % 2) + 1)))
-> Seq Scan on a1_2 a_2 (cost=0.00..178.00 rows=1000
width=12)
Filter: ((c = (i.i % 10)) AND (a = ((i.i % 2) + 1))
AND (b = ((i.i % 2) + 1)))
-> Seq Scan on a2_1 a_3 (cost=0.00..178.00 rows=1000
width=12)
Filter: ((c = (i.i % 10)) AND (a = ((i.i % 2) + 1))
AND (b = ((i.i % 2) + 1)))
-> Seq Scan on a2_2 a_4 (cost=0.00..178.00 rows=1000
width=12)
Filter: ((c = (i.i % 10)) AND (a = ((i.i % 2) + 1))
AND (b = ((i.i % 2) + 1)))
(12 rows)

Any thoughts about this? Thanks

[1]
https://www.postgresql.org/message-id/CA%2BTgmoZHYoAL4HYwnGO25B8CxCB%2BvNMdf%2B7rbUzYykR4sU9yUA%40mail.gmail.com

--
Best Regards
Andy Fan

Attachment Content-Type Size
v1-0001-Adjust-Append-Path-cost-model-for-runtime-partiti.patch application/octet-stream 25.0 KB
part_runtime_prune.sql application/octet-stream 1.3 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message tsunakawa.takay@fujitsu.com 2020-11-10 00:45:50 RE: POC: postgres_fdw insert batching
Previous Message Tomas Vondra 2020-11-10 00:13:22 Re: MultiXact\SLRU buffers configuration