Quick Links

Re: Extend more usecase for planning time partition pruning and init partition pruning.

From:	Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To:	PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Extend more usecase for planning time partition pruning and init partition pruning.
Date:	2021-02-08 07:43:47
Message-ID:	CAKU4AWqeHe=ZudHvp=QikBujzfDrhZ8=Azkh4KTmeExFmcbcNA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Mon, Jan 25, 2021 at 10:21 AM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:

>
>
> On Sun, Jan 24, 2021 at 6:34 PM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>
>> Hi:
>>
>> I recently found a use case like this. SELECT * FROM p, q WHERE
>> p.partkey =
>> q.colx AND (q.colx = $1 OR q.colx = $2); Then we can't do either
>> planning time
>> partition prune or init partition prune. Even though we have run-time
>> partition pruning work at last, it is too late in some cases since we
>> have
>> to init all the plan nodes in advance. In my case, there are 10+
>> partitioned relation in one query and the execution time is short, so the
>> init plan a lot of plan nodes cares a lot.
>>
>> The attached patches fix this issue. It just get the "p.partkey = q.colx"
>> case in root->eq_classes or rel->joinlist (outer join), and then check if
>> there
>> is some baserestrictinfo in another relation which can be used for
>> partition
>> pruning. To make the things easier, both partkey and colx must be Var
>> expression in implementation.
>>
>> - v1-0001-Make-some-static-functions-as-extern-and-extend-C.patch
>>
>> Just some existing refactoring and extending ChangeVarNodes to be able
>> to change var->attno.
>>
>> - v1-0002-Build-some-implied-pruning-quals-to-extend-the-us.patch
>>
>> Do the real job.
>>
>> Thought?
>>
>>
>>
>> --
>> Best Regards
>> Andy Fan (https://www.aliyun.com/)
>>
>
>
> Some results from this patch.
>
> create table p (a int, b int, c character varying(8)) partition by list(c);
> create table p1 partition of p for values in ('000001');
> create table p2 partition of p for values in ('000002');
> create table p3 partition of p for values in ('000003');
> create table q (a int, c character varying(8), b int) partition by list(c);
> create table q1 partition of q for values in ('000001');
> create table q2 partition of q for values in ('000002');
> create table q3 partition of q for values in ('000003');
>
> Before the patch:
> postgres=# explain (costs off) select * from p inner join q on p.c = q.c
> and q.c > '000002';
> QUERY PLAN
> ----------------------------------------------------
> Hash Join
> Hash Cond: ((p.c)::text = (q.c)::text)
> -> Append
> -> Seq Scan on p1 p_1
> -> Seq Scan on p2 p_2
> -> Seq Scan on p3 p_3
> -> Hash
> -> Seq Scan on q3 q
> Filter: ((c)::text > '000002'::text)
> (9 rows)
>
> After the patch:
>
> QUERY PLAN
> ----------------------------------------------------
> Hash Join
> Hash Cond: ((p.c)::text = (q.c)::text)
> -> Seq Scan on p3 p
> -> Hash
> -> Seq Scan on q3 q
> Filter: ((c)::text > '000002'::text)
> (6 rows)
>
>
> Before the patch:
> postgres=# explain (costs off) select * from p inner join q on p.c = q.c
> and (q.c = '000002' or q.c = '000001');
> QUERY PLAN
>
> --------------------------------------------------------------------------------------------
> Hash Join
> Hash Cond: ((p.c)::text = (q.c)::text)
> -> Append
> -> Seq Scan on p1 p_1
> -> Seq Scan on p2 p_2
> -> Seq Scan on p3 p_3
> -> Hash
> -> Append
> -> Seq Scan on q1 q_1
> Filter: (((c)::text = '000002'::text) OR ((c)::text =
> '000001'::text))
> -> Seq Scan on q2 q_2
> Filter: (((c)::text = '000002'::text) OR ((c)::text =
> '000001'::text))
> (12 rows)
>
> After the patch:
> QUERY PLAN
>
> --------------------------------------------------------------------------------------------
> Hash Join
> Hash Cond: ((p.c)::text = (q.c)::text)
> -> Append
> -> Seq Scan on p1 p_1
> -> Seq Scan on p2 p_2
> -> Hash
> -> Append
> -> Seq Scan on q1 q_1
> Filter: (((c)::text = '000002'::text) OR ((c)::text =
> '000001'::text))
> -> Seq Scan on q2 q_2
> Filter: (((c)::text = '000002'::text) OR ((c)::text =
> '000001'::text))
> (11 rows)
>
> Before the patch:
> postgres=# explain (costs off) select * from p left join q on p.c = q.c
> where (q.c = '000002' or q.c = '000001');
> QUERY PLAN
>
> --------------------------------------------------------------------------------------------
> Hash Join
> Hash Cond: ((p.c)::text = (q.c)::text)
> -> Append
> -> Seq Scan on p1 p_1
> -> Seq Scan on p2 p_2
> -> Seq Scan on p3 p_3
> -> Hash
> -> Append
> -> Seq Scan on q1 q_1
> Filter: (((c)::text = '000002'::text) OR ((c)::text =
> '000001'::text))
> -> Seq Scan on q2 q_2
> Filter: (((c)::text = '000002'::text) OR ((c)::text =
> '000001'::text))
> (12 rows)
>
> After the patch:
> QUERY PLAN
>
> --------------------------------------------------------------------------------------------
> Hash Join
> Hash Cond: ((p.c)::text = (q.c)::text)
> -> Append
> -> Seq Scan on p1 p_1
> -> Seq Scan on p2 p_2
> -> Hash
> -> Append
> -> Seq Scan on q1 q_1
> Filter: (((c)::text = '000002'::text) OR ((c)::text =
> '000001'::text))
> -> Seq Scan on q2 q_2
> Filter: (((c)::text = '000002'::text) OR ((c)::text =
> '000001'::text))
> (11 rows)
>
> --
> Best Regards
> Andy Fan (https://www.aliyun.com/)
>

Here is a performance test regarding this patch. In the following simple
case,
we can get 3x faster than before.

create table p (a int, b int, c int) partition by list(c);
select 'create table p_'||i||' partition of p for values in (' || i || ');'
from generate_series(1, 100)i; \gexec
insert into p select i, i, i from generate_series(1, 100)i;
create table m as select * from p;
analyze m;
analyze p;

test sql: select * from m, p where m.c = p.c and m.c in (3, 10);

With this patch: 1.1ms
Without this patch: 3.4ms

I'm happy with the result and the implementation, I have add this into
commitfest https://commitfest.postgresql.org/32/2975/

Thanks.

--
Best Regards
Andy Fan (https://www.aliyun.com/)

In response to

Re: Extend more usecase for planning time partition pruning and init partition pruning. at 2021-01-25 02:21:12 from Andy Fan

Responses

Re: Extend more usecase for planning time partition pruning and init partition pruning. at 2021-02-19 10:03:10 from Andy Fan

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Kyotaro Horiguchi	2021-02-08 08:02:12	Re: Support tab completion for upper character inputs in psql
Previous Message	Michael Paquier	2021-02-08 07:35:19	Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM