Re: Extend more usecase for planning time partition pruning and init partition pruning.

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Extend more usecase for planning time partition pruning and init partition pruning.
Date: 2021-02-21 13:33:38
Message-ID: CAKU4AWpZ+Ln4sMjgOTOF00XtMiOnXAfDasKTdbxzfZMdUpKwpg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Feb 19, 2021 at 6:03 PM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:

>
>
> On Mon, Feb 8, 2021 at 3:43 PM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>
>>
>>
>> On Mon, Jan 25, 2021 at 10:21 AM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
>> wrote:
>>
>>>
>>>
>>> On Sun, Jan 24, 2021 at 6:34 PM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
>>> wrote:
>>>
>>>> Hi:
>>>>
>>>> I recently found a use case like this. SELECT * FROM p, q WHERE
>>>> p.partkey =
>>>> q.colx AND (q.colx = $1 OR q.colx = $2); Then we can't do either
>>>> planning time
>>>> partition prune or init partition prune. Even though we have run-time
>>>> partition pruning work at last, it is too late in some cases since we
>>>> have
>>>> to init all the plan nodes in advance. In my case, there are 10+
>>>> partitioned relation in one query and the execution time is short, so
>>>> the
>>>> init plan a lot of plan nodes cares a lot.
>>>>
>>>> The attached patches fix this issue. It just get the "p.partkey =
>>>> q.colx"
>>>> case in root->eq_classes or rel->joinlist (outer join), and then check
>>>> if there
>>>> is some baserestrictinfo in another relation which can be used for
>>>> partition
>>>> pruning. To make the things easier, both partkey and colx must be Var
>>>> expression in implementation.
>>>>
>>>> - v1-0001-Make-some-static-functions-as-extern-and-extend-C.patch
>>>>
>>>> Just some existing refactoring and extending ChangeVarNodes to be able
>>>> to change var->attno.
>>>>
>>>> - v1-0002-Build-some-implied-pruning-quals-to-extend-the-us.patch
>>>>
>>>> Do the real job.
>>>>
>>>> Thought?
>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards
>>>> Andy Fan (https://www.aliyun.com/)
>>>>
>>>
>>>
>>> Some results from this patch.
>>>
>>> create table p (a int, b int, c character varying(8)) partition by
>>> list(c);
>>> create table p1 partition of p for values in ('000001');
>>> create table p2 partition of p for values in ('000002');
>>> create table p3 partition of p for values in ('000003');
>>> create table q (a int, c character varying(8), b int) partition by
>>> list(c);
>>> create table q1 partition of q for values in ('000001');
>>> create table q2 partition of q for values in ('000002');
>>> create table q3 partition of q for values in ('000003');
>>>
>>> Before the patch:
>>> postgres=# explain (costs off) select * from p inner join q on p.c = q.c
>>> and q.c > '000002';
>>> QUERY PLAN
>>> ----------------------------------------------------
>>> Hash Join
>>> Hash Cond: ((p.c)::text = (q.c)::text)
>>> -> Append
>>> -> Seq Scan on p1 p_1
>>> -> Seq Scan on p2 p_2
>>> -> Seq Scan on p3 p_3
>>> -> Hash
>>> -> Seq Scan on q3 q
>>> Filter: ((c)::text > '000002'::text)
>>> (9 rows)
>>>
>>> After the patch:
>>>
>>> QUERY PLAN
>>> ----------------------------------------------------
>>> Hash Join
>>> Hash Cond: ((p.c)::text = (q.c)::text)
>>> -> Seq Scan on p3 p
>>> -> Hash
>>> -> Seq Scan on q3 q
>>> Filter: ((c)::text > '000002'::text)
>>> (6 rows)
>>>
>>>
>>> Before the patch:
>>> postgres=# explain (costs off) select * from p inner join q on p.c = q.c
>>> and (q.c = '000002' or q.c = '000001');
>>> QUERY PLAN
>>>
>>> --------------------------------------------------------------------------------------------
>>> Hash Join
>>> Hash Cond: ((p.c)::text = (q.c)::text)
>>> -> Append
>>> -> Seq Scan on p1 p_1
>>> -> Seq Scan on p2 p_2
>>> -> Seq Scan on p3 p_3
>>> -> Hash
>>> -> Append
>>> -> Seq Scan on q1 q_1
>>> Filter: (((c)::text = '000002'::text) OR ((c)::text
>>> = '000001'::text))
>>> -> Seq Scan on q2 q_2
>>> Filter: (((c)::text = '000002'::text) OR ((c)::text
>>> = '000001'::text))
>>> (12 rows)
>>>
>>> After the patch:
>>> QUERY PLAN
>>>
>>> --------------------------------------------------------------------------------------------
>>> Hash Join
>>> Hash Cond: ((p.c)::text = (q.c)::text)
>>> -> Append
>>> -> Seq Scan on p1 p_1
>>> -> Seq Scan on p2 p_2
>>> -> Hash
>>> -> Append
>>> -> Seq Scan on q1 q_1
>>> Filter: (((c)::text = '000002'::text) OR ((c)::text
>>> = '000001'::text))
>>> -> Seq Scan on q2 q_2
>>> Filter: (((c)::text = '000002'::text) OR ((c)::text
>>> = '000001'::text))
>>> (11 rows)
>>>
>>> Before the patch:
>>> postgres=# explain (costs off) select * from p left join q on p.c = q.c
>>> where (q.c = '000002' or q.c = '000001');
>>> QUERY PLAN
>>>
>>> --------------------------------------------------------------------------------------------
>>> Hash Join
>>> Hash Cond: ((p.c)::text = (q.c)::text)
>>> -> Append
>>> -> Seq Scan on p1 p_1
>>> -> Seq Scan on p2 p_2
>>> -> Seq Scan on p3 p_3
>>> -> Hash
>>> -> Append
>>> -> Seq Scan on q1 q_1
>>> Filter: (((c)::text = '000002'::text) OR ((c)::text
>>> = '000001'::text))
>>> -> Seq Scan on q2 q_2
>>> Filter: (((c)::text = '000002'::text) OR ((c)::text
>>> = '000001'::text))
>>> (12 rows)
>>>
>>> After the patch:
>>> QUERY PLAN
>>>
>>> --------------------------------------------------------------------------------------------
>>> Hash Join
>>> Hash Cond: ((p.c)::text = (q.c)::text)
>>> -> Append
>>> -> Seq Scan on p1 p_1
>>> -> Seq Scan on p2 p_2
>>> -> Hash
>>> -> Append
>>> -> Seq Scan on q1 q_1
>>> Filter: (((c)::text = '000002'::text) OR ((c)::text
>>> = '000001'::text))
>>> -> Seq Scan on q2 q_2
>>> Filter: (((c)::text = '000002'::text) OR ((c)::text
>>> = '000001'::text))
>>> (11 rows)
>>>
>>> --
>>> Best Regards
>>> Andy Fan (https://www.aliyun.com/)
>>>
>>
>>
>> Here is a performance test regarding this patch. In the following simple
>> case,
>> we can get 3x faster than before.
>>
>> create table p (a int, b int, c int) partition by list(c);
>> select 'create table p_'||i||' partition of p for values in (' || i ||
>> ');' from generate_series(1, 100)i; \gexec
>> insert into p select i, i, i from generate_series(1, 100)i;
>> create table m as select * from p;
>> analyze m;
>> analyze p;
>>
>> test sql: select * from m, p where m.c = p.c and m.c in (3, 10);
>>
>> With this patch: 1.1ms
>> Without this patch: 3.4ms
>>
>> I'm happy with the result and the implementation, I have add this into
>> commitfest https://commitfest.postgresql.org/32/2975/
>>
>> Thanks.
>>
>> --
>> Best Regards
>> Andy Fan (https://www.aliyun.com/)
>>
>
> Rebase to the current latest commit 678d0e239b.
>
>
Rebase to the latest commit ea1268f630 .

--
Best Regards
Andy Fan (https://www.aliyun.com/)

Attachment Content-Type Size
v3-0001-Make-some-static-functions-as-extern-and-extend-C.patch application/octet-stream 5.9 KB
v3-0002-Build-some-implied-pruning-quals-to-extend-the-us.patch application/octet-stream 42.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message er 2021-02-21 14:06:03 Re: GROUP BY DISTINCT
Previous Message Vik Fearing 2021-02-21 12:52:24 GROUP BY DISTINCT