Re: why partition pruning doesn't work?

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: why partition pruning doesn't work?
Date: 2018-06-07 02:51:09
Message-ID: 8d1da0f7-0cfa-307c-ea03-92b618f60c5e@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2018/06/06 18:52, David Rowley wrote:
> On 6 June 2018 at 18:05, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> On 2018/06/06 14:10, David Rowley wrote:
>>> I then decided that
>>> I didn't like the way we need to check which params are in the Expr
>>> each time we call partkey_datum_from_expr. It seems better to prepare
>>> this in advance when building the pruning steps. I started work on
>>> that, but soon realised that I'd need to pass a List of Bitmapsets to
>>> the executor. This is a problem as Bitmapset is not a Node type and
>>> cannot be copied with COPY_NODE_FIELD(). Probably this could be
>>> refactored to instead of passing 3 Lists in the PartitionPruneStepOp
>>> we could invent a new node type that just has 3 fields and store a
>>> single List.
>>
>> I wonder why we need to create those Bitmapsets in the planner? Why not
>> in ExecSetupPartitionPruneState()? For example, like how
>> context->exprstates is initialized.
>
> That seems like a good idea. Certainly much better than working them
> out each time we prune.
>
> v3 patch attached.

Thanks David. This one looks good. I also like it that hasparamlessexprs
is no longer determined and set in the planner.

I checked what happens with the cases that Ashutosh complained about
upthread and seems that the pruning works as expected.

create table t1 (a int, b int) partition by range (a);
create table t1p1 partition of t1 for values from (0) to (100);
create table t1p2 partition of t1 for values from (100) to (200);
create index on t1 (a);
insert into t1 select i, i from generate_series(0, 199) i;

explain (costs off, analyze) select * from t1 x left join t1 y on x.a =
y.b + 100 where y.a = 5;
QUERY PLAN

-----------------------------------------------------------------------------------------------
Nested Loop (actual time=0.294..0.371 rows=1 loops=1)
-> Append (actual time=0.067..0.092 rows=1 loops=1)
-> Bitmap Heap Scan on t1p1 y (actual time=0.049..0.059 rows=1
loops=1)
Recheck Cond: (a = 5)
Heap Blocks: exact=1
-> Bitmap Index Scan on t1p1_a_idx (actual
time=0.022..0.022 rows=1 loops=1)
Index Cond: (a = 5)
-> Append (actual time=0.192..0.219 rows=1 loops=1)
-> Index Scan using t1p1_a_idx on t1p1 x (never executed)
Index Cond: (a = (y.b + 100))
-> Index Scan using t1p2_a_idx on t1p2 x_1 (actual
time=0.134..0.145 rows=1 loops=1)
Index Cond: (a = (y.b + 100))
Planning Time: 5.314 ms
Execution Time: 0.938 ms
(14 rows)

Note that the condition x.a = y.b + 100 is able to prune t1p1, whereas on
HEAD it isn't.

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jonathan S. Katz 2018-06-07 03:08:03 Re: commitfest 2018-07
Previous Message Craig Ringer 2018-06-07 02:22:21 Re: libpq compression