Re: partition pruning doesn't work with IS NULL clause in multikey range partition case

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: partition pruning doesn't work with IS NULL clause in multikey range partition case
Date: 2018-07-11 10:50:37
Message-ID: CAFiTN-vNtLh_n7Dw7GfEidjQ78ThapgOUmof9uct=TwEb_XNAg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 11, 2018 at 3:06 PM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> Hi,
> Consider following test case.
> create table prt (a int, b int, c int) partition by range(a, b);
> create table prt_p1 partition of prt for values (0, 0) to (100, 100);
> create table prt_p1 partition of prt for values from (0, 0) to (100, 100);
> create table prt_p2 partition of prt for values from (100, 100) to (200, 200);
> create table prt_def partition of prt default;
>
> In a range partitioned table, a row with any partition key NULL goes
> to the default partition if it exists.
> insert into prt values (null, 1);
> insert into prt values (1, null);
> insert into prt values (null, null);
> select tableoid::regclass, * from prt;
> tableoid | a | b | c
> ----------+---+---+---
> prt_def | | 1 |
> prt_def | 1 | |
> prt_def | | |
> (3 rows)
>
> There's a comment in get_partition_for_tuple(), which says so.
> /*
> * No range includes NULL, so this will be accepted by the
> * default partition if there is one, and otherwise rejected.
> */
>
> But when there is IS NULL clause on any of the partition keys with
> some condition on other partition key, all the partitions scanned. I
> expected pruning to prune all the partitions except the default one.
>
> explain verbose select * from prt where a is null and b = 100;
> QUERY PLAN
> ----------------------------------------------------------------------
> Append (cost=0.00..106.52 rows=3 width=12)
> -> Seq Scan on public.prt_p1 (cost=0.00..35.50 rows=1 width=12)
> Output: prt_p1.a, prt_p1.b, prt_p1.c
> Filter: ((prt_p1.a IS NULL) AND (prt_p1.b = 100))
> -> Seq Scan on public.prt_p2 (cost=0.00..35.50 rows=1 width=12)
> Output: prt_p2.a, prt_p2.b, prt_p2.c
> Filter: ((prt_p2.a IS NULL) AND (prt_p2.b = 100))
> -> Seq Scan on public.prt_def (cost=0.00..35.50 rows=1 width=12)
> Output: prt_def.a, prt_def.b, prt_def.c
> Filter: ((prt_def.a IS NULL) AND (prt_def.b = 100))
> (10 rows)
>
> I thought that the following code in get_matching_range_bounds()
> /*
> * If there are no datums to compare keys with, or if we got an IS NULL
> * clause just return the default partition, if it exists.
> */
> if (boundinfo->ndatums == 0 || !bms_is_empty(nullkeys))
> {
> result->scan_default = partition_bound_has_default(boundinfo);
> return result;
> }
>
> would do the trick but through the debugger I saw that nullkeys is
> NULL for this query.
>
> I didn't investigate further to see why nullkeys is NULL, but it looks
> like that's the problem and we are missing an optimization.

I think the problem is that the gen_partprune_steps_internal expect
that all the keys should match to IS NULL clause, only in such case
the "partition pruning step" will store the nullkeys.

After a small change, it is able to prune the partitions.

--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -857,7 +857,7 @@
gen_partprune_steps_internal(GeneratePruningStepsContext *context,
* If generate_opsteps is set to false it means no OpExprs were directly
* present in the input list.
*/
- if (!generate_opsteps)
+ if (nullkeys || !generate_opsteps)
{
/*
* Generate one prune step for the information derived
from IS NULL,
@@ -865,8 +865,7 @@
gen_partprune_steps_internal(GeneratePruningStepsContext *context,
* clauses for all partition keys.
*/
if (!bms_is_empty(nullkeys) &&
- (part_scheme->strategy != PARTITION_STRATEGY_HASH ||
- bms_num_members(nullkeys) == part_scheme->partnatts))
+ (part_scheme->strategy != PARTITION_STRATEGY_HASH))
{
PartitionPruneStep *step;

postgres=# explain verbose select * from prt where a is null and b = 100;
QUERY PLAN
----------------------------------------------------------------------
Append (cost=0.00..35.51 rows=1 width=12)
-> Seq Scan on public.prt_def (cost=0.00..35.50 rows=1 width=12)
Output: prt_def.a, prt_def.b, prt_def.c
Filter: ((prt_def.a IS NULL) AND (prt_def.b = 100))
(4 rows)

Above fix is just to show the root cause of the issue, I haven't
investigated that what should be the exact fix for this issue.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2018-07-11 10:52:09 Re: Jsonb transform for pl/python
Previous Message Andrew Gierth 2018-07-11 10:38:46 Re: Problem with tupdesc in jsonb_to_recordset