Re: [HACKERS] path toward faster partition pruning

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] path toward faster partition pruning
Date: 2018-03-28 09:29:26
Message-ID: b2b3c1c3-3cea-d0a5-3281-cf23321b0583@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2018/03/28 12:58, David Rowley wrote:
> On 27 March 2018 at 23:42, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> I have managed to make the recursion go away in the attached updated
>> version. I guess that's the result of employing the idea of a "output
>> register" for individual pruning steps as mentioned in Robert's email
>> upthread where he detailed the "pruning steps" approach [1].
>
> Thanks for making that work. I've only glanced at the patch, and not
> taken enough time to understand how the new parts work yet.
>
> In the meantime, I've attached some fixes for v41 which I previously
> submitted for v40.

Thank you. I've merged it.

Also, I have redesigned how we derive partition indexes after running
pruning steps. Previously, for each step we'd determine the indexes of
"partitions" that are not pruned leading to a list partition not being
pruned sometimes, as shown in the two recent examples. Instead, in the
new approach, we only keep track of the indexes of the "datums" that
satisfy individual pruning steps (both base pruning steps and combine
steps) and only figure out the partition indexes after we've determined
set of datums that survive all pruning steps. That is, after we're done
executing all pruning steps. Whether we need to scan special partitions
like null-only and default partition is tracked along with datum indexes
for each step. With this change, pruning works as expected in both examples:

create table lp (a char) partition by list (a);
create table lp_ad partition of lp for values in ('a', 'd');
create table lp_bc partition of lp for values in ('b', 'c');
create table lp_default partition of lp default;
explain (costs off) select * from lp where a > 'a' and a < 'd';
QUERY PLAN
-----------------------------------------------------------
Append
-> Seq Scan on lp_bc
Filter: ((a > 'a'::bpchar) AND (a < 'd'::bpchar))
-> Seq Scan on lp_default
Filter: ((a > 'a'::bpchar) AND (a < 'd'::bpchar))
(5 rows)

CREATE TABLE listp (a INT) PARTITION BY LIST(a);
CREATE TABLE listp1_3 PARTITION OF listp FOR VALUES IN (1, 3);
EXPLAIN SELECT * FROM listp WHERE a > 1 AND a < 3;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.00 rows=0 width=4)
One-Time Filter: false
(2 rows)

Moreover, with pruning now working at a high-level with datum indexes
instead of partition indexes, pruning for PartitionPruneStepOpNe is
simplified greatly. We simply delete from a bitmapset initially
containing the indexes of all datums in boundinfo the indexes of those
that appear in the query. So:

explain (costs off) select * from lp where a <> 'a' and a <> 'd';
QUERY PLAN
-------------------------------------------------------------
Append
-> Seq Scan on lp_bc
Filter: ((a <> 'a'::bpchar) AND (a <> 'd'::bpchar))
-> Seq Scan on lp_default
Filter: ((a <> 'a'::bpchar) AND (a <> 'd'::bpchar))
(5 rows)

where we delete indexes of 'a' and 'd' from the bitmapset initially
containing indexes of all datums, leaving us with only those of 'b' and
'c'. Also, the default partition is scanned as it would always be for a
PartitionPruneStepOpNe step.

Attached is the updated set of patches, which contains other miscellaneous
changes such as updated comments, beside the main changes described above.

Regards,
Amit

Attachment Content-Type Size
v42-0001-Add-partsupfunc-to-PartitionSchemeData.patch text/plain 3.4 KB
v42-0002-Add-more-tests-for-partition-pruning.patch text/plain 16.3 KB
v42-0003-Faster-partition-pruning.patch text/plain 124.0 KB
v42-0004-Add-only-unpruned-partitioned-child-rels-to-part.patch text/plain 24.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2018-03-28 09:51:43 Re: Oddity in COPY FROM handling of check constraints on partition tables
Previous Message Aleksander Alekseev 2018-03-28 09:21:15 Re: new function for tsquery creartion