Re: [HACKERS] path toward faster partition pruning

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Amit Langote <amitlangote09(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] path toward faster partition pruning
Date: 2018-03-30 05:38:44
Message-ID: fc73cef4-6879-26c3-6859-2f910640234a@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hi David.

On 2018/03/29 20:08, David Rowley wrote:
> I've looked at the v44 patch.

Thank you.

> Thanks for making those changes.
>
> The new not-equal handling code is not quite right.
>
> DROP TABLE listp;
> CREATE TABLE listp (a INT) PARTITION BY LIST(a);
> CREATE TABLE listp1_3 PARTITION OF listp FOR VALUES IN(1,3);
> CREATE TABLE listp_default PARTITION OF listp DEFAULT;
>
> EXPLAIN SELECT * FROM listp WHERE a <> 1;
> QUERY PLAN
> ------------------------------------------------------------------
> Append (cost=0.00..54.56 rows=2537 width=4)
> -> Seq Scan on listp1_3 (cost=0.00..41.88 rows=2537 width=4)
> Filter: (a <> 1)
> (3 rows)
>
> The default should be included here.
>
> INSERT INTO listp VALUES(1),(2),(3);
> SELECT * FROM listp WHERE a <> 1;
> a
> ---
> 3
> (1 row)

Good catch! Indeed, the default partition should not have been pruned
away in this case.

> This code assumes its fine to just reverse the setting for default:
>
> result->scan_default = !source->scan_default;
>
> More complex handling is needed here.

Hmm, I thought about this and came to a conclusion that we should *always*
scan the default partition in this case. Inversion step removes all the
datums selected by the source step from the set of *all* datums that the
currently defined set of partitions allow. If there's a default partition
in the mix, that means the latter contains all the datums of the partition
key's data type. Irrespective of whether or not the source step selected
the default partition, there would be datums that would be in the set
after inversion which in turn would be in the default partition, if not in
some non-default partition that would've been selected. I have written a
comment there trying to explain this, but I may not have been able to
articulate it properly. Please check. Or does this sound just wrong?

> I've attached a diff for a small set of other things I noticed while
reviewing.

Thanks, merged.

Please find attached the updated patches.

Thanks,
Amit

Attachment Content-Type Size
v45-0001-Add-partsupfunc-to-PartitionSchemeData.patch text/plain 3.4 KB
v45-0002-Add-more-tests-for-partition-pruning.patch text/plain 16.7 KB
v45-0003-Faster-partition-pruning.patch text/plain 123.3 KB
v45-0004-Add-only-unpruned-partitioned-child-rels-to-part.patch text/plain 24.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2018-03-30 05:56:08 Re: [HACKERS] SERIALIZABLE with parallel query
Previous Message Kyotaro HORIGUCHI 2018-03-30 05:25:32 Re: Question about WalSndWriteData