Re: [HACKERS] path toward faster partition pruning

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] path toward faster partition pruning
Date: 2018-02-27 03:59:44
Message-ID: dd206a78-a314-ad52-5d95-d2669427c841@lab.ntt.co.jp
Lists: pgsql-hackers

On 2018/02/27 3:27, Robert Haas wrote:
> On Sun, Feb 25, 2018 at 11:10 PM, Amit Langote
> <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> I think I'm convinced that partopcintype OIDs can be used where I thought
>> parttypid ones were necessary. The pruning patch uses the respective OID
>> from this array when extracting the datum from an OpExpr to be compared
>> with the partition bound datums. It's sensible, I now think, to require
>> the extracted datum to be of partition opclass declared input type, rather
>> than the type of the partition key involved. So, I removed the parttypid
>> that I'd added to PartitionSchemeData.
>>
>> Updated the comments to make clear the distinction between and purpose of
>> having both parttypcoll and partcollation. Also expanded the comment
>> about partsupfunc a bit.
>
> I don't think this fundamentally fixes the problem, although it does
> narrow it. By requiring partcollation to match across every relation
> with the same PartitionScheme, you're making partition-wise join fail
> to work in some cases where it previously did. Construct a test case
> where parttypcoll matches and partcollation doesn't; then, without the
> patch, the two relations will have the same PartitionScheme and thus
> be eligible for a partition-wise join, but with the patch, they will
> have different PartitionSchemes and thus won't.

I may be confused, but shouldn't two tables partitioned on the same column
(of the same collatable type) but using different collations for
partitioning end up with different PartitionSchemes? Different
partitioning collations would mean that the same data may end up in
different partitions of the respective tables.

create table p (a text) partition by range (a collate "en_US");
create table p1 partition of p for values from ('a') to ('m');
create table p2 partition of p for values from ('m') to ('z ');

create table q (a text) partition by range (a collate "C");
create table q1 partition of q for values from ('a') to ('m');
create table q2 partition of q for values from ('m') to ('z ');

insert into p values ('A');
INSERT 0 1

insert into q values ('A');
ERROR: no partition of relation "q" found for row
DETAIL: Partition key of the failing row contains (a) = (A).
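
The difference comes down to how 'A' compares with the lower bound 'a'
under each collation. A minimal sketch of that comparison, assuming the
"en_US" collation exists on the system and sorts 'a' before 'A' (as the
successful insert into p implies); the column aliases are made up for
illustration:

```sql
-- Does 'A' fall inside the range ['a', 'm') under each collation?
-- An explicit COLLATE on one operand decides the collation of the comparison.
select ('A'::text collate "C" >= 'a'
        and 'A'::text collate "C" < 'm')      as in_q1_range,
       ('A'::text collate "en_US" >= 'a'
        and 'A'::text collate "en_US" < 'm')  as in_p1_range;
```

Under "C", strings compare by byte value, so 'A' (0x41) sorts before 'a'
(0x61) and falls below q1's lower bound, which is why q has no partition
for it.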

You may say that the partition bounds would likely have to differ too in
this case, so partition-wise join wouldn't occur anyway, but I'm wondering
whether a mismatch of partcollation isn't by itself enough to conclude that.

Thanks,
Amit
