Re: Partition-wise join for join between (declaratively) partitioned tables

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-04-26 16:00:50
Message-ID: CA+Tgmoa5Z-EiaBApsERXLV=sNCQLmL-RoGUFsYbUhQcRXH0rgg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Apr 24, 2017 at 7:06 AM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> This assumes that datums in partition bounds have a one-to-one mapping
> with the partitions, which isn't true for list partitions. For list
> partitions we have multiple datums, one per listed item, associated
> with a given partition. So, simply looping over the
> partitions of outer relations doesn't work; in fact there are two
> outer relations for a full outer join, so we have to loop over both of
> them together in a merge join fashion.

Maybe so, but my point is that it can be done with the original types,
without converting anything to a different type.
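
To make that concrete, here is a minimal sketch of walking two sorted
lists of list-partition bounds in merge-join fashion while leaving every
datum in its original type. It is not PostgreSQL code: the names
match_list_bounds and compare are hypothetical, Python's int and Decimal
merely stand in for two different types of one opfamily, and compare
stands in for that opfamily's cross-type comparison function.

```python
from decimal import Decimal
from typing import Callable, List, Optional, Tuple

# A list-partition bound: (datum, partition index). Several datums may
# point at the same partition.
Bound = Tuple[object, int]
Pair = Tuple[Optional[int], Optional[int]]


def compare(a, b) -> int:
    """Hypothetical cross-type comparator: returns -1, 0 or 1.

    Python compares int and Decimal exactly, so no datum is cast here.
    """
    return (a > b) - (a < b)


def match_list_bounds(
    outer: List[Bound],
    inner: List[Bound],
    cmp: Callable[[object, object], int] = compare,
) -> List[Pair]:
    """Pair up partitions by walking both sorted bound lists together.

    A datum present on only one side yields a pair with None on the
    other side, which is what a full outer join has to account for.
    """
    pairs: List[Pair] = []
    i = j = 0
    while i < len(outer) and j < len(inner):
        c = cmp(outer[i][0], inner[j][0])
        if c == 0:
            pairs.append((outer[i][1], inner[j][1]))
            i += 1
            j += 1
        elif c < 0:
            pairs.append((outer[i][1], None))
            i += 1
        else:
            pairs.append((None, inner[j][1]))
            j += 1
    # Whatever is left on either side has no counterpart.
    pairs.extend((part, None) for _, part in outer[i:])
    pairs.extend((None, part) for _, part in inner[j:])
    return pairs


# Outer bounds are ints, inner bounds are Decimals; neither is converted.
outer = [(1, 0), (2, 0), (5, 1)]
inner = [(Decimal(1), 0), (Decimal(3), 1), (Decimal(5), 1)]
print(match_list_bounds(outer, inner))
```

The sketch only reports which partitions line up datum by datum;
deciding whether the resulting pairs actually permit a partition-wise
join is a separate step that it does not attempt.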

> When a USING clause is used, the columns named in the USING clause from
> the joining relations are merged into a single column. Here it has
> used a "wider" type column t2.a as the merged column for t1.a and
> t2.a. The logic is in buildMergedJoinVar().

That relies on select_common_type(), which can error out if it can't
find a common type. That's OK for the current uses of that function,
because if it fails it means that the query is invalid. But it's not
OK for what you want here, because it's not OK to error out due to
inability to do a partition-wise join when a non-partition-wise join
would have worked. Also, note that all select_common_type() is really
doing is looking for the type within the type category that is marked
typispreferred, or else checking which direction has an implicit cast.
Neither of those things guarantees the property you want here, namely
that the "common" type is in the same opfamily and can store every
value of any of the input types without loss of precision. So I don't
think you can rely on that.
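
As a small illustration of the precision hazard, consider what happens
if a 64-bit integer bound is converted to a double-precision float. The
choice of those two types is an assumption made for the example, not
something taken from the patch, and to_float8 is a hypothetical helper
that models the cast with Python's float (an IEEE 754 double).

```python
def to_float8(n: int) -> float:
    """Model casting an integer bound to an IEEE 754 double."""
    return float(n)


# Two adjacent integer values, both well within the 64-bit integer range.
a = 2**53
b = 2**53 + 1

# In their original integer type the two bounds are distinct ...
print(a != b)

# ... but b has no exact double representation, so after conversion the
# two bounds compare as equal.
print(to_float8(a) == to_float8(b))
```

Two bounds that were distinct in their original type collapse into one
after the conversion, which is exactly the kind of loss a "common" type
would have to rule out.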

I'm going to say this one more time: I really, really, really think
you need to avoid trying to convert the partition bounds to a common
type. I said before that the infrastructure to do that is not present
in our type system, and I'm pretty sure that statement is 100%
correct. The fact that you can find other cases where we do something
sorta like that but in a different case with different requirements
doesn't make that false.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
