A problem about partitionwise join

From: Richard Guo <riguo(at)pivotal(dot)io>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: A problem about partitionwise join
Date: 2019-08-26 09:32:48
Message-ID: CAN_9JTzo_2F5dKLqXVtDX5V6dwqB0Xk+ihstpKEt3a1LT6X78A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi All,

To generate partitionwise join, we need to make sure there exists an
equi-join condition for each pair of partition keys, which is performed
by have_partkey_equi_join(). This makes sense and works well.

But if, let's say, one certain pair of partition keys (foo.k = bar.k)
has formed an equivalence class containing consts, no join clause would
be generated for it, since we have already generated 'foo.k = const' and
'bar.k = const' and pushed them into the proper restrictions earlier.

This will make partitionwise join fail to be planned if there are
multiple partition keys and the pushed-down restrictions 'xxx = const'
fail to prune away any partitions.

Consider the examples below:

create table p (k1 int, k2 int, val int) partition by range(k1,k2);
create table p_1 partition of p for values from (1,1) to (10,100);
create table p_2 partition of p for values from (10,100) to (20,200);

If we are joining on each pair of partition keys, we can generate
partitionwise join:

# explain (costs off)
select * from p as foo join p as bar on foo.k1 = bar.k1 and foo.k2 = bar.k2;
QUERY PLAN
----------------------------------------------------------------------
Append
-> Hash Join
Hash Cond: ((foo.k1 = bar.k1) AND (foo.k2 = bar.k2))
-> Seq Scan on p_1 foo
-> Hash
-> Seq Scan on p_1 bar
-> Hash Join
Hash Cond: ((foo_1.k1 = bar_1.k1) AND (foo_1.k2 = bar_1.k2))
-> Seq Scan on p_2 foo_1
-> Hash
-> Seq Scan on p_2 bar_1
(11 rows)

But if we add another qual 'foo.k2 = const', we will be unable to
generate partitionwise join any more, because have_partkey_equi_join()
thinks not every partition key has an equi-join condition.

# explain (costs off)
select * from p as foo join p as bar on foo.k1 = bar.k1 and foo.k2 = bar.k2
and foo.k2 = 16;
QUERY PLAN
-----------------------------------------
Hash Join
Hash Cond: (foo.k1 = bar.k1)
-> Append
-> Seq Scan on p_1 foo
Filter: (k2 = 16)
-> Seq Scan on p_2 foo_1
Filter: (k2 = 16)
-> Hash
-> Append
-> Seq Scan on p_1 bar
Filter: (k2 = 16)
-> Seq Scan on p_2 bar_1
Filter: (k2 = 16)
(13 rows)

Is this a problem?

Thanks
Richard

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Darafei Komяpa Praliaskouski 2019-08-26 10:27:27 Re: Yet another fast GiST build
Previous Message Rafia Sabih 2019-08-26 09:31:28 Re: Creating partitions automatically at least on HASH?