Re: A problem about partitionwise join

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, ashutosh(dot)bapat(at)enterprisedb(dot)com
Subject: Re: A problem about partitionwise join
Date: 2024-05-08 09:01:51
Message-ID: CAMbWs48sQ78BoQ4JkY=FHLYtjiiD-5cK2QtQG8WkkRb8y6ZOLg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, May 3, 2024 at 9:31 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, May 3, 2024 at 7:47 AM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> > I think one concern regarding performance cost is that the function
> > exprs_known_equal() would be called O(N^2) times, where N is the number
> > of partition key expressions. But I think this might not be a problem.
> > The number of a joinrel's partition key expressions would only be equal
> > to the join degree, since each base relation within the join contributes
> > only one partition key expression, according to
> > set_joinrel_partition_key_exprs(). This number would not scale with the
> > number of partitions. But I have not measured the performance in
> > practice by running benchmarks. Maybe I'm just wrong.
>
> I don't know, but I do think you should do some benchmarking and see
> if you can find cases where this regresses performance. In my opinion,
> this feature is worth having only if it's basically free. There's lots
> of things we could do in the planner that would give better (perhaps
> much better) plans in certain cases, but which we don't do because in
> all other cases we'd pay a certain number of CPU cycles to have them
> and it just doesn't make sense given how often we'd actually get a
> benefit. This might be another such case.

Thank you for the suggestion. In order to obtain a rough estimation of
how this patch affects planning time, I did the following benchmarking:

* create a partitioned table with 3 keys and 1000 partitions, which
looks like

Partitioned table "public.t1_parted"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
b | integer | | |
c | integer | | |
d | integer | | |
Partition key: RANGE (a, b, c)
Number of partitions: 1000 (Use \d+ to list them.)

* compose a query involving 5-way joins of this partitioned table, which
looks like:

select * from t1_parted t1
natural join t1_parted t2
natural join t1_parted t3
natural join t1_parted t4
natural join t1_parted t5
where t1.b = 1 and t1.c = 2;

This query is composed in such a way that it could actually generate
partitionwise join, because there exist equi-join condition for each
pair of matching partition keys; but currently on master it is not able
to generate partitionwise join, because of the filters 't1.b = 1 and
t1.c = 2', which is the issue fixed by this patch.

* run this query 5 times with enable_partitionwise_join set to on, and
collect the average planning time on master and on patched.

To ensure fairness, on master, a little hack is required to enable the
generation of partitionwise join for this query. This allows us to
eliminate any potential impact on planning partitionwise joins and
evaluate the effects of this patch accurately.

Below is what I got on my local machine.

-- on master

measurement | average | maximum | minimum | std_dev |
std_dev_as_perc_of_avg
---------------+----------+----------+----------+---------+------------------------
planning time | 30355.07 | 33148.47 | 29020.82 | 1681.23 | 5.54%

-- on patched

measurement | average | maximum | minimum | std_dev |
std_dev_as_perc_of_avg
---------------+----------+----------+----------+---------+------------------------
planning time | 30600.00 | 33523.23 | 28680.75 | 1861.90 | 6.08%

-- without partitionwise join

measurement | average | maximum | minimum | std_dev |
std_dev_as_perc_of_avg
---------------+---------+---------+---------+---------+------------------------
planning time | 4840.18 | 5184.05 | 4528.87 | 299.98 | 6.20%

So it seems that the planning time is not significantly affected by this
patch, particularly when compared to the impact caused by partitionwise
join.

BTW, I was using Ashutosh's script [1] for setting up the benchmarking.
I find the script very handy.

[1]
https://www.postgresql.org/message-id/flat/CAExHW5s%3DbCLMMq8n_bN6iU%2BPjau0DS3z_6Dn6iLE69ESmsPMJQ%40mail.gmail.com

Thanks
Richard

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2024-05-08 09:20:53 Re: Synchronizing slots from primary to standby
Previous Message Hayato Kuroda (Fujitsu) 2024-05-08 08:26:42 RE: Slow catchup of 2PC (twophase) transactions on replica in LR