Re: Partition-wise join for join between (declaratively) partitioned tables

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-03-21 11:41:25
Message-ID: CAFjFpRedUZPa7tKbCLEGK3u5UWdDNQoN=eYfb7ieG5d0D1PbsQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Mon, Mar 20, 2017 at 10:17 PM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>>
>> On a further testing of this patch I find another case when it is
>> showing regression, the time taken with patch is around 160 secs and
>> without it is 125 secs.
>> Another minor thing to note that is planning time is almost twice with
>> this patch, though I understand that this is for scenarios with really
>> big 'big data' so this may not be a serious issue in such cases, but
>> it'd be good if we can keep an eye on this that it doesn't exceed the
>> computational bounds for a really large number of tables.
>
> Right, planning time would be proportional to the number of partitions
> at least in the first version. We may improve upon it later.
>
>> Please find the attached .out file to check the output I witnessed and
>> let me know if anymore information is required
>> Schema and data was similar to the preciously shared schema with the
>> addition of more data for this case, parameter settings used were:
>> work_mem = 1GB
>> random_page_cost = seq_page_cost = 0.1

this doesn't look good. Why do you set both these costs to the same value?

>
> The patch does not introduce any new costing model. It costs the
> partition-wise join as sum of costs of joins between partitions. The
> method to create the paths for joins between partitions is same as
> creating the paths for joins between regular tables and then the
> method to collect paths across partition-wise joins is same as
> collecting paths across child base relations. So, there is a large
> chance that the costing for joins between partitions might have a
> problem which is showing up here. There may be some special handling
> for regular tables versus child tables that may be the root cause. But
> I have not seen that kind of code till now.
>
> Can you please provide the outputs of individual partition-joins? If
> the plans for joins between partitions are same as the ones chosen for
> partition-wise joins, we may need to fix the existing join cost
> models.

Offlist, Rafia shared the outputs of joins between partitions and join
between partitioned table. The joins between partitions look similar
to those pick up by the partition-wise join. So, it looks that some
costing error in regular joins is resulting in an costing error in
partition-wise join as suspected. Attached the SQL and the output.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
queries.out application/octet-stream 110.3 KB
queries.sql application/octet-stream 2.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2017-03-21 11:46:00 Re: Partition-wise join for join between (declaratively) partitioned tables
Previous Message Etsuro Fujita 2017-03-21 11:21:11 postgres_fdw: correct regression test for parameterized scan for foreign table