Re: Add a greedy join search algorithm to handle large join problems

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Chengpeng Yan <chengpeng_yan(at)Outlook(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, John Naylor <johncnaylorls(at)gmail(dot)com>
Subject: Re: Add a greedy join search algorithm to handle large join problems
Date: 2025-12-09 23:30:47
Message-ID: cb313155-24c4-4838-a46b-44968993a6e2@vondra.me
Lists: pgsql-hackers

On 12/9/25 20:20, Tomas Vondra wrote:
> On 12/2/25 14:04, Chengpeng Yan wrote:
>> Hi,
>>
>>
>>
>>> On Dec 2, 2025, at 18:56, Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>>>
>>> I think a much broader evaluation will be needed, comparing not just the
>>> planning time, but also the quality of the final plan. Which for the
>>> starjoin tests does not really matter, as the plans are all equal in
>>> this regard.
>>
>>
>> Many thanks for your feedback.
>>
>> You are absolutely right — plan quality is also very important. In my
>> initial email I only showed the improvements in planning time, but did
>> not provide results regarding plan quality. I will run tests on more
>> complex join scenarios, evaluating both planning time and plan quality.
>>
>
> I was trying to do some simple experiments by comparing plans for TPC-DS
> queries, but unfortunately I get a lot of crashes with the patch. All
> the backtraces look very similar - see the attached example. The root
> cause seems to be that sort_inner_and_outer() sees
>
> inner_path = NULL
>
> I haven't investigated this very much, but I suppose the GOO code should
> be calling set_cheapest() from somewhere.
>

FWIW, after looking at the failing queries and tweaking them a bit, it
seems the issue is related to aggregates in the select list. For
example, this TPC-DS query (Q7) fails:

select i_item_id,
       avg(ss_quantity) agg1,
       avg(ss_list_price) agg2,
       avg(ss_coupon_amt) agg3,
       avg(ss_sales_price) agg4
from store_sales, customer_demographics, date_dim, item, promotion
where ss_sold_date_sk = d_date_sk and
      ss_item_sk = i_item_sk and
      ss_cdemo_sk = cd_demo_sk and
      ss_promo_sk = p_promo_sk and
      cd_gender = 'F' and
      cd_marital_status = 'W' and
      cd_education_status = 'Primary' and
      (p_channel_email = 'N' or p_channel_event = 'N') and
      d_year = 1998
group by i_item_id
order by i_item_id
LIMIT 100;

but if I remove the aggregates, it plans just fine:

select i_item_id
from store_sales, customer_demographics, date_dim, item, promotion
where ss_sold_date_sk = d_date_sk and
      ss_item_sk = i_item_sk and
      ss_cdemo_sk = cd_demo_sk and
      ss_promo_sk = p_promo_sk and
      cd_gender = 'F' and
      cd_marital_status = 'W' and
      cd_education_status = 'Primary' and
      (p_channel_email = 'N' or p_channel_event = 'N') and
      d_year = 1998
group by i_item_id
order by i_item_id
LIMIT 100;

The backtrace matches the one I already posted, so I'm not going to
post it again.

I looked at a couple more failing queries, and removing the aggregates
fixes them too. There may be other issues/crashes as well, of course.

regards

--
Tomas Vondra
