Re: Parameterized aggregate subquery (was: Pull up aggregate subquery)

From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Yeb Havinga <yebhavinga(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parameterized aggregate subquery (was: Pull up aggregate subquery)
Date: 2011-06-30 08:37:43
Message-ID: BANLkTinAFrutO_cd9hsTb_5C5u5kMYcxxQ@mail.gmail.com
Lists: pgsql-hackers

2011/6/30 Yeb Havinga <yebhavinga(at)gmail(dot)com>:
> On 2011-06-29 19:22, Hitoshi Harada wrote:
>>
>> Other things are all good points. Thanks for elaborate review!
>> More than anything, I'm going to fix the 6) issue, at least to find the
>> cause.
>>
> Some more questions:
> 8) why are cheapest start path and cheapest total path in
> best_inner_subqueryscan the same?

Because best_inner_indexscan provides both. Actually, one of them is
enough so far. I aligned it with the existing interface, but the two
could be merged into one.

> 10) I have a hard time imagining use cases that will actually result in an
> alternative plan, especially since not all subqueries are allowed to have
> quals pushed down into them, and, as Simon Riggs pointed out, many users
> will write queries like this with the subqueries pulled up. If it is the
> case that the subqueries that can't be pulled up have a large overlap with
> the ones that are not pushdown safe (limit, set operations etc), there might
> be few actual use cases for this patch.

I have seen many cases where this planner hack would help
significantly, and which were difficult to rewrite. Why were they
difficult to rewrite? Because the quals on size_m (and in fact they
have quals on size_l too) are usually very complicated (5-10 op
clauses), and the join+agg part is itself a kind of subquery in
another, bigger query. Of course there were workarounds, such as
splitting the statement in two: filtering size_m first, then
aggregating size_l by the result of the first statement. However,
that is counterintuitive. The reason an RDBMS has a planner is to let
users write statements as simply as they need. I don't know whether
the example I raise here is common or not, but I believe it
represents a plain "one to many" relation, so there should be many
users who just don't realize they are currently suffering from slow
query performance.
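
To make the two query shapes concrete, here is a minimal sketch of the
single-statement form next to the split workaround. It uses Python's
sqlite3 module as a stand-in for PostgreSQL, and the column names
(id, m_id, val), the sample rows, and the simple val = 'a' qual are
assumptions for illustration, not the actual test case. It only shows
that the two forms return the same rows; it says nothing about which
plan the planner picks.

```python
import sqlite3

# In-memory database; the schema below is an assumed one-to-many layout
# (size_m is the "one" side, size_l is the "many" side).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE size_m (id INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE size_l (id INTEGER PRIMARY KEY, m_id INTEGER, val TEXT);
    INSERT INTO size_m VALUES (1, 'a'), (2, 'b'), (3, 'a');
    INSERT INTO size_l (m_id, val)
        VALUES (1, 'xx'), (1, 'yyy'), (2, 'z'), (3, 'wwww');
""")

# Form 1: one statement, joining size_m to an aggregate subquery on size_l.
single = conn.execute("""
    SELECT m.id, l.sum_len
    FROM size_m m
    JOIN (SELECT m_id, SUM(LENGTH(val)) AS sum_len
          FROM size_l
          GROUP BY m_id) l ON l.m_id = m.id
    WHERE m.val = 'a'
    ORDER BY m.id
""").fetchall()

# Form 2: the workaround. First filter size_m, then aggregate only the
# size_l rows that belong to the ids found by the first statement.
ids = [row[0] for row in conn.execute(
    "SELECT id FROM size_m WHERE val = 'a' ORDER BY id")]
placeholders = ",".join("?" * len(ids))
split = conn.execute(
    "SELECT m_id, SUM(LENGTH(val)) FROM size_l "
    f"WHERE m_id IN ({placeholders}) GROUP BY m_id ORDER BY m_id",
    ids,
).fetchall()

print(single)
print(split)
```

The second form needs an extra round trip and hand-built parameter
passing, which is the part that gets painful once the quals are
complicated or the join is nested inside a bigger query.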

> I think the most important thing for this patch to go forward is to have a
> few examples, from which it's clear that the patch is beneficial.

What would be good examples to show the benefit of the patch? I guess
the size_m/size_l test case shows it. What do you think is lacking in
that case?

Regards,

--
Hitoshi Harada
