Re: Pull up aggregate sublink (was: Parameterized aggregate subquery (was: Pull up aggregate subquery))

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Yeb Havinga <yebhavinga(at)gmail(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Pull up aggregate sublink (was: Parameterized aggregate subquery (was: Pull up aggregate subquery))
Date: 2011-07-27 14:16:21
Message-ID: CA+TgmoZaXoJdofV+maCquWOc2kkYgQbWNUbCASgdMZNocCcROQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 26, 2011 at 5:37 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Yeb Havinga <yebhavinga(at)gmail(dot)com> writes:
>> A few days ago I read Tomas Vondra's blog post about dss tpc-h queries
>> on PostgreSQL at
>> http://fuzzy.cz/en/articles/dss-tpc-h-benchmark-with-postgresql/ - in
>> which he showed how to manually pull up a dss subquery to get a large
>> speed up. Initially I thought: cool, this is probably now handled by
>> Hitoshi's patch, but it turns out the subquery type in the dss query is
>> different.
>
> Actually, I believe this example is the exact opposite of the
> transformation Hitoshi proposes.  Tomas was manually replacing an
> aggregated subquery by a reference to a grouped table, which can be
> a win if the subquery would be executed enough times to amortize
> calculation of the grouped table over all the groups (some of which
> might never be demanded by the outer query).  Hitoshi was talking about
> avoiding calculations of grouped-table elements that we don't need,
> which would be a win in different cases.  Or at least that was the
> thrust of his original proposal; I'm not sure where the patch went since
> then.
>
> This leads me to think that we need to represent both cases as the same
> sort of query and make a cost-based decision as to which way to go.
> Thinking of it as a pull-up or push-down transformation is the wrong
> approach because those sorts of transformations are done too early to
> be able to use cost comparisons.

I think you're right. OTOH, our estimates of what will pop out of an
aggregate are so poor that denying the user the ability to control the
plan through how they write the query might be a net negative. :-(
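
For concreteness, here is a rough sketch of the two query shapes being compared above: a correlated aggregate subquery versus a join against a grouped table. The table and column names are only illustrative assumptions loosely modeled on TPC-H naming, not the query from the blog post, and it runs on SQLite via Python's sqlite3 module purely to show that the two spellings return the same rows. It says nothing about how the PostgreSQL planner would cost either one.

```python
import sqlite3

# Hypothetical toy schema; names are assumptions for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (p_partkey INTEGER PRIMARY KEY, p_brand TEXT);
    CREATE TABLE lineitem (l_partkey INTEGER, l_quantity REAL);
    INSERT INTO part VALUES (1, 'A'), (2, 'A'), (3, 'B');
    INSERT INTO lineitem VALUES (1, 10), (1, 2), (2, 5), (2, 5), (3, 1), (3, 9);
""")

# Shape 1: correlated aggregate subquery. The aggregate is computed only for
# the l_partkey values the outer query actually asks about.
correlated = """
    SELECT l.l_partkey, l.l_quantity
    FROM lineitem l
    JOIN part p ON p.p_partkey = l.l_partkey
    WHERE p.p_brand = 'A'
      AND l.l_quantity < (SELECT 0.5 * AVG(l2.l_quantity)
                          FROM lineitem l2
                          WHERE l2.l_partkey = l.l_partkey)
    ORDER BY 1, 2
"""

# Shape 2: join against a grouped table. The aggregate is computed once for
# every group, including groups the outer query may never demand.
grouped = """
    SELECT l.l_partkey, l.l_quantity
    FROM lineitem l
    JOIN part p ON p.p_partkey = l.l_partkey
    JOIN (SELECT l_partkey, 0.5 * AVG(l_quantity) AS threshold
          FROM lineitem
          GROUP BY l_partkey) g ON g.l_partkey = l.l_partkey
    WHERE p.p_brand = 'A'
      AND l.l_quantity < g.threshold
    ORDER BY 1, 2
"""

rows_correlated = conn.execute(correlated).fetchall()
rows_grouped = conn.execute(grouped).fetchall()
print(rows_correlated == rows_grouped)
```

Which shape is cheaper depends on how many groups the outer query touches, and that is exactly the number we estimate badly.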

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
