Re: [PATCH] Lazy hashaggregate when no aggregation is needed

From: "Etsuro Fujita" <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'Robert Haas'" <robertmhaas(at)gmail(dot)com>
Cc: "'Jay Levitt'" <jay(dot)levitt(at)gmail(dot)com>, "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "'PostgreSQL-development'" <pgsql-hackers(at)postgresql(dot)org>, "'Francois Deliege'" <fdeliege(at)gmail(dot)com>
Subject: Re: [PATCH] Lazy hashaggregate when no aggregation is needed
Date: 2012-06-29 02:22:15
Message-ID: 00aa01cd559d$ffb32790$ff1976b0$@lab.ntt.co.jp
Lists: pgsql-hackers

Hi Ants,

> -----Original Message-----
> From: Ants Aasma [mailto:ants(at)cybertec(dot)at]
> Sent: Wednesday, June 27, 2012 9:23 PM
> To: Robert Haas
> Cc: Etsuro Fujita; Jay Levitt; Tom Lane; PostgreSQL-development; Francois
> Deliege
> Subject: Re: [HACKERS] [PATCH] Lazy hashaggregate when no aggregation is needed
>
> Sorry about the delay in answering. I have been swamped with non-PG
> related things lately.
>
> On Tue, Jun 26, 2012 at 11:08 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > On Fri, Jun 22, 2012 at 10:12 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >> On Tue, Jun 19, 2012 at 5:41 AM, Etsuro Fujita
> >> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >>>> I'm confused by this remark, because surely the query planner does it this
> >>>> way only if there's no LIMIT.  When there is a LIMIT, we choose based on
> >>>> the startup cost plus the estimated fraction of the total cost we expect
> >>>> to pay based on dividing the LIMIT by the overall row count estimate.  Or
> >>>> is this not what you're talking about?
>
> My reasoning was that query_planner returns the cheapest-total path
> and cheapest fractional presorted (by the aggregation pathkeys). When
> evaluating hash-aggregates with this patch these two are indeed
> compared considering the estimated fraction of the total cost, but this
> might miss the cheapest fractional unordered path for lazy hashaggregates.
>
> Reviewing the code now I discovered this path could be picked out from
> the pathlist, just like it is done by
> get_cheapest_fractional_path_for_pathkeys when pathkeys is nil. This
> would need to be returned in addition to the other two paths. To
> minimize overhead, this should only be done when we possibly want to
> consider lazy hash-aggregation (there is a group clause with no
> aggregates and grouping is hashable). But this is starting to get
> pretty crufty considering that there doesn't seem to be any really
> compelling use cases for this.
>
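
For readers following along, the selection described above can be sketched in a few lines of C. This is only an illustration, not the patch's code: SketchPath, sketch_fractional_cost and sketch_cheapest_fractional are hypothetical names, and the cost model assumes the run cost accrues linearly between startup and total cost, which is how I understand the planner's fractional comparison to work.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a planner path: only the two costs. */
typedef struct SketchPath
{
    double startup_cost;    /* cost before the first row can be returned */
    double total_cost;      /* cost to return all rows */
} SketchPath;

/*
 * Estimated cost of fetching only `fraction` of the rows, assuming the
 * run cost accrues linearly between startup_cost and total_cost.
 * A fraction outside (0, 1) is treated as "fetch everything".
 */
static double
sketch_fractional_cost(const SketchPath *path, double fraction)
{
    if (fraction <= 0.0 || fraction >= 1.0)
        return path->total_cost;
    return path->startup_cost +
        fraction * (path->total_cost - path->startup_cost);
}

/*
 * Scan every candidate path, ignoring sort order, and return the one
 * that is cheapest for the given fraction (NULL if there are none).
 */
static const SketchPath *
sketch_cheapest_fractional(const SketchPath *paths, size_t npaths,
                           double fraction)
{
    const SketchPath *best = NULL;
    size_t i;

    for (i = 0; i < npaths; i++)
    {
        if (best == NULL ||
            sketch_fractional_cost(&paths[i], fraction) <
            sketch_fractional_cost(best, fraction))
            best = &paths[i];
    }
    return best;
}
```

With two made-up paths, one costing 0 to start and 1000 in total and the other 400 to start and 500 in total, the first wins when only 1% of the rows are needed and the second wins when all rows are fetched. That is the case a comparison limited to the cheapest-total and cheapest-presorted paths could miss.
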
> > Ants, do you intend to update this patch for this CommitFest?  Or at
> > all?  It seems nobody's too excited about this, so I'm not sure
> > whether it makes sense for you to put more work on it.  But please
> > advise as to your plans.
>
> If anyone thinks that this patch might be worth considering, then I'm
> prepared to do minor cleanup this CF (I saw some possibly unnecessary
> cruft in agg_fill_hash_and_retrieve). On the other hand, if you think
> the use case is too marginal to consider for inclusion then I won't
> shed a tear if this gets rejected. For me this was mostly a learning
> experience for poking around in the planner.

Honestly, I'm not sure that it's worth including this, considering the use
case...

Thanks,

Best regards,
Etsuro Fujita

> Ants Aasma
> --
> Cybertec Schönig & Schönig GmbH
> Gröhrmühlgasse 26
> A-2700 Wiener Neustadt
> Web: http://www.postgresql-support.de
>
