Re: hash agg is slower on wide tables?

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: hash agg is slower on wide tables?
Date: 2015-02-22 15:11:30
Message-ID: 20150222151130.GC21496@awork2.anarazel.de
Lists: pgsql-hackers

On 2015-02-22 09:58:31 -0500, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > I've wondered before if we shouldn't use the caching via
> > slot->tts_values so freely - if you only use a couple values from a wide
> > tuple the current implementation really sucks if those few aren't at the
> > beginning of the tuple.
>
> Don't see how you expect to get a win that way. Visiting column k
> requires crawling over columns 1..k-1 in practically all cases.
> You could maybe save a cycle or so by omitting the physical store
> into the Datum array, assuming that you never did need the column
> value later ... but the extra bookkeeping for more complicated
> tracking of which columns had been extracted would eat that savings
> handily.

That depends a bit on the specifics. In this case attcacheoff would allow
direct access to the column, which is surely going to be more efficient
than crawling over the preceding ones. I'm not sure how frequently that
happens in practice, though.
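
To make the distinction concrete, here is a rough sketch of the two access
paths. It is not the actual PostgreSQL code: the struct and the sketch_*
function names are hypothetical, and it assumes no NULLs, no alignment
padding, and a 1-byte length prefix for variable-width values. Under those
assumptions, a column's offset can be cached whenever only fixed-width
columns precede it; anything after the first variable-width column has to be
found by crawling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified column descriptor (not PostgreSQL's real struct). */
typedef struct
{
    int len;      /* fixed width in bytes, or -1 for variable width */
    int cacheoff; /* cached byte offset within the row, or -1 if unknown */
} SketchAttr;

/* Width of the value stored at p: fixed, or 1-byte length prefix + payload. */
static size_t
sketch_att_width(const SketchAttr *att, const uint8_t *p)
{
    if (att->len >= 0)
        return (size_t) att->len;
    return 1 + (size_t) p[0];
}

/*
 * Cache the offset of every column whose position does not depend on the
 * row's data: everything up to and including the first variable-width column.
 */
static void
sketch_fill_cacheoff(SketchAttr *atts, int natts)
{
    int off = 0;
    int i;

    for (i = 0; i < natts; i++)
    {
        atts[i].cacheoff = off;
        if (atts[i].len < 0)
        {
            i++;            /* later columns depend on this value's length */
            break;
        }
        off += atts[i].len;
    }
    for (; i < natts; i++)
        atts[i].cacheoff = -1;
}

/*
 * Return a pointer to column k of the row at data. *steps reports how many
 * preceding columns had to be crawled over (0 when the cached offset is used).
 */
static const uint8_t *
sketch_locate(const SketchAttr *atts, const uint8_t *data, int k, int *steps)
{
    const uint8_t *p = data;
    int i;

    assert(k >= 0);
    *steps = 0;

    if (atts[k].cacheoff >= 0)
        return data + atts[k].cacheoff;     /* direct access */

    for (i = 0; i < k; i++)                 /* crawl over columns 0..k-1 */
    {
        p += sketch_att_width(&atts[i], p);
        (*steps)++;
    }
    return p;
}
```

With columns (int4, int4, varlena, int4), sketch_locate reaches the second
column without touching the first, but still has to walk three columns to
reach the last one. How often the direct path applies depends entirely on
where the variable-width columns sit in the table.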

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
