Re: hash agg is slower on wide tables?

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: hash agg is slower on wide tables?
Date: 2015-02-22 12:22:11
Message-ID: 20150222122211.GD6093@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2015-02-22 10:33:16 +0000, Andrew Gierth wrote:
> This is, if I'm understanding the planner logic right, physical-tlist
> optimization; it's faster for a table scan to simply return the whole
> row (copying nothing, just pointing to the on-disk tuple) and let
> hashagg pick out the columns it needs, rather than for the scan to run a
> projection step just to select specific columns.
>
> If there's a Sort step, this isn't done because Sort neither evaluates
> its input nor projects new tuples on its output, it simply accepts the
> tuples it receives and returns them with the same structure. So now it's
> important to have the node providing input to the Sort projecting out
> only the minimum required set of columns.
>
> Why it's slower on the wider table... that's less obvious.

It's likely to just be tuple deforming. I've not tried it but I'd bet
you'll see slot_deform* very high in the profile. For the narrow table
only two attributes need to be extracted, for the wider one everything
up to a11 will get extracted.

I've wondered before if we shouldn't use the caching via
slot->tts_values so freely - if you only use a couple values from a wide
tuple the current implementation really sucks if those few aren't at the
beginning of the tuple.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Petr Jelinek 2015-02-22 13:04:18 Re: Replication identifiers, take 4
Previous Message Pavel Stehule 2015-02-22 11:19:39 Re: proposal: searching in array function - array_position