Re: hash agg is slower on wide tables?

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: hash agg is slower on wide tables?
Date: 2015-02-22 10:33:16
Message-ID: 87fv9ycldp.fsf@news-spur.riddles.org.uk
Lists: pgsql-hackers

>>>>> "Pavel" == Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:

Pavel> why we read all columns from t1?
[...]
Pavel> so it looks so hashagg doesn't eliminate source columns well

I don't think it's supposed to eliminate them.

This is, if I'm understanding the planner logic right, physical-tlist
optimization; it's faster for a table scan to simply return the whole
row (copying nothing, just pointing to the on-disk tuple) and let
hashagg pick out the columns it needs, rather than for the scan to run a
projection step just to select specific columns.
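
A toy model of that trade-off, as a sketch only: this is not PostgreSQL
code, and TABLE, scan_physical, scan_projected and hash_agg are all
hypothetical names. It assumes a table is a list of tuples, that the
"physical" scan hands back the stored tuple itself, and that a projecting
scan has to build a new narrow tuple for every row.

```python
from collections import defaultdict

# Hypothetical wide table: column 0 is the grouping key, column 1 the value,
# and the remaining 20 columns are padding the aggregate never looks at.
TABLE = [(i % 3, i) + ("pad",) * 20 for i in range(9)]

def scan_physical(table):
    """Return each stored row as-is: no per-row copy, just a reference."""
    for row in table:
        yield row

def scan_projected(table, cols):
    """Run a projection step: build a new narrow tuple for every row."""
    for row in table:
        yield tuple(row[c] for c in cols)

def hash_agg(rows, key_col, val_col):
    """Sum val_col per key_col, reading only the two columns it needs."""
    sums = defaultdict(int)
    for row in rows:
        sums[row[key_col]] += row[val_col]
    return dict(sums)

# Both plans give the same answer; the first one skips the per-row projection.
physical = hash_agg(scan_physical(TABLE), key_col=0, val_col=1)
projected = hash_agg(scan_projected(TABLE, cols=(0, 1)), key_col=0, val_col=1)
```

In this model the aggregate does its own column picking, so the extra
tuple built by scan_projected is pure overhead.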

If there's a Sort step, this isn't done, because Sort neither evaluates
its input nor projects new tuples on its output; it simply accepts the
tuples it receives and returns them with the same structure. So now it's
important to have the node providing input to the Sort projecting out
only the minimum required set of columns.
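
The Sort case can be sketched the same way. Again this is a toy model
with hypothetical names (sort_node, project, WIDE), not the executor's
actual code. It assumes only that the sort holds on to every input row
unchanged, so whatever width it is given is the width it carries.

```python
def sort_node(rows, key_col):
    """Buffer every input row unchanged and return them ordered by key_col."""
    buffered = list(rows)  # the sort keeps the tuples exactly as received
    buffered.sort(key=lambda r: r[key_col])
    return buffered

def project(rows, cols):
    """Input-side projection: keep only the listed columns."""
    return [tuple(r[c] for c in cols) for r in rows]

# Hypothetical wide input: 2 useful columns plus 20 padding columns per row.
WIDE = [(k, k * 10) + ("pad",) * 20 for k in (3, 1, 2)]

sorted_wide = sort_node(WIDE, key_col=0)
sorted_narrow = sort_node(project(WIDE, cols=(0, 1)), key_col=0)

# Total number of fields the sort had to hold in each case.
fields_wide = sum(len(r) for r in sorted_wide)
fields_narrow = sum(len(r) for r in sorted_narrow)
```

Here the sort never narrows anything itself, so projecting before it is
the only point at which the padding columns can be dropped.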

Why it's slower on the wider table... that's less obvious.

--
Andrew (irc:RhodiumToad)
