Re: GROUP BY on a large table -- an idea

From: Markus Schaber <schabi(at)logix-tt(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Dawid Kuroczko <qnex42(at)gmail(dot)com>
Subject: Re: GROUP BY on a large table -- an idea
Date: 2006-10-15 09:16:12
Message-ID: 4531FC5C.20703@logix-tt.com
Lists: pgsql-hackers

Hi, Dawid,

Dawid Kuroczko wrote:

> The hybrid approach means: sort as much as you can without spilling to
> disk, then aggregate and store aggregate state variables in a safe place
> (like a "tree" above), get more tuples from the table, sort them, update
> aggregate state variables, lather, rinse, repeat.

For this to work, the aggregate definition needs an additional function
that merges two states into one, for the "update aggregate state
variables" step.
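
To make the idea concrete, here is a rough sketch of the hybrid loop for
an avg()-like aggregate. Everything in it is an assumption made for
illustration: the names avg_init, avg_transition, avg_merge, avg_final
and hybrid_group_avg are hypothetical, batch_size merely stands in for
the memory limit, and a plain dict stands in for the "safe place". It is
not how the backend does or would do it.

```python
from itertools import groupby, islice
from operator import itemgetter

# Hypothetical aggregate definition for avg(): the state is (sum, count).
def avg_init():
    return (0.0, 0)

def avg_transition(state, value):
    # Ordinary per-tuple step: fold one input value into a state.
    total, count = state
    return (total + value, count + 1)

def avg_merge(left, right):
    # The additional function: combine two partial states into one.
    return (left[0] + right[0], left[1] + right[1])

def avg_final(state):
    total, count = state
    return total / count if count else None

def hybrid_group_avg(rows, batch_size):
    """rows is an iterable of (key, value) pairs."""
    states = {}  # stand-in for the "safe place" holding per-group states
    it = iter(rows)
    while True:
        # Sort only as many tuples as fit in the (assumed) memory budget.
        batch = sorted(islice(it, batch_size), key=itemgetter(0))
        if not batch:
            break
        for key, group in groupby(batch, key=itemgetter(0)):
            partial = avg_init()
            for _, value in group:
                partial = avg_transition(partial, value)
            # Without avg_merge there is no way to fold this batch's
            # partial state into the state stored by earlier batches.
            if key in states:
                states[key] = avg_merge(states[key], partial)
            else:
                states[key] = partial
    return {key: avg_final(state) for key, state in states.items()}
```

The transition function alone is not enough here, because each batch
produces a finished partial state rather than raw input values.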

Recently, there was some discussion that the Bizgres MPP people already
have such a function for merging states of different backend processes,
and that the query planner could benefit from such a function, e.g. in
the case of UNION or table partitioning.

Maybe we should come up with an exact definition of the syntax and
semantics of this function that satisfies the needs of all three use
cases above?

Thanks,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org
