Re: Spilling hashed SetOps and aggregates to disk

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: David Fetter <david(at)fetter(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: Spilling hashed SetOps and aggregates to disk
Date: 2018-06-06 13:58:16
Message-ID: 91cf538c-b724-734d-6f52-55a3c4bc7638@2ndquadrant.com
Lists: pgsql-hackers

On 06/05/2018 07:39 PM, David Fetter wrote:
> On Tue, Jun 05, 2018 at 01:27:01PM -0400, Tom Lane wrote:
>> David Fetter <david(at)fetter(dot)org> writes:
>>> On Tue, Jun 05, 2018 at 02:56:23PM +1200, David Rowley wrote:
>>>> True. Although not all built in aggregates have those defined.
>>
>>> Just out of curiosity, which ones don't? As of
>>> 3f85c62d9e825eedd1315d249ef1ad793ca78ed4, pg_aggregate has both of
>>> those as NOT NULL.
>>
>> NOT NULL isn't too relevant; that's just protecting the fixed-width
>> nature of the catalog rows. What's important is which ones are zero.
>
> Thanks for helping me understand this better.
>
>> # select aggfnoid::regprocedure, aggkind from pg_aggregate where (aggserialfn=0 or aggdeserialfn=0) and aggtranstype = 'internal'::regtype;
>> aggfnoid | aggkind
>> ------------------------------------------------------+---------
>> [snip]
>> (19 rows)
>>
>> Probably the ordered-set/hypothetical ones aren't relevant for this
>> issue.
>>
>> Whether or not we feel like fixing the above "normal" aggs for this,
>> the patch would have to not fail on extension aggregates that don't
>> support serialization.
>
> Could there be some kind of default serialization with reasonable
> properties?
>

Not really, because aggregates often use "internal" as the transition
type, i.e. a pointer referencing whothehellknowswhat, and how do you
serialize/deserialize that? The other issue is that serialize/deserialize
is only part of the problem - you also need to know how to do "combine",
and not all aggregates can do that ... (certainly not in a universal way).
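
To make that concrete, here is a rough sketch in Python (not PostgreSQL
code; every name in it is made up for illustration) of what spilling a
partial aggregate state would need for something simple like avg(). The
state has to be serialized when written out, deserialized when read back,
and then combined with the other partial states for the same group. It
works here only because the state is a known (count, sum) pair; with an
opaque "internal" pointer there is no generic way to write either step.

```python
import struct
from dataclasses import dataclass


@dataclass
class AvgState:
    """Hypothetical transition state for avg(): running count and sum."""
    count: int = 0
    total: float = 0.0


def avg_trans(state: AvgState, value: float) -> AvgState:
    # Transition step: fold one input value into the state.
    return AvgState(state.count + 1, state.total + value)


def avg_combine(a: AvgState, b: AvgState) -> AvgState:
    # Combine step: merge two partial states for the same group.
    return AvgState(a.count + b.count, a.total + b.total)


def avg_serialize(state: AvgState) -> bytes:
    # Possible only because the layout of the state is known.
    return struct.pack("<qd", state.count, state.total)


def avg_deserialize(buf: bytes) -> AvgState:
    count, total = struct.unpack("<qd", buf)
    return AvgState(count, total)


def avg_final(state: AvgState):
    # Final step: None stands in for SQL NULL on empty input.
    return state.total / state.count if state.count else None


def spill_and_merge(batches):
    # Each batch is aggregated separately and its state "spilled"
    # (the list of byte strings stands in for a temp file on disk).
    spilled = []
    for batch in batches:
        state = AvgState()
        for value in batch:
            state = avg_trans(state, value)
        spilled.append(avg_serialize(state))

    # Reading back requires deserialize AND combine.
    merged = AvgState()
    for buf in spilled:
        merged = avg_combine(merged, avg_deserialize(buf))
    return avg_final(merged)
```

Take away any one of avg_serialize, avg_deserialize or avg_combine and
spill_and_merge() cannot be written, which is the situation for the
aggregates in the list above.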

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
