Re: About Custom Aggregates, C Extensions and Memory

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marthin Laubscher <postgres(at)lobeshare(dot)co(dot)za>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: About Custom Aggregates, C Extensions and Memory
Date: 2025-08-15 15:35:06
Message-ID: 608868.1755272106@sss.pgh.pa.us
Lists: pgsql-hackers

Marthin Laubscher <postgres(at)lobeshare(dot)co(dot)za> writes:
> A custom aggregate seems the best vehicle for what I seek to implement. Given the processing involved, it’s probably best written in C.
> That makes the aggregate an opaque value, encoded and compressed into an internal format that allows direct equality testing and comparison. For everything else it needs to be decoded into memory, worked on, and then re-encoded into a value the database ecosystem expects.
> The challenge is that decoding and encoding carry a massive overhead (easily two orders of magnitude or more) compared to the lightning-fast operations that, e.g., add or remove a value from the aggregate while it is in memory, which kills performance and limits potential.

Yeah. What you want is to declare the aggregate as having transtype
"internal" (which basically means that ExecAgg will store a pointer
for you) and make that pointer point to a data structure kept in the
"aggcontext", which will have a suitable lifespan. json_agg() might
be a suitable example to look at. Keep in mind that the finalfn
mustn't modify the stored state, as there are optimizations where
it'll be applied more than once.
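
A rough standalone sketch of that shape, in plain C with no PostgreSQL
headers so it can be compiled on its own. Every name in it (AggArena,
DecodedState, arena_alloc, trans_step, final_step) is hypothetical and
only models the pattern. In a real extension the arena would be the
memory context reported by AggCheckCallContext(), the allocation would
go through MemoryContextAlloc(), and the pointer would travel between
calls as the "internal" transition value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the "aggcontext": whatever is allocated
 * from it stays valid until the whole aggregation is finished. */
typedef struct AggArena
{
    void   *blocks[16];
    size_t  nblocks;
} AggArena;

/* Hypothetical decoded working state. A real aggregate would keep its
 * own in-memory structure here; a count and a sum are placeholders. */
typedef struct DecodedState
{
    int64_t count;
    int64_t sum;
} DecodedState;

/* Allocate zeroed memory that belongs to the arena. */
static void *
arena_alloc(AggArena *arena, size_t size)
{
    void *p;

    assert(arena != NULL && arena->nblocks < 16);
    p = calloc(1, size);
    if (p != NULL)
        arena->blocks[arena->nblocks++] = p;
    return p;
}

/* Release everything at once, as happens when the aggregation ends. */
static void
arena_reset(AggArena *arena)
{
    for (size_t i = 0; i < arena->nblocks; i++)
        free(arena->blocks[i]);
    arena->nblocks = 0;
}

/* Transition step: create the state on the first call, then update it
 * in place and hand back the same pointer. Nothing is decoded or
 * re-encoded per input row. */
static DecodedState *
trans_step(AggArena *arena, DecodedState *state, int64_t value)
{
    if (state == NULL)
    {
        state = (DecodedState *) arena_alloc(arena, sizeof(DecodedState));
        if (state == NULL)
            return NULL;
    }
    state->count += 1;
    state->sum += value;
    return state;
}

/* Final step: takes the state as const, so it can be applied more than
 * once to the same state and give the same answer each time. */
static int64_t
final_step(const DecodedState *state)
{
    return (state == NULL) ? 0 : state->sum;
}
```

The const parameter on final_step is the point of the exercise: any
encoding into the external format should build a fresh result from the
state rather than consuming or rewriting it.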

regards, tom lane
