Re: Question about using AggCheckCallContext in a C function

From: Matt Solnit <msolnit(at)soasta(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Question about using AggCheckCallContext in a C function
Date: 2013-08-14 04:44:07
Message-ID: EFE1DA20-A978-44D1-B56E-930F79578627@soasta.com
Lists: pgsql-general

On Aug 12, 2013, at 12:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Matt Solnit <msolnit(at)soasta(dot)com> writes:
>> 2. The function seems to work consistently when I do a SELECT
>> SUM(mycol) without any GROUP BY. It's only when I add grouping that
>> the failures happen. I'm not sure if this is a real clue or a red
>> herring.
>
> That isn't enormously surprising, since the memory management for
> the transition values is different in the two cases.
>
>> Finally, can you tell me what precisely happens when you call
>> datumCopy() with ArrayType? If it's only returning a copy of
>> the TOAST reference, then how is it safe for the transition function
>> to modify the content? I'm probably *completely* misunderstanding
>> how this works, so I would love to be enlightened :-).
>
> You're right, datumCopy() won't expand a TOAST reference. What does
> expand it is PG_GETARG_ARRAYTYPE_P(). So if you have a case where the
> system picks up a copy of an array input that happens to be toasted,
> it's the GETARG step in the next invocation of the aggregate transition
> function that expands the TOAST reference, and then after that you have an
> in-memory copy that's safe to modify. Maybe you're missing that somehow?
> The code fragment you showed looked okay but ...
>
> regards, tom lane

I think I figured it out. The problem is this line:

Datum *arrayData1, *arrayData2;

Datum* was correct when I first started this journey, using deconstruct_array(),
which hands back an array of Datum values. It is incorrect when accessing the
array's content directly using ARR_DATA_PTR(), because that pointer refers to
the packed element data rather than to Datums. Changing these to int* fixes
the problem, at least on all the systems I've tried so far.

I've been wondering why the broken code worked without a GROUP BY,
and I think it was just dumb luck. With no GROUP BY, I was only
overrunning a single buffer, and maybe the effects were not
immediately apparent. With GROUP BY, however, there's a buffer
overrun for each group, and each one increases the chance of doing
something catastrophic.

Sincerely,
Matt Solnit
