Re: Function array_agg(array)

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Ali Akbar <the(dot)apaan(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Function array_agg(array)
Date: 2014-10-25 11:25:44
Message-ID: CAFj8pRA9UYZHL7p0_qP4RyC6xgcYJ7eNWhG61Bf2xc3PZm1-eQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2014-10-25 12:20 GMT+02:00 Ali Akbar <the(dot)apaan(at)gmail(dot)com>:

> 2014-10-25 15:43 GMT+07:00 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
>
>>
>>
>> 2014-10-25 10:16 GMT+02:00 Ali Akbar <the(dot)apaan(at)gmail(dot)com>:
>>
>>> makeArrayResult1 - I have no better name now
>>>>>
>>>>> I found one next minor detail.
>>>>>
>>>>> you reuse a array_agg_transfn function. Inside is a message
>>>>> "array_agg_transfn called in non-aggregate context". It is not correct for
>>>>> array_agg_anyarray_transfn
>>>>>
>>>>
>>> Fixed.
>>>
>>>
>>>> probably specification dim and lbs in array_agg_finalfn is useless,
>>>>> when you expect a natural result. A automatic similar to
>>>>> array_agg_anyarray_finalfn should work there too.
>>>>>
>>>>> makeMdArrayResultArray and appendArrayDatum should be marked as
>>>>> "static"
>>>>>
>>>>
>>> ok, marked all of that as static.
>>>
>>>
>>>> I am thinking so change of makeArrayResult is wrong. It is developed
>>>> for 1D array. When we expect somewhere ND result now, then we should to use
>>>> makeMdArrayResult there. So makeArrayResult should to return 1D array in
>>>> all cases. Else a difference between makeArrayResult and makeMdArrayResult
>>>> hasn't big change and we have a problem with naming.
>>>>
>>>
>>> I'm thinking it like this:
>>> - if we want to accumulate array normally, use accumArrayResult and
>>> makeArrayResult. If we accumulate scalar the result will be 1D array. if we
>>> accumulate array, the resulting dimension is incremented by 1.
>>> - if we want, somehow to affect the normal behavior, and change the
>>> dimensions, use makeMdArrayResult.
>>>
>>> Searching through the postgres' code, other than internal use in
>>> arrayfuncs.c, makeMdArrayResult is used only
>>> in src/pl/plperl/plperl.c:plperl_array_to_datum, while converting perl
>>> array to postgres array.
>>>
>>> So if somehow we will accumulate array other than in array_agg, i think
>>> the most natural way is using accumArrayResult and then makeArrayResult.
>>>
>>
>> ok, there is more variants and I can't to decide. But I am not satisfied
>> with this API. We do some wrong in structure. makeMdArrayResult is now
>> ugly.
>>
>
> One approach that i can think is we cleanly separate the structures and
> API. We don't touch existing ArrayBuildState, accumArrayResult,
> makeArrayResult and makeMdArrayResult, and we create new functions:
> ArrayBuildStateArray, accumArrayResultArray, makeArrayResultArray and
> makeArrayResultArrayCustom.
>

yes, I am thinking about separatate path, that will be joined in
constructMdArray

In reality, there are two different array builders - with different API and
better to respect it.

>
> But if we do that, the array(subselect) implementation will not be as
> simple as current patch. We must also change all the ARRAY_SUBLINK-related
> accumArrayResult call.
>
> Regarding current patch implementation, the specific typedefs are declared
> as:
> +typedef struct ArrayBuildStateScalar
> +{
> + ArrayBuildState astate;
> ...
> +} ArrayBuildStateArray;
>
> so it necessities rather ugly code like this:
> + result->elemtype = astate->astate.element_type;
>
> Can we use C11 feature of unnamed struct like this? :
>

no, what I know a C11 is prohibited

> +typedef struct ArrayBuildStateScalar
> +{
> + ArrayBuildState;
> ...
> +} ArrayBuildStateArray;
>
> so the code can be a little less ugly by doing it like this:
> + result->elemtype = astate->element_type;
>
> I don't know whether all currently supported compilers implements this
> feature..
>
>
> Can you suggest a better structure for this?
>
> If we can't settle on a better structure, i think i'll reimplement
> array_agg without the performance improvement, using deconstruct_array and
> unchanged accumArrayResult & makeMdArrayResult.
>

you can check it? We can test, how performance lost we get. As second
benefit we can get numbers for introduction new optimized array builder

Regards

Pavel

>
> Thanks,
> --
> Ali Akbar
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2014-10-25 12:22:19 Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Previous Message Alvaro Herrera 2014-10-25 11:01:21 Re: pg_background (and more parallelism infrastructure patches)