Re: count_nulls(VARIADIC "any")

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: count_nulls(VARIADIC "any")
Date: 2016-01-04 04:23:56
Message-ID: CAFj8pRBkMU=DNdBamyeKqVeydJMhGOmL0N1q-aEfHqbd_JPkwA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

2016-01-03 22:49 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:

> On 1/3/16 2:37 PM, Pavel Stehule wrote:
>
>> + /* num_nulls(VARIADIC NULL) is defined as NULL */
>> + if (PG_ARGISNULL(0))
>> + return false;
>>
>
> Could you add to the comment explaining why that's the desired behavior?
>

This case should be different than num_nulls(VARIADIC ARRAY[..]) - this
situation is really equivalent of missing data and NULL is correct answer.
It should not be too clean in num_nulls, but when it is cleaner for
num_notnulls. And more, it is consistent with other variadic functions in
Postgres: see concat_internal and text_format.

>
> + /*
>> + * Non-null argument had better be an array. We assume
>> that any call
>> + * context that could let get_fn_expr_variadic return
>> true will have
>> + * checked that a VARIADIC-labeled parameter actually is
>> an array. So
>> + * it should be okay to just Assert that it's an array
>> rather than
>> + * doing a full-fledged error check.
>> + */
>> +
>> Assert(OidIsValid(get_base_element_type(get_fn_expr_argtype(fcinfo->flinfo,
>> 0))));
>>
>
> Erm... is that really the way to verify that what you have is an array?
> ISTM there should be a macro for that somewhere...
>

really, it is. It is used more time. Although I am not against some macro,
I don't think so it is necessary. The macro should not be too shorter than
this text.

>
> + /* OK, safe to fetch the array value */
>> + arr = PG_GETARG_ARRAYTYPE_P(0);
>> +
>> + ndims = ARR_NDIM(arr);
>> + dims = ARR_DIMS(arr);
>> + nitems = ArrayGetNItems(ndims, dims);
>> +
>> + bitmap = ARR_NULLBITMAP(arr);
>> + if (bitmap)
>> + {
>> + bitmask = 1;
>> +
>> + for (i = 0; i < nitems; i++)
>> + {
>> + if ((*bitmap & bitmask) == 0)
>> + count++;
>> +
>> + bitmask <<= 1;
>> + if (bitmask == 0x100)
>> + {
>> + bitmap++;
>> + bitmask = 1;
>>
>
> For brevity and example sake it'd probably be better to just use the
> normal iterator, unless there's a serious speed difference?
>

The iterator does some memory allocations and some access to type cache.
Almost all work of iterator is useless for this case. This code is
developed by Marko, but I agree with this design. Using the iterator is big
gun for this case. I didn't any performance checks, but it should be
measurable for any varlena arrays.

Regards

Pavel

> In the unit test, I'd personally prefer just building a table with the
> test cases and the expected NULL/NOT NULL results, at least for all the
> calls that would fit that paradigm. That should significantly reduce the
> size of the test. Not a huge deal though...
>
> Also, I don't think anything is testing multiples of whatever value... how
> 'bout change the generate_series CASE statement to >40 instead of <>40?
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-01-04 04:25:04 Re: 9.5 BLOCKER: regrole and regnamespace and quotes
Previous Message Jim Nasby 2016-01-04 04:20:32 Re: 9.5 BLOCKER: regrole and regnamespace and quotes