Re: Built-in binning functions

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Built-in binning functions
Date: 2014-07-16 08:04:13
Message-ID: 53C631FD.1020004@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 08/07/14 02:14, Tom Lane wrote:
> Petr Jelinek <petr(at)2ndquadrant(dot)com> writes:
>> here is a patch implementing varwidth_bucket (naming is up for
>> discussion) function which does binning with variable bucket width.
>
> I didn't see any discussion of the naming question in this thread.
> I'd like to propose that it should be just "width_bucket()"; we can
> easily determine which function is meant, considering that the
> SQL-spec variants don't take arrays and don't even have the same
> number of actual arguments.

I did mention in submission that the names are up for discussion, I am
all for naming it just width_bucket.

>
> So given plain integer arguments, we'll invoke the float8 version not the
> int8 version, which may be undesirable. The same for int2 arguments.
> We could fix that by adding bespoke int4 and maybe int2 variants, but

Hmm, yeah I don't love the idea of having 3 functions just to cover
integer datatypes :(.

> TBH, I'm not sure that the specific-type functions are worth the trouble.
> Maybe we should just have one generic function, and take the trouble to
> optimize its array-subscripting calculations for either fixed-length or
> variable-length array elements? Have you got performance measurements
> demonstrating that multiple implementations really buy enough to justify
> the extra code?

The performance difference is about 20% (+/- few depending on the array
size), I don't know if that's bad enough to warrant type-specific
implementation. I personally don't know how to make the generic
implementation much faster than it is now, except maybe by turning it
into aggregate which would cache the deconstructed version of the array,
but that change semantics quite a bit and is probably not all that
desirable.

>
> Also, I'm not convinced by this business of throwing an error for a
> NULL array element. Per spec, null arguments to width_bucket()
> produce a null result, not an error --- shouldn't this flavor act
> similarly? In any case I think the test needs to use
> array_contains_nulls() not just ARR_HASNULL.

I am not against returning NULL for NULL array, I would still like to
error on array that has values + NULL somewhere in the middle though.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2014-07-16 08:04:59 [BUG?] tuples from file_fdw has strange xids.
Previous Message Andres Freund 2014-07-16 08:00:32 Re: Deadlocks in HS (on 9.0 :( )