Re: Weighted Stats

From: David Fetter <david(at)fetter(dot)org>
To: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Weighted Stats
Date: 2016-03-15 15:36:50
Message-ID: 20160315153650.GC26662@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 08, 2016 at 04:37:36PM +1100, Haribabu Kommi wrote:
> On Mon, Dec 21, 2015 at 1:50 PM, David Fetter <david(at)fetter(dot)org> wrote:
> > On Sun, Dec 20, 2015 at 06:13:33PM -0600, Jim Nasby wrote:
> >> On 11/2/15 5:46 PM, David Fetter wrote:
> >> >I'd like to add weighted statistics to PostgreSQL
> >>
> >> Anything happen with this? If community isn't interested, ISTM it'd be good
> >> to put this in PGXN.
> >
> > I think it's already in PGXN as an extension, and I'll get another
> > version out this early this week, as it involves mostly adding some
> > tests.
> >
> > I'll do the float8 ones for core this week, too, and unless there's a
> > really great reason to do more data types on the first pass, it should
> > be in committable shape.
>
> I reviewed the patch, following are my observations.
>
> 1. + precision</type>, <type>numeric</type>, or <type>interval</type>
>
> with interval type it is giving problem. As interval data type is not supported,
> so remove it in the list of supported inputs.

Done.

>
> 2. +float8_weighted_avg(PG_FUNCTION_ARGS)
>
> It will be helpful, if you provide some information as a function header,
> how the weighted average is calculated similar like other weighted functions.

Done.

> 3. + transvalues = check_float8_array(transarray,
> "float8_weighted_stddev_accum", 4);
>
> The second parameter to check_float8_array should be "float8_weighted_accum".

Done.

> 4. There is an OID conflict of 4066 with latest master code.

Fixed.

> 5.+ A += newvalW * ( newvalX - transvalues[2] ) / W;
> + CHECKFLOATVAL(A, isinf(newvalW) || isinf(newvalX - transvalues[2])
> || isinf(1.0/W), true);
>
> + Q += newvalW * (newvalX - transvalues[2]) * (newvalX - A);
> + CHECKFLOATVAL(A, isinf(newvalX - transvalues[3]) || isinf(newvalX -
> A) || isinf(1.0/W), true);
>
>
> Is the need of calculation also needs to be passed to CHECKFLOATVAL?
> Just passing
> the variables involved in the calculation isn't enough? If expressions
> are required then
> it should be something as follows?
>
> CHECKFLOATVAL(A, isinf(transvalues[2]) || isinf(newvalW) ||
> isinf(newvalX - transvalues[2]) || isinf(1.0/W), true);
>
> CHECKFLOATVAL(Q, isinf(transvalues[3]) || isinf(newvalX -
> transvalues[2]) || isinf(newvalX - A) || isinf(1.0/W), true);

Done.

Please find attached a patch that uses the float8 version to cover the
numeric types.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
weighted_stats_002.diff text/plain 18.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2016-03-15 15:43:27 Re: [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check
Previous Message Mark Dilger 2016-03-15 15:35:56 Re: [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check