Re: [HACKERS] Standard Deviation function.

From: dg(at)illustra(dot)com (David Gould)
To: stuporg(at)erols(dot)com (Stupor Genius)
Cc: pgsql-hackers(at)postgreSQL(dot)org, maillist(at)candle(dot)pha(dot)pa(dot)us
Subject: Re: [HACKERS] Standard Deviation function.
Date: 1998-06-05 08:15:46
Message-ID: 9806050815.AA04919@hawk.illustra.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> > > > I really need a Standard Deviation aggregate function...
> > >
> > > I thought about this a long time ago and had an idea but never
> > > got around to trying to implement it. I was going to have some
> > > functions that worked on a structure of two doubles to track
> > > the sum and square instead of using only one simple type.
> >
> > I remember talking about this to someone, and the problem is that you
> > needed the average WHILE scanning through the table, which required two
> > passes, which the aggregate system is not designed to do. I may be
> > wrong on this, though.
>
> I had asked you how to calculate this and the variance early last
> year. One (I think the variance) was two-pass because of the need
> for the average, but I thought the StdDev would work with the struct.
>
> Been a while and I still haven't configured #(*&^ FreeBSD ppp yet.

The Perl Module "Statistics/Descriptive" has on the fly variance calculation.

sub add_data {
my $self = shift; ##Myself
my $oldmean;
my ($min,$mindex,$max,$maxdex);

##Take care of appending to an existing data set
$min = (defined ($self->{min}) ? $self->{min} : $_[0]);
$max = (defined ($self->{max}) ? $self->{max} : $_[0]);
$maxdex = $self->{maxdex} || 0;
$mindex = $self->{mindex} || 0;

##Calculate new mean, pseudo-variance, min and max;
foreach (@_) {
$oldmean = $self->{mean};
$self->{sum} += $_;
$self->{count}++;
if ($_ >= $max) {
$max = $_;
$maxdex = $self->{count}-1;
}
if ($_ <= $min) {
$min = $_;
$mindex = $self->{count}-1;
}
$self->{mean} += ($_ - $oldmean) / $self->{count};
$self->{pseudo_variance} += ($_ - $oldmean) * ($_ - $self->{mean});
}

$self->{min} = $min;
$self->{mindex} = $mindex;
$self->{max} = $max;
$self->{maxdex} = $maxdex;
$self->{sample_range} = $self->{max} - $self->{min};
if ($self->{count} > 1) {
$self->{variance} = $self->{pseudo_variance} / ($self->{count} -1);
$self->{standard_deviation} = sqrt( $self->{variance});
}
return 1;
}

-dg

David Gould dg(at)illustra(dot)com 510.628.3783 or 510.305.9468
Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612
"Don't worry about people stealing your ideas. If your ideas are any
good, you'll have to ram them down people's throats." -- Howard Aiken

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Maarten Boekhold 1998-06-05 08:36:26 Re: [GENERAL] Re: [HACKERS] NEW POSTGRESQL LOGOS
Previous Message Andreas Zeugswetter 1998-06-05 08:07:40 AW: [HACKERS] Standard Deviation function.