Skip site navigation (1) Skip section navigation (2)

Re: [HACKERS] Standard Deviation function.

From: dg(at)illustra(dot)com (David Gould)
To: stuporg(at)erols(dot)com (Stupor Genius)
Cc: pgsql-hackers(at)postgreSQL(dot)org, maillist(at)candle(dot)pha(dot)pa(dot)us
Subject: Re: [HACKERS] Standard Deviation function.
Date: 1998-06-05 08:15:46
Message-ID: 9806050815.AA04919@hawk.illustra.com (view raw or flat)
Thread:
Lists: pgsql-hackers
> 
> > > > I really need a Standard Deviation aggregate function...
> > > 
> > > I thought about this a long time ago and had an idea but never
> > > got around to trying to implement it.  I was going to have some
> > > functions that worked on a structure of two doubles to track
> > > the sum and square instead of using only one simple type.
> > 
> > I remember talking about this to someone, and the problem is that you
> > needed the average WHILE scanning through the table, which required two
> > passes, which the aggregate system is not designed to do.  I may be
> > wrong on this, though.
> 
> I had asked you how to calculate this and the variance early last
> year.  One (I think the variance) was two-pass because of the need
> for the average, but I thought the StdDev would work with the struct.
> 
> Been a while and I still haven't configured #(*&^ FreeBSD ppp yet.

The Perl Module "Statistics/Descriptive" has on the fly variance calculation.

sub add_data {
  my $self = shift;  ##Myself
  my $oldmean;
  my ($min,$mindex,$max,$maxdex);

  ##Take care of appending to an existing data set
  $min    = (defined ($self->{min}) ? $self->{min} : $_[0]);
  $max    = (defined ($self->{max}) ? $self->{max} : $_[0]);
  $maxdex = $self->{maxdex} || 0;
  $mindex = $self->{mindex} || 0;

  ##Calculate new mean, pseudo-variance, min and max;
  foreach (@_) {
    $oldmean = $self->{mean};
    $self->{sum} += $_;
    $self->{count}++;
    if ($_ >= $max) {
      $max = $_;
      $maxdex = $self->{count}-1;
    }
    if ($_ <= $min) {
      $min = $_;
      $mindex = $self->{count}-1;
    }
    $self->{mean} += ($_ - $oldmean) / $self->{count};
    $self->{pseudo_variance} += ($_ - $oldmean) * ($_ - $self->{mean});
  }

  $self->{min}          = $min;
  $self->{mindex}       = $mindex;
  $self->{max}          = $max;
  $self->{maxdex}       = $maxdex;
  $self->{sample_range} = $self->{max} - $self->{min};
  if ($self->{count} > 1) {
    $self->{variance}     = $self->{pseudo_variance} / ($self->{count} -1);
    $self->{standard_deviation}  = sqrt( $self->{variance});
  }
  return 1;
}


-dg

David Gould            dg(at)illustra(dot)com           510.628.3783 or 510.305.9468 
Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
"Don't worry about people stealing your ideas.  If your ideas are any
 good, you'll have to ram them down people's throats." -- Howard Aiken

In response to

pgsql-hackers by date

Next:From: Maarten BoekholdDate: 1998-06-05 08:36:26
Subject: Re: [GENERAL] Re: [HACKERS] NEW POSTGRESQL LOGOS
Previous:From: Andreas ZeugswetterDate: 1998-06-05 08:07:40
Subject: AW: [HACKERS] Standard Deviation function.

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group