Re: how to identify outliers

From: John R Pierce <pierce(at)hogranch(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: how to identify outliers
Date: 2009-10-27 23:58:23
Message-ID: 4AE7891F.8080402@hogranch.com
Lists: pgsql-general

Rhys A.D. Stewart wrote:
> Hey all,
> I have the following table: data(pnum text, distance float8, route text).
> I would like to remove the outliers in distance, i.e. lets say i get
> the avg dist of pnum for each route and the std deviation of the
> distance what is the best way to identify the outliers?
>

I dunno. Brute force, two passes: one to identify the MIN and MAX of
the values, then another SELECT avg(value) .... WHERE (....) AND value
!= minval AND value != maxval.
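
A minimal sketch of that two-pass idea, for anyone who wants to try it.
It uses Python's sqlite3 module with an in-memory table purely so that it
runs standalone; the table layout follows the one in the question, while
the sample rows and the choice of route 'A' are made up for illustration.
The two queries are plain SQL and should carry over to PostgreSQL more or
less unchanged.

```python
import sqlite3

# In-memory stand-in for the table from the question:
# data(pnum text, distance float8, route text). Sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (pnum TEXT, distance REAL, route TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [
        ("p1", 10.0, "A"),
        ("p2", 11.0, "A"),
        ("p3", 12.0, "A"),
        ("p4", 95.0, "A"),  # suspiciously large
        ("p5", 1.0, "A"),   # suspiciously small
    ],
)

# Pass 1: find the extremes for one route.
minval, maxval = conn.execute(
    "SELECT min(distance), max(distance) FROM data WHERE route = ?",
    ("A",),
).fetchone()

# Pass 2: average everything that is not equal to either extreme.
(trimmed_avg,) = conn.execute(
    "SELECT avg(distance) FROM data"
    " WHERE route = ? AND distance != ? AND distance != ?",
    ("A", minval, maxval),
).fetchone()

print(minval, maxval, trimmed_avg)
```

Note that this drops every row tied with the minimum or maximum, not
just one of each, and avg() comes back NULL if all the values are equal.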

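For large sets, you could probably do something with the standard
deviation instead, e.g. discard rows that lie more than some number of
standard deviations from the mean for their route. That should be more
accurate than just tossing the two extreme values.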
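
One way the standard-deviation idea might look, again as a rough sketch
rather than a tested recipe. It runs against an in-memory SQLite table
via Python's sqlite3 module so it can be tried standalone; the sample
rows and the cutoff K = 2 are assumptions picked for illustration.
SQLite has no stddev aggregate, so the variance is computed by hand as
avg(x*x) - avg(x)^2 and the comparison is done on squared values to avoid
needing sqrt. In PostgreSQL the built-in stddev() aggregate would be the
more natural (and numerically safer) choice.

```python
import sqlite3

K = 2.0  # assumed cutoff: flag rows more than K standard deviations out

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (pnum TEXT, distance REAL, route TEXT)")

# Invented sample data: route A has one wild value, route B has none.
rows = [(f"p{i}", d, "A") for i, d in enumerate([10, 11, 12, 9, 10, 11, 10, 95], 1)]
rows += [(f"q{i}", d, "B") for i, d in enumerate([20, 21, 22, 19], 1)]
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)

# Per-route mean and population variance in a subquery, then keep the
# rows whose squared deviation exceeds K^2 times the variance.
outliers = conn.execute(
    """
    SELECT d.pnum, d.route, d.distance
    FROM data AS d
    JOIN (
        SELECT route,
               avg(distance) AS dist_mean,
               avg(distance * distance)
                   - avg(distance) * avg(distance) AS dist_var
        FROM data
        GROUP BY route
    ) AS s ON s.route = d.route
    WHERE (d.distance - s.dist_mean) * (d.distance - s.dist_mean)
          > ? * ? * s.dist_var
    ORDER BY d.route, d.pnum
    """,
    (K, K),
).fetchall()

print(outliers)
```

Flipping the comparison to <= gives the rows to keep instead. Bear in
mind that a big outlier inflates the mean and standard deviation it is
measured against, so with small groups a fixed cutoff can miss things.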
