Re: Add min and max execute statement time in pg_stat_statement

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Add min and max execute statement time in pg_stat_statement
Date: 2015-01-20 23:54:15
Message-ID: 54BEEAA7.8020602@dunslane.net
Lists: pgsql-hackers


On 01/20/2015 06:32 PM, David G Johnston wrote:
> Andrew Dunstan wrote
>> On 01/20/2015 01:26 PM, Arne Scheffer wrote:
>>> And a very minor aspect:
>>> The term "standard deviation" in your code stands for
>>> (corrected) sample standard deviation, I think,
>>> because you divide by n-1 instead of n to keep the
>>> estimator unbiased.
>>> How about mentioning the prefix "sample"
>>> to indicate this being the estimator?
>>
>> I don't understand. I'm following pretty exactly the calculations stated
>> at <http://www.johndcook.com/blog/standard_deviation/>
>>
>>
>> I'm not a statistician. Perhaps others who are more literate in
>> statistics can comment on this paragraph.
> I'm largely in the same boat as Andrew but...
>
> I take it that Arne is referring to:
>
> http://en.wikipedia.org/wiki/Bessel's_correction
>
> but the mere presence of an (n-1) divisor does not mean that is what is
> happening. In this particular situation I believe the (n-1) simply is a
> necessary part of the recurrence formula and not any attempt to correct for
> sampling bias when estimating a population's variance. In fact, as far as
> the database knows, the values provided to this function do represent an
> entire population and such a correction would be unnecessary. I guess it
> boils down to whether "future" queries are considered part of the population
> or whether the population changes upon each query being run and thus we are
> calculating the ever-changing population variance. Note point 3 in the
> linked Wikipedia article.
>
>

Thanks. Still not quite sure what to do, though :-) I guess in the end
we want the answer to come up with similar results to the builtin stddev
SQL function. I'll try to set up a test program, to see if we do.
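
Such a test could look roughly like the sketch below. It is only an illustration in Python, not the pg_stat_statements code: the class name RunningStats, its methods, and the timing values are all made up for the example. It keeps a running mean and a running sum of squared deviations in the style of the recurrence on the page linked above, then derives the standard deviation with both divisors. The divisor by the running count inside the update step is part of the recurrence; the choice between n-1 (sample) and n (population) only appears at the very end. PostgreSQL's builtin stddev is an alias for stddev_samp, so the n-1 variant is the one expected to match it.

```python
import math
import statistics


class RunningStats:
    """Hypothetical helper: running mean and sum of squared deviations."""

    def __init__(self):
        self.n = 0             # number of values seen so far
        self.mean = 0.0        # running mean
        self.sum_sq_dev = 0.0  # running sum of squared deviations from the mean

    def push(self, x):
        # One step of the recurrence: update the mean first, then
        # accumulate (x - old_mean) * (x - new_mean).
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.sum_sq_dev += delta * (x - self.mean)

    def stddev_samp(self):
        # Sample standard deviation (divide by n-1, Bessel's correction).
        if self.n < 2:
            return None
        return math.sqrt(self.sum_sq_dev / (self.n - 1))

    def stddev_pop(self):
        # Population standard deviation (divide by n).
        if self.n < 1:
            return None
        return math.sqrt(self.sum_sq_dev / self.n)


# Made-up execution times, purely for illustration.
times = [1.2, 0.8, 3.5, 2.1, 0.9, 4.4]

rs = RunningStats()
for t in times:
    rs.push(t)

# Compare against the two-pass reference implementations.
print("sample:    ", rs.stddev_samp(), statistics.stdev(times))
print("population:", rs.stddev_pop(), statistics.pstdev(times))
```

If the running figures agree with the reference values, the remaining question is only which of the two divisors the view should report, and whether the column name or documentation should say "sample" or "population" accordingly.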

cheers

andrew
