Skip site navigation (1) Skip section navigation (2)

Re: LIMIT confuses the planner

From: Kouber Saparev <kouber(at)saparev(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: LIMIT confuses the planner
Date: 2009-02-24 17:08:51
Message-ID: 49A429A3.9010803@saparev.com (view raw or flat)
Thread:
Lists: pgsql-performance
Tom Lane wrote:
> Kouber Saparev <kouber(at)saparev(dot)com> writes:
>> Now the planner believes there're 910 rows, which is a bit closer to the 
>> real data:
> 
>> swing=# select avg(length) from (select username, count(*) as length 
>> from login_attempt group by username) as freq;
>>           avg
>> ----------------------
>>   491.6087310427555479
>> (1 row)
> 
> Hmph, that's still not real good.  Ideally it should be estimating
> *less* than the average frequency, because the estimate is made after
> excluding all the most-common-values, which evidently 'kouber' is not
> one of.  I suppose there's quite a large number of infrequently-seen
> usernames and the ndistinct estimate is much less than reality?  (Look
> at the pg_stats row for this column.)  It might be worth going all the
> way to stats target 1000 for this column.


I altered the statistics for that column to 1000, so now the planner 
assumes exactly 492 rows for the fore-mentioned query, which is indeed 
the average. It never went *less* than that value, it was always higher, 
i.e. for a statistics value of 600, it was 588, for 800, it became 540.

The current value of n_distinct (given statistics=1000) is:

db=# SELECT n_distinct FROM pg_stats WHERE tablename='login_attempt' AND 
attname='username';
  n_distinct
------------
        5605
(1 row)

db=# SELECT COUNT(DISTINCT username) FROM login_attempt;
  count
-------
  23391
(1 row)


In fact, what is n_distinct standing for, apart the famous formula:
n*d / (n - f1 + f1*n/N)

;-)

Regards,
-- 
Kouber Saparev
http://kouber.saparev.com

In response to

Responses

pgsql-performance by date

Next:From: Farhan HusainDate: 2009-02-24 19:51:38
Subject: Re: Abnormal performance difference between Postgres and MySQL
Previous:From: Ross J. ReedstromDate: 2009-02-24 17:02:01
Subject: Re: TCP network cost

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group