Re: Bad n_distinct estimation; hacks suggested?

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Marko Ristola <marko(dot)ristola(at)kolumbus(dot)fi>
Cc: pgsql-perform <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Bad n_distinct estimation; hacks suggested?
Date: 2005-04-22 20:36:08
Message-ID: 200504221336.08325.josh@agliodbs.com
Lists: pgsql-hackers, pgsql-performance
> > Solaris is unknown to me. Maybe the used random number generator there
> > isn't good enough?
>
> Hmmm.  Good point.  Will have to test on Linux.

Nope:

Linux 2.4.20:

test=# select tablename, attname, n_distinct from pg_stats where tablename = 'web_site_activity_fa';
      tablename       |       attname       | n_distinct
----------------------+---------------------+------------
 web_site_activity_fa | session_id          |     626127

test=# select count(distinct session_id) from web_site_activity_fa;
  count
---------
 3174813
(1 row)
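
To put a number on the gap between the two results above, here is a minimal Python sketch. The helper names (resolve_n_distinct, underestimate_factor) are made up for illustration and are not part of PostgreSQL. It relies on the documented pg_stats convention that a positive n_distinct is an absolute count, while a negative value is minus the fraction of rows that are distinct.

```python
def resolve_n_distinct(n_distinct, row_count):
    """Turn a pg_stats.n_distinct value into an absolute distinct count.

    A positive value is already an absolute estimate; a negative value is
    minus the fraction of rows that are distinct, so it must be scaled by
    the table's row count.
    """
    if n_distinct < 0:
        return -n_distinct * row_count
    return n_distinct


def underestimate_factor(estimated, actual):
    """Return how many times smaller the estimate is than the true count."""
    if estimated <= 0:
        raise ValueError("estimated distinct count must be positive")
    return actual / estimated


# Figures from the Linux 2.4.20 run above. n_distinct is positive here, so
# row_count is not used; 0 is only a placeholder.
estimated = resolve_n_distinct(626127, row_count=0)
actual = 3174813
factor = underestimate_factor(estimated, actual)
print(f"estimate is low by a factor of {factor:.2f}")
```

With the figures from this message, the estimate comes out low by a factor of roughly 5.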

... I think the problem is in our heuristic sampling code: the estimate is about a fifth of the actual count.  I'm not the first 
person to have this kind of problem.  Will be following up with tests ...
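
For readers who want to experiment with the sampling heuristic, below is a hedged Python sketch of the Haas-Stokes style estimator that, as far as I know, ANALYZE applies to its row sample: n*d / (n - f1 + f1*n/N), clamped to the range [d, N]. Treat the formula's use here as an assumption to be checked against the server source. The function name and the input numbers are hypothetical and do not come from the table in this message.

```python
def haas_stokes_estimate(n, d, f1, total_rows):
    """Estimate the number of distinct values in a table from a row sample.

    n          -- rows in the sample
    d          -- distinct values seen in the sample
    f1         -- values seen exactly once in the sample
    total_rows -- estimated rows in the whole table (N)

    Assumed formula: n*d / (n - f1 + f1*n/N), clamped to [d, N].
    """
    if n <= 0 or total_rows <= 0:
        raise ValueError("sample size and total row count must be positive")
    denominator = (n - f1) + f1 * n / total_rows
    estimate = n * d / denominator
    # The estimate cannot be below what was actually observed in the sample,
    # nor above the number of rows in the table.
    return max(float(d), min(estimate, float(total_rows)))


# Hypothetical sample: 3000 rows drawn from a 1,000,000-row table, with
# 2982 distinct values, 2964 of which appeared only once.
example = haas_stokes_estimate(n=3000, d=2982, f1=2964, total_rows=1_000_000)
print(round(example))
```

The result depends heavily on f1: a handful of extra repeated values in a small sample moves the estimate a long way, which is one reason to compare it against count(distinct ...) as done above.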

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco
