Re: Problem with n_distinct being consistently inaccurate.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: nickf(at)ontko(dot)com
Cc: "pgsql-admin" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Problem with n_distinct being consistently inaccurate.
Date: 2003-09-23 17:37:16
Message-ID: 20330.1064338636@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

"Nick Fankhauser" <nickf(at)ontko(dot)com> writes:
> So the question is- how can I get a better estimate of n_distinct from
> analyze?
> If I alter the stats target as high as it will go, I get closer, but it
> still shows the index to be about 1/2 as selective as it actually is:

AFAIK, estimating number of distinct values from a small sample is
inherently an ill-conditioned problem. You should probably be happy
it can get within a factor of 2 ;-).

You could try sticking the correct n_distinct into pg_statistic by hand
just to see if it really does change the plan, but I'd like to think
that getting within a factor of 2 is good enough. If it's not, then we
probably ought to look for ways to avoid using number-of-distinct-values
statistics altogether, because we'll seldom have a hard value for it.

regards, tom lane

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message pierre bach 2003-09-23 17:37:24 vacuum failed - pgtoast not btree
Previous Message Priya G 2003-09-23 17:16:48 Re: help needed!!!