Re: [HACKERS] Bad n_distinct estimation; hacks suggested?

From:	Greg Stark <gsstark(at)mit(dot)edu>
To:	"Dave Held" <dave(dot)held(at)arraysg(dot)com>
Cc:	"pgsql-perform" <pgsql-performance(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Bad n_distinct estimation; hacks suggested?
Date:	2005-04-27 17:16:48
Message-ID:	87ekcw3zsv.fsf@stark.xeocode.com
Lists:	pgsql-hackers pgsql-performance

"Dave Held" <dave(dot)held(at)arraysg(dot)com> writes:

> > Actually, it's more to characterize how large of a sample
> > we need. For example, if we sample 0.005 of disk pages, and
> > get an estimate, and then sample another 0.005 of disk pages
> > and get an estimate which is not even close to the first
> > estimate, then we have an idea that this is a table which
> > defies analysis based on small samples.
>
> I buy that.

Better yet, use the entire .01 sample you've gathered and then perform
analysis on that sample to see what the confidence interval is. That is
effectively the same as what you're proposing, except that it looks at every
possible partition of the sample rather than just one.
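
As a rough illustration of that idea, the sketch below splits one gathered sample
into random halves many times, estimates n_distinct on each half, and reports
the spread of those estimates. It approximates "every possible partition" with
random partitions. The estimator (the Haas-Stokes Duj1 formula), the function
names, the trial count, and the 95% interval are all assumptions chosen for
illustration; none of them come from this message.

```python
import random
from collections import Counter


def duj1_estimate(sample, total_rows):
    """Estimate n_distinct from a sample (assumed estimator: Haas-Stokes Duj1).

    n = sample size, N = total rows, d = distinct values in the sample,
    f1 = values seen exactly once in the sample.
    Estimate = n*d / (n - f1 + f1*n/N).
    """
    n = len(sample)
    if n == 0:
        return 0.0
    counts = Counter(sample)
    d = len(counts)
    f1 = sum(1 for c in counts.values() if c == 1)
    return n * d / (n - f1 + f1 * n / total_rows)


def split_half_spread(sample, total_rows, trials=200, seed=0):
    """Return a rough (low, high) range of estimates over random half-splits."""
    rng = random.Random(seed)
    half = len(sample) // 2
    estimates = []
    for _ in range(trials):
        shuffled = list(sample)
        rng.shuffle(shuffled)
        # Each random partition yields two half-sample estimates.
        estimates.append(duj1_estimate(shuffled[:half], total_rows))
        estimates.append(duj1_estimate(shuffled[half:], total_rows))
    estimates.sort()
    low = estimates[int(0.025 * len(estimates))]
    high = estimates[int(0.975 * len(estimates)) - 1]
    return low, high
```

A wide (low, high) range relative to the estimate itself would suggest that the
table "defies analysis based on small samples" in the sense quoted above. A
narrow range shows only that the half-sample estimates agree with each other,
not that they are close to the true value.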

Unfortunately, the reality according to the papers that were sent earlier is
that you will always find the results disappointing. Until your sample is
nearly the entire table, your estimates for n_distinct will be extremely
unreliable.
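
One way to explore this claim is a small simulation: draw samples of increasing
size from a synthetic table and compare the estimate with the true distinct
count. Everything here is an assumption for illustration: the skewed synthetic
data, the sampling fractions, the function names, and the choice of the
Haas-Stokes Duj1 estimator. It is not a reproduction of the papers' results.

```python
import random
from collections import Counter


def estimate_n_distinct(sample, total_rows):
    """Assumed estimator (Haas-Stokes Duj1): n*d / (n - f1 + f1*n/N)."""
    n = len(sample)
    counts = Counter(sample)
    d = len(counts)
    f1 = sum(1 for c in counts.values() if c == 1)
    return n * d / (n - f1 + f1 * n / total_rows)


def estimates_by_fraction(table, fractions, seed=0):
    """Return (fraction, estimate) pairs for row samples of each fraction."""
    rng = random.Random(seed)
    results = []
    for frac in fractions:
        n = max(1, int(len(table) * frac))
        sample = rng.sample(table, n)  # sampling without replacement
        results.append((frac, estimate_n_distinct(sample, len(table))))
    return results


# Hypothetical skewed table: a few very common values and a long tail of rare ones.
data_rng = random.Random(1)
table = [int(data_rng.paretovariate(1.0)) for _ in range(100_000)]
true_distinct = len(set(table))

for frac, est in estimates_by_fraction(table, [0.01, 0.1, 0.5, 1.0]):
    print(f"fraction={frac:<5} estimate={est:12.1f} true={true_distinct}")
```

With a fraction of 1.0 the formula reduces to the exact distinct count. How far
the smaller fractions drift from it depends heavily on how the data is
distributed.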

--
greg

In response to

Re: [HACKERS] Bad n_distinct estimation; hacks suggested? at 2005-04-27 15:47:36 from Dave Held

Responses

Re: [HACKERS] Bad n_distinct estimation; hacks suggested? at 2005-04-28 17:44:37 from Marko Ristola
