Quick Links

Re: Distinct-Sampling (Gibbons paper) for Postgres

From:	a3a18850(at)telus(dot)net
To:	pgsql-performance(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Cc:	Mischa(at)telus(dot)net, Sandberg(at)telus(dot)net
Subject:	Re: Distinct-Sampling (Gibbons paper) for Postgres
Date:	2005-04-29 05:10:18
Message-ID:	1114751418.4271c1ba12544@webmail.telus.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers pgsql-performance

Well, this guy has it nailed. He cites Flajolet and Martin, which was (I
thought) as good as you could get with only a reasonable amount of memory per
statistic. Unfortunately, their hash table is a one-shot deal; there's no way
to maintain it once the table changes. His incremental update doesn't degrade
as the table changes. If there isn't the same wrangle of patent as with the
ARC algorithm, and if the existing stats collector process can stand the extra
traffic, then this one is a winner.

Many thanks to the person who posted this reference in the first place; so
sorry I canned your posting and can't recall your name.

Now, if we can come up with something better than the ARC algorithm ...

In response to

Re: [HACKERS] Bad n_distinct estimation; hacks suggested? at 2005-04-27 13:43:31 from Andrew Dunstan

Responses

Re: Distinct-Sampling (Gibbons paper) for Postgres at 2005-04-29 05:22:51 from Josh Berkus

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Josh Berkus	2005-04-29 05:22:51	Re: Distinct-Sampling (Gibbons paper) for Postgres
Previous Message	Christopher Browne	2005-04-29 04:57:58	Re: Feature freeze date for 8.1

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Josh Berkus	2005-04-29 05:22:51	Re: Distinct-Sampling (Gibbons paper) for Postgres
Previous Message	Michael Fuhr	2005-04-29 03:28:34	Re: index on different types