Re: [PATCHES] Better default_statistics_target

From:	"Christopher Browne" <cbbrowne(at)gmail(dot)com>
To:	Decibel! <decibel(at)decibel(dot)org>
Cc:	"Guillaume Smet" <guillaume(dot)smet(at)gmail(dot)com>, "Greg Sabino Mullane" <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: [PATCHES] Better default_statistics_target
Date:	2008-01-28 23:14:05
Message-ID:	d6d6637f0801281514v18c5119cwa375774f100760d8@mail.gmail.com
Lists:	pgsql-hackers pgsql-patches

On Dec 6, 2007 6:28 PM, Decibel! <decibel(at)decibel(dot)org> wrote:
> FWIW, I've never seen anything but a performance increase or no change
> when going from 10 to 100. In most cases there's a noticeable
> improvement since it's common to have over 100k rows in a table, and
> there's just no way to capture any kind of a real picture of that with
> only 10 buckets.

I'd be more inclined to try to do something that was at least somewhat
data aware.

The "interesting theory" that I'd like to verify if I had a chance
would be to run through a by-column tuning using a set of heuristics.
My "first order approximation" would be:

- If a column defines a unique key, then we know there will be no
clustering of values, so no need to increase the count...

- If a column contains a datestamp, then the distribution of values is
likely to be temporal, so no need to increase the count...

- If a column has a highly constricted set of values (e.g., boolean),
then we might *decrease* the count.

- We might run a query that runs across the table, looking at
frequencies of values, and if it finds a lot of repeated values, we'd
increase the count.

That's a bit "hand-wavy," but it could lead to both increases and
decreases in the histogram sizes. Given that, the overall stats size
need not grow *enormously*, because we can hope that some columns
shrink.
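
The four heuristics above could be sketched roughly as follows. This is only an illustration under stated assumptions, not anything proposed in the thread: the ColumnSample type, the numeric cutoffs, and the lowered and raised targets are all hypothetical, and the sample of values is assumed to have been fetched already. The only PostgreSQL syntax relied on is ALTER TABLE ... ALTER COLUMN ... SET STATISTICS.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Hashable, Sequence

# All numbers below are assumptions made for illustration.
DEFAULT_TARGET = 10    # the default being discussed in this thread
RAISED_TARGET = 100    # borrowed from the 10 -> 100 comparison quoted above
LOWERED_TARGET = 2     # hypothetical "decreased" count
LOW_CARDINALITY = 3    # hypothetical cutoff for a "highly constricted" value set
REPEAT_FRACTION = 0.5  # hypothetical cutoff for "a lot of repeated values"


@dataclass
class ColumnSample:
    """Hypothetical description of one column plus a sample of its values."""
    table: str
    name: str
    is_unique: bool
    is_timestamp: bool
    values: Sequence[Hashable]


def suggest_target(col: ColumnSample) -> int:
    """Apply the four heuristics in the order the message lists them."""
    # Unique key or datestamp: no need to increase the count.
    if col.is_unique or col.is_timestamp:
        return DEFAULT_TARGET

    counts = Counter(col.values)
    if not counts:
        return DEFAULT_TARGET  # nothing sampled, so leave the column alone

    # Highly constricted set of values (e.g. boolean): decrease the count.
    if len(counts) <= LOW_CARDINALITY:
        return LOWERED_TARGET

    # Many repeated values: increase the count.
    repeated_rows = sum(n for n in counts.values() if n > 1)
    if repeated_rows / len(col.values) >= REPEAT_FRACTION:
        return RAISED_TARGET

    return DEFAULT_TARGET


def alter_statement(col: ColumnSample) -> str:
    """Render the per-column setting as SQL (identifiers are not quoted here)."""
    target = suggest_target(col)
    return f"ALTER TABLE {col.table} ALTER COLUMN {col.name} SET STATISTICS {target};"
```

The low-cardinality test runs before the repeated-values test because a boolean column consists almost entirely of repeated values and would otherwise be raised rather than lowered. Summing the suggested targets over all columns of a schema would give a rough sense of whether the overall statistics size grows or shrinks.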

--
http://linuxfinances.info/info/linuxdistributions.html
"The definition of insanity is doing the same thing over and over and
expecting different results." -- assortedly attributed to Albert
Einstein, Benjamin Franklin, Rita Mae Brown, and Rudyard Kipling

In response to

Re: [PATCHES] Better default_statistics_target at 2007-12-06 18:28:13 from Decibel!

Responses

Re: [PATCHES] Better default_statistics_target at 2008-01-30 22:58:48 from Decibel!
