Re: Simple postgresql.conf wizard

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Gregory Stark" <stark(at)enterprisedb(dot)com>, "Decibel!" <decibel(at)decibel(dot)org>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "Mark Wong" <markwkm(at)gmail(dot)com>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Simple postgresql.conf wizard
Date: 2008-11-26 01:18:59
Message-ID: D425483C2C5C9F49B5B7A41F89441547010012B4@postal.corporate.connx.com
Lists: pgsql-hackers

> -----Original Message-----
> From: Greg Stark [mailto:greg(dot)stark(at)enterprisedb(dot)com] On Behalf Of
> Gregory Stark
> Sent: Tuesday, November 25, 2008 5:06 PM
> To: Decibel!
> Cc: Tom Lane; Dann Corbit; Robert Haas; Bruce Momjian; Mark Wong;
> Heikki Linnakangas; Josh Berkus; Greg Smith; pgsql-
> hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] Simple postgresql.conf wizard

[snip]

> As Dann said, "the idea that there IS a magic number is the problem".
> *Any*
> value of default_stats_target will "cause" problems. Some columns will
> always
> have skewed data sets which require unusually large samples, but most
> won't
> and the system will run faster with a normal sample size for that
> majority.

No, it was somebody smarter than me who said that.

My idea was to create some kind of table showing curves for different
values, so that users would have some basis for choosing.
Of course, the guy who has 40 tables in his join, with an average of 7
indexes on each table (each containing millions of rows) and a
convoluted WHERE clause, will have different needs than someone with
simple queries and small data loads. The quality of the statistical
measures currently stored will also affect the intelligence of the
query preparation process, I am sure. I do guess that larger and more
expensive queries can probably benefit more from larger samples (this
principle is used in sorting, for instance, where the sample I collect
to estimate the median might grow as, say, the log of the data set
size).
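To illustrate the sorting analogy above, here is a minimal Python sketch of median estimation from a sample whose size grows as the log of the data set size. The function name and the `base` scaling constant are my own illustration, not taken from PostgreSQL or any sorting implementation:

```python
import math
import random

def estimate_median(data, base=64):
    """Estimate the median of `data` from a random sample whose size
    grows roughly as the log of the data set size (illustrative only)."""
    n = len(data)
    # Sample size: O(log n), capped at the data size itself.
    k = min(n, max(1, int(base * math.log2(n + 1))))
    sample = sorted(random.sample(data, k))
    return sample[k // 2]
```

With `base=64`, a 100,000-element data set yields a sample of roughly a thousand elements, while a million-element set only grows the sample by a couple hundred more; the point is that the sample grows far more slowly than the data.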

P.S.
I also do not believe there is any single value that will be the right
answer. But a table of data might be useful both for people who want
to experiment with altering the values and for those who want to set
the defaults. I guess that at one time such a table was generated to
produce the initial estimates for the default values.
[snip]
