Skip site navigation (1) Skip section navigation (2)

Re: planner stats

From: Neil Conway <neilc(at)samurai(dot)com>
To: Roman Neuhauser <neuhauser(at)chello(dot)cz>
Cc: pgsql-docs(at)postgresql(dot)org
Subject: Re: planner stats
Date: 2005-02-03 06:52:24
Message-ID: 1107413544.26960.74.camel@localhost.localdomain (view raw, whole thread or download thread mbox)
Lists: pgsql-docs
On Wed, 2005-02-02 at 11:02 +0100, Roman Neuhauser wrote:
> the SET STATISTICS description under ALTER TABLE doesn't explain how
> to choose the right target (is there a formula?)

There's no simple formula; in particular, it's not merely a matter of
raising the target for the columns of a large table. The more irregular
the distribution of the data in a particular column, the higher the
target should be (as the docs state, and as I've quoted below). I would
guess most people figure out how to set per-column stats by trial and
error, by looking at how accurate the planner's estimates are for
queries involving the column in question.

> it says "For more information on the use of statistics by the PostgreSQL
> query planner, refer to Section 13.2.", but that section *doesn't* talk
> about "the use of statistics by the query planner"; it doesn't hint
> a method to choose a good target either

I disagree:

        The amount of information stored in pg_statistic, in particular
        the maximum number of entries in the most_common_vals and
        histogram_bounds arrays for each column, can be set on a
        column-by-column basis using the ALTER TABLE SET STATISTICS
        command, or globally by setting the default_statistics_target
        configuration variable. The default limit is presently 10
        entries. Raising the limit may allow more accurate planner
        estimates to be made, particularly for columns with irregular
        data distributions, at the price of consuming more space in
        pg_statistic and slightly more time to compute the estimates.
        Conversely, a lower limit may be appropriate for columns with
        simple data distributions.

Suggestions for specific improvements are welcome (a patch against the
SGML is easiest).


In response to

pgsql-docs by date

Next:From: Mark KirkwoodDate: 2005-02-03 23:06:08
Subject: Instructions for Linux ipc config
Previous:From: Michael FuhrDate: 2005-02-03 06:38:17
Subject: Re: Typo in performance-tips.html

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group