Re: How to tell ANALYZE to collect statistics from the whole table?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: AlexK987 <alex(dot)cue(dot)987(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: How to tell ANALYZE to collect statistics from the whole table?
Date: 2015-01-26 16:51:00
Message-ID: CAMkU=1zzgwgBgb9b4K+yL0jxTuJ1Z4x=BTiRQbse1DdYisViqQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sat, Jan 24, 2015 at 9:14 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> AlexK987 <alex(dot)cue(dot)987(at)gmail(dot)com> writes:
> > The documentation states that "The extent of analysis can be controlled
> by
> > adjusting the default_statistics_target configuration variable". It looks
> > like I can tell Postgres to create more histograms with more bins, and
> more
> > distinct values. This implicitly means that Postgres will use a larger
> > random subset to calculate statistics.
>
> > However, this is not what I want. My data may be quite skewed, and I want
> > full control over the size of the sample. I want to explicitly tell
> Postgres
> > to analyze the whole table. How can I accomplish that?
>
> You can't, and you wouldn't want to if you could, because that would
> result in slurping the entire table into backend local memory. All
> the rows constituting the "random sample" are held in memory while
> doing the statistical calculations.
>
> In practice, the only stat that would be materially improved by taking
> enormously large samples would be the number-of-distinct-values estimate.
> There's already a way you can override ANALYZE's estimate of that number
> if you need to.
>

The accuracy of the list of most common values could also be improved a lot
by increasing the sample.

Cheers,

Jeff

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2015-01-26 20:31:45 Re: Why is PostgreSQL not using my index?
Previous Message Christian Roche 2015-01-26 16:32:22 Why is PostgreSQL not using my index?