Skip site navigation (1) Skip section navigation (2)

Re: Thoughts on statistics for continuously advancing columns

From: Greg Stark <gsstark(at)mit(dot)edu>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Josh Berkus <josh(at)agliodbs(dot)com>, Nathan Boley <npboley(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Thoughts on statistics for continuously advancing columns
Date: 2009-12-30 18:42:38
Message-ID: 407d949e0912301042w3e1c14b6s5247587c5fc07ba5@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-hackers
On Wed, Dec 30, 2009 at 4:31 PM, Joshua D. Drake <jd(at)commandprompt(dot)com> wrote:
> On the other hand ANALYZE also:
>
> 1. Uses lots of memory
> 2. Lots of processor
> 3. Can take a long time
>
> We normally don't notice because most sets won't incur a penalty. We got a
> customer who
> has a single table that is over 1TB in size... We notice. Granted that is
> the extreme
> but it would only take a quarter of that size (which is common) to start
> seeing issues.

I'm a bit puzzled by people's repeated suggestion here that large
tables take a long time to analyze. The sample analyze takes to
generate statistics is not heavily influenced by the size of the
table. Your 1TB table should take basically the same amount of time as
a 1GB table or a 1MB table (if it wasn't already in cache).

Unless the reason why it's 1TB is that the columns are extremely wide
rather than that it has a lot of rows? Or unless you've raised the
statistics target in (a misguided*) belief that larger tables require
larger statistics targets to achieve the same level of accuracy. Or
unless when you say "ANALYZE" you're really running "VACUUM ANALYZE".

[*] except for ndistinct estimates :(

-- 
greg

In response to

Responses

pgsql-hackers by date

Next:From: Simon RiggsDate: 2009-12-30 18:56:31
Subject: Re: Cancelling idle in transaction state
Previous:From: Tom LaneDate: 2009-12-30 18:24:16
Subject: Re: Thoughts on statistics for continuously advancing columns

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group