Re: How exactly does Analyze work?

From: Kevin Kempter <kevink(at)consistentstate(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: How exactly does Analyze work?
Date: 2009-11-25 15:22:01
Message-ID: 200911250822.01263.kevink@consistentstate.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wednesday 25 November 2009 05:34:26 Richard Neill wrote:
> Dear All,
>
> Thanks very much for your help so far. My understanding of PG is getting
> a lot better!
>
> I wonder if I've understood analyze properly: I'm not sure I quite
> understand how specific the statistics gathered actually are.
>
>
> In particular, what happens in the following case:
> 1. I start with have a table with 100 million rows, and column wid has
> linearly distributed values from 45-90. (wid is indexed)
>
> 2. I run vacuum analyze
>
> 3. I insert about 2 million rows, all of which have the new wid of 91.
>
> 4. I then do a select * WHERE wid = 91.
>
> How smart is analyze? Will it actually say "well, I've never seen 91 in
> this table, because all the values only go up to 90, so you'd better do
> a sequential scan"?
>
>
> -----
>
> On another note, I notice that if I ever manually run vacuum or analyze,
> the performance of the database drops to the point where many of the
> operators get kicked out. Is there any way to run them "nice" ?

increasing maintenance_work_mem to several GB (if you have the memory) will
help

>
> We need to maintain a response time of under 1 second all day for simple
> queries (which usually run in about 22ms). But Vacuum or Analyze seem to
> lock up the system for a few minutes, during which other queries block
> on them, although there is still plenty of CPU spare.
>
> -----
>
>
> Also, I find that, even with the autovacuum daemon running, there was
> one query last night that I had to terminate after an hour. In
> desperation, I restarted postgres, let it take 15 mins to vacuum the
> entire DB, and then re-ran the query (in 8 minutes)
>
> Any ideas how I can troubleshoot this better? The database is only 30GB
> in total - it should (if my intuition is right) be impossible that any
> simple select (even over a modestly complex view) should take longer
> than a multiple of the time required to read all the data from disk?
>
>
>
> Thanks very much,
>
> Richard
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2009-11-25 15:22:03 Re: How exactly does Analyze work?
Previous Message Robert Haas 2009-11-25 14:40:00 Re: Best possible way to insert and get returned ids