Skip site navigation (1) Skip section navigation (2)

Re: VACUUM ANALYZE is faster than ANALYZE?

From: Cédric Villemain <cedric(at)2ndquadrant(dot)fr>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Robert Haas <robertmhaas(at)gmail(dot)com>, Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>
Subject: Re: VACUUM ANALYZE is faster than ANALYZE?
Date: 2012-02-25 20:03:18
Message-ID: 201202252103.19158.cedric@2ndquadrant.fr (view raw or flat)
Thread:
Lists: pgsql-hackers
Le mercredi 22 février 2012 20:12:35, Pavel Stehule a écrit :
> 2012/2/22 Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>:
> > Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> >> usual pattern in our application is
> >> 
> >> create table xx1 as select ....
> >> analyze xx1
> >> create table xx2 as select .... from xx1, ....
> >> analyze xx2
> >> create table xx3 as select ... from xx3, ....
> >> analyze xx3
> >> create table xx4 as select ... from xx1, ...
> >> 
> >> tables xx** are use as cache.
> >> 
> >> so we have to refresh statistic early.
> >> 
> >> in this situation - and I found so in this case VACUUM ANALYZE is
> >> faster (30%) than ANALYZE. Size of xx** is usually between 500Kb
> >> and 8Kb
> >> 
> >> This is not usual pattern for OLTP - Application is strictly OLAP.
> > 
> > Is the VACUUM ANALYZE step faster, or is the overall job faster if
> > VACUUM ANALYZE is run?  You may be running into the need to rewrite
> > pages at an inopportune time or order without the VACUUM.  Have you
> > tried getting a time VACUUM FREEZE ANALYZE on these cache tables
> > instead of plain VACUUM ANALYZE?
> > 
> > -Kevin
> 
> vacuum freeze analyze is slower as expected. vacuum analyze is little
> bit faster or same in any step then analyze.
> 
> I expected so just analyze should be significantly faster and it is not.
> 
> Tom's demonstration is enough for me. ANALYZE doesn't read complete
> table, but uses random IO. VACUUM ANALYZE reads complete table, but it
> uses seq IO and vacuum is fast (because it does nothing) in our case.

VACUUM does read the 1st block to be sure readahead is done when ANALYSE does 
not.
For ANALYZE, maybe it is interesting to issue a read on the first block or use 
POSIX_FADVISE to (try) to force a readahead of the table when it is small 
enough (so ANALYSE can start working while blocks are read and put in cache).

That's being said, I am surprised that the pattern "create table...analyze 
create table analyze" of such smalls ones make the data being flush from OS 
cache so quickly that they need to be read again from disk.
Pavel, can you check the cache status of the tables just before the analyze ? 
(you can use OS tools or pgfincore extension for that)

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

In response to

pgsql-hackers by date

Next:From: Cédric VillemainDate: 2012-02-25 20:13:35
Subject: Re: WIP: URI connection string support for libpq
Previous:From: Cédric VillemainDate: 2012-02-25 19:37:01
Subject: Re: WIP: URI connection string support for libpq

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group