Re: VACUUM ANALYZE is faster than ANALYZE?

From: Cédric Villemain <cedric(at)2ndquadrant(dot)fr>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Robert Haas <robertmhaas(at)gmail(dot)com>, Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>
Subject: Re: VACUUM ANALYZE is faster than ANALYZE?
Date: 2012-02-25 20:03:18
Message-ID: 201202252103.19158.cedric@2ndquadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Le mercredi 22 février 2012 20:12:35, Pavel Stehule a écrit :
> 2012/2/22 Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>:
> > Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> >> usual pattern in our application is
> >>
> >> create table xx1 as select ....
> >> analyze xx1
> >> create table xx2 as select .... from xx1, ....
> >> analyze xx2
> >> create table xx3 as select ... from xx3, ....
> >> analyze xx3
> >> create table xx4 as select ... from xx1, ...
> >>
> >> tables xx** are use as cache.
> >>
> >> so we have to refresh statistic early.
> >>
> >> in this situation - and I found so in this case VACUUM ANALYZE is
> >> faster (30%) than ANALYZE. Size of xx** is usually between 500Kb
> >> and 8Kb
> >>
> >> This is not usual pattern for OLTP - Application is strictly OLAP.
> >
> > Is the VACUUM ANALYZE step faster, or is the overall job faster if
> > VACUUM ANALYZE is run? You may be running into the need to rewrite
> > pages at an inopportune time or order without the VACUUM. Have you
> > tried getting a time VACUUM FREEZE ANALYZE on these cache tables
> > instead of plain VACUUM ANALYZE?
> >
> > -Kevin
>
> vacuum freeze analyze is slower as expected. vacuum analyze is little
> bit faster or same in any step then analyze.
>
> I expected so just analyze should be significantly faster and it is not.
>
> Tom's demonstration is enough for me. ANALYZE doesn't read complete
> table, but uses random IO. VACUUM ANALYZE reads complete table, but it
> uses seq IO and vacuum is fast (because it does nothing) in our case.

VACUUM does read the 1st block to be sure readahead is done when ANALYSE does
not.
For ANALYZE, maybe it is interesting to issue a read on the first block or use
POSIX_FADVISE to (try) to force a readahead of the table when it is small
enough (so ANALYSE can start working while blocks are read and put in cache).

That's being said, I am surprised that the pattern "create table...analyze
create table analyze" of such smalls ones make the data being flush from OS
cache so quickly that they need to be read again from disk.
Pavel, can you check the cache status of the tables just before the analyze ?
(you can use OS tools or pgfincore extension for that)

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Cédric Villemain 2012-02-25 20:13:35 Re: WIP: URI connection string support for libpq
Previous Message Cédric Villemain 2012-02-25 19:37:01 Re: WIP: URI connection string support for libpq