Skip site navigation (1) Skip section navigation (2)

Re: VACUUM FREEZE output more than double input

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: VACUUM FREEZE output more than double input
Date: 2007-12-17 15:45:43
Message-ID: 47664546.EE98.0025.0@wicourts.gov (view raw or flat)
Thread:
Lists: pgsql-performance
>>> On Fri, Dec 14, 2007 at  5:59 PM, in message <15401(dot)1197676791(at)sss(dot)pgh(dot)pa(dot)us>,
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote: 
> "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
>> Why double writes per read, plus massive writes at checkpoint?
> 
> The double writes aren't surprising: freezing has to be WAL-logged,
> and the odds are that each page hasn't been touched since the last
> checkpoint, so the WAL log will include a complete page image.
> So in the steady state where all shared buffers are dirty, the
> per-page cycle is:
> 	* write out a dirty buffer so it can be reclaimed
> 	* read in a page
> 	* modify it to mark tuples frozen
> 	* write an image of the page to WAL
> 	* leave the now-dirty page in shared buffers for later writing
> 
> The checkpoint spikes would come from trying to flush out all the
> dirty buffers at once.
 
Got it.  Thanks.
 
> You'd expect a bit of a valley after each peak, since the vacuum
> could presumably recycle some buffers without having to flush 'em
> first; but I don't see one in your data.  That may just be because
> the numbers are too noisy, but I kinda suspect that the vacuum is
> dirtying buffers nearly as fast as the bgwriter can clean them,
> leaving not a lot of daylight for a valley.
 
Yeah, the pattern was pretty consistent and without valleys.
 
> 8.3 should pretty well eliminate the checkpoint spike in this scenario,
> because vacuum will work in a limited number of shared buffers instead
> of dirtying the whole cache.  But you'll still see 2X writes over reads.
 
Testing 8.3beta4 so far has shown both smoother I/O and better
performance in all respects.  The preliminary post I did where I
thought I saw some regression on loading a pg_dump turned out to be
was an "apples to oranges" comparison; comparing the same load on
the same hardware and OS, 8.3 wins.  (Kudos to all who worked on
these improvements!)
 
> If this is data that you could re-generate at need, it might make sense
> to turn off full_page_writes during the initial data load and vacuum.
 
Thanks for the suggestions; I'll try that.
 
> I concur with trying to FREEZE all the data while you do this, else
> you'll see the same work done whenever the data happens to slip past
> the auto freeze threshold.
 
Thanks.  I thought that made sense, but I'm still trying to get my
head around some of the dynamics of PostgreSQL and MVCC.  I'll
suggest that as policy here.
 
-Kevin
 


In response to

pgsql-performance by date

Next:From: Joshua D. DrakeDate: 2007-12-17 16:13:43
Subject: Re: viewing source code
Previous:From: Merlin MoncureDate: 2007-12-17 14:13:52
Subject: Re: viewing source code

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group