Skip site navigation (1) Skip section navigation (2)

Re: VACUUM FREEZE output more than double input

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: VACUUM FREEZE output more than double input
Date: 2007-12-17 15:45:43
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance
>>> On Fri, Dec 14, 2007 at  5:59 PM, in message <15401(dot)1197676791(at)sss(dot)pgh(dot)pa(dot)us>,
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote: 
> "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
>> Why double writes per read, plus massive writes at checkpoint?
> The double writes aren't surprising: freezing has to be WAL-logged,
> and the odds are that each page hasn't been touched since the last
> checkpoint, so the WAL log will include a complete page image.
> So in the steady state where all shared buffers are dirty, the
> per-page cycle is:
> 	* write out a dirty buffer so it can be reclaimed
> 	* read in a page
> 	* modify it to mark tuples frozen
> 	* write an image of the page to WAL
> 	* leave the now-dirty page in shared buffers for later writing
> The checkpoint spikes would come from trying to flush out all the
> dirty buffers at once.
Got it.  Thanks.
> You'd expect a bit of a valley after each peak, since the vacuum
> could presumably recycle some buffers without having to flush 'em
> first; but I don't see one in your data.  That may just be because
> the numbers are too noisy, but I kinda suspect that the vacuum is
> dirtying buffers nearly as fast as the bgwriter can clean them,
> leaving not a lot of daylight for a valley.
Yeah, the pattern was pretty consistent and without valleys.
> 8.3 should pretty well eliminate the checkpoint spike in this scenario,
> because vacuum will work in a limited number of shared buffers instead
> of dirtying the whole cache.  But you'll still see 2X writes over reads.
Testing 8.3beta4 so far has shown both smoother I/O and better
performance in all respects.  The preliminary post I did where I
thought I saw some regression on loading a pg_dump turned out to be
was an "apples to oranges" comparison; comparing the same load on
the same hardware and OS, 8.3 wins.  (Kudos to all who worked on
these improvements!)
> If this is data that you could re-generate at need, it might make sense
> to turn off full_page_writes during the initial data load and vacuum.
Thanks for the suggestions; I'll try that.
> I concur with trying to FREEZE all the data while you do this, else
> you'll see the same work done whenever the data happens to slip past
> the auto freeze threshold.
Thanks.  I thought that made sense, but I'm still trying to get my
head around some of the dynamics of PostgreSQL and MVCC.  I'll
suggest that as policy here.

In response to

pgsql-performance by date

Next:From: Joshua D. DrakeDate: 2007-12-17 16:13:43
Subject: Re: viewing source code
Previous:From: Merlin MoncureDate: 2007-12-17 14:13:52
Subject: Re: viewing source code

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group