Re: Decreasing WAL size effects

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Jason Long <mailing(dot)list(at)supernovasoftware(dot)com>, pgsql <pgsql-general(at)postgresql(dot)org>
Subject: Re: Decreasing WAL size effects
Date: 2008-10-30 18:52:59
Message-ID: Pine.GSO.4.64.0810301433010.29392@westnet.com
Lists: pgsql-general, pgsql-hackers

On Thu, 30 Oct 2008, Joshua D. Drake wrote:

>> This reminds me yet again that pg_clearxlogtail should probably get added
>> to the next commitfest for inclusion into 8.4; it's really essential for a
>> WAN-based PITR setup and it would be nice to include it with the
>> distribution.
>
> What is to be gained over just using rsync with -z?

When a new XLOG segment is created, it gets zeroed out first, so that 
there's no chance it can accidentally look like a valid segment.  But when 
an existing segment is recycled, it gets a new header and that's it--the 
rest of the 16MB is still left behind from whatever was in that segment 
before.  That means that even if you only write, say, 1MB of new data to a 
recycled segment before a timeout that causes you to ship it somewhere 
else, there will still be a full 15MB worth of junk from its previous life 
which may or may not be easy to compress.

I just noticed that this project was recently pushed onto pgfoundry; it's 
at 
http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/clearxlogtail/clearxlogtail/

What clearxlogtail does is look inside the WAL segment and clear the 
"tail" behind the portion that is really used.  So our example file 
would end up with just the 1MB of useful data, followed by 15MB of zeros 
that will compress massively.  Since it needs to know how XLogPageHeader 
is formatted, and a mistake would silently corrupt your archive history, 
it's kind of a scary utility to just download and use.  That's why I'd 
like to see it turn into a more official contrib module, so that it will 
never lose sync with the page header format and will be available to 
anyone using PITR.
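
To put rough numbers on the difference, here is a small Python sketch of 
the example above.  It is not how clearxlogtail works internally: it does 
not parse XLogPageHeader at all, and simply assumes the used length (1MB) 
is already known.  Random bytes stand in for both the new data and the 
leftover junk, which is a worst case for compression; real leftover WAL 
may compress better than that.  The function names are made up for this 
illustration.

```python
import os
import zlib

SEGMENT_SIZE = 16 * 1024 * 1024  # 16MB WAL segment, as in the example
USED_BYTES = 1 * 1024 * 1024     # assumed: 1MB written before the timeout


def make_recycled_segment(used=USED_BYTES, size=SEGMENT_SIZE):
    """Simulate a recycled segment: new data followed by leftover bytes.

    Random bytes are a stand-in for real WAL content (an assumption made
    only so the example is self-contained).
    """
    return os.urandom(used) + os.urandom(size - used)


def clear_tail(segment, used):
    """Return a copy of the segment with everything past `used` zeroed.

    The real utility has to work out `used` from the page headers; here
    it is simply passed in.
    """
    return segment[:used] + bytes(len(segment) - used)


def compressed_size(data):
    """Size in bytes after zlib compression at level 6."""
    return len(zlib.compress(data, 6))


if __name__ == "__main__":
    segment = make_recycled_segment()
    cleared = clear_tail(segment, USED_BYTES)
    print("recycled segment, compressed:", compressed_size(segment))
    print("cleared segment, compressed: ", compressed_size(cleared))
```

With incompressible junk in the tail, compression on its own barely 
shrinks the file, while the cleared copy shrinks to roughly the size of 
the 1MB that is actually in use.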

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
