Re: Deleting old archived WAL files

From: Chander Ganesan <chander(at)otg-nc(dot)com>
To: Jaume Sabater <jsabater(at)linuxsilo(dot)net>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Deleting old archived WAL files
Date: 2008-10-30 20:17:44
Message-ID: 490A1668.6000107@otg-nc.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Jaume Sabater wrote:
> Chander Ganesan wrote:
>
>
>> If you are running warm-standby, its presumable that your standby
>> server is "consuming" these files as they are being generated. In
>> such a case, you can set "log_restartpoints" in your recovery.conf
>> file, and use pg_standby with the '%r' (restartwalfile) parameter so
>> that it can "prune" old WAL files when necessary. In such a case,
>> you wouldn't need to do any pruning yourself, since pg_standby would
>> do it for you, when the standby server indicates that it is "safe" to
>> remove those old files.
>>
>> log_restartpoints='true' restore_command='pg_standby /archive_dir %f
>> %p %r'
>>
>> In short, your system that is in "recovery mode" can decide which
>> ones it needs to get rid of, once it knows it no longer needs them.
>>
>
> Thanks for your answer. I believe I did not explain myself correctly.
> pgpool-II, a middleware, is replicating all transactions into all the
> nodes (via sending the SQL statements to all nodes). And does load
> balancing of SELECT statements when possible.
>
Ahh...typically the term "warm standby" refers to a secondary server
that is not "hot" (i.e., queryable). I assumed that you meant you were
using pgpool-II for pooing and as a "switch" to move things over to the
other server.
> Therefore, the online recovery process can happen on what could be
> called the master node (the slave node failed) or in the slave node (the
> master node failed). In both cases the same happens:
>
> 1. pg_start_backup
> 2. rsync data dir
> 3. pg_stop_backup
> 4. Recovery process, scp'ing the necessary archived WAL files from the
> existing node to the being-recovered node.
>
So really, when you need to "recover", you can simply remove all the old
"wals" prior to issuing your pg_start_backup(), since they would not be
needed after your rsync.
> So, what I mean is that it's not a pgsql to pgsql warm standby setup
> (both pgsql nodes don't know about each other, and one is not feeding
> the other with WAL files so that the other can be kept updated).
>
Okay...
> In other words, let's imagine it's just a single pgsql installation,
> with no pgpool-II stuff, and I am archiving the WAL files because I
> fancy it, or as an backup system.
>
> But I don't want to keep archived WAL files forever, but just up to,
> let's say, a few gigabytes or something. Therefore I thought I would
> have to be doing pg_start_backup/pg_stop_backup "manually" and getting
> rid of the now old files myself, as explained in my previous mail.
>
Why? If you haven't saved a backup that you made previously (using
pg_start_backup()), having archived wal files is useless. I don't see
why you just don't arbitrarily remove them immediately.

WAL files are useful when you have a "starting point" (i.e., a backup
made using the PITR backup method), and a set of WAL files generated
"after" the starting point. If you don't preserve the start point (i.e.,
the PITR backup) then the WAL files are basically useless.

If you wanted to actually use them for something, you would issue a
pg_start_backup(), make a backup, and then preserve that backup (in case
you need to recover using the backup and subsequent WAL's at a later
date). In such a case, you could remove the old WAL's when you
discarded the old backup (hopefully, after making a new one).

I think what you want to do is something like this:

1. have an archive_command that looks something like this (we'll call
this 'archive.sh'):
if [ -f /var/lib/postgresql/pg_xlog_archive/regen_secondary.lock ]; then
/bin/cp $1 /var/lib/postgresql/pg_xlog_archive/$2
exit $?
fi
exit 0

Note: If you are using 8.3, you can even just disable the
archive_command when you aren't doing a recovery...

2. When you are rebuilding the "secondary", issue:
touch /var/lib/postgresql/pg_xlog_archive/regen_secondary.lock

3. Set your archive command to "archive_command='archive.sh %p %f'"
4. Perform a warm_standby backup of the node (pg_start_backup(), etc.);
5. When you complete recovery (and have the second node up and running)
go ahead and issue:
rm -f /var/lib/postgresql/pg_xlog_archive/*

This way you'll only "archive" WAL files when you actually need them,
and won't waste time/space copying WAL files that you'll never use anyways.

I'm not sure that I agree with your strategy in terms of preserving your
data in the case of some sort of failure, but if I understand your
problem correctly, this would serve as a pretty good solution...

As a side note, I'd recommend you read up on the standard issues that
apply when using PgPool-II in multi-master replication mode with query
balancing (if you aren't already aware of them) :-) ...

--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC 27560
919-463-0999/877-258-8987
http://www.otg-nc.com
Expert PostgreSQL training - from basic to advanced, delivered worldwide!

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Simon Riggs 2008-10-30 21:26:28 Re: Deleting old archived WAL files
Previous Message Jaume Sabater 2008-10-30 17:30:45 Re: Deleting old archived WAL files