Re: Command to prune archive at restartpoints

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command to prune archive at restartpoints
Date: 2010-03-17 10:14:40
Message-ID: 4BA0AB90.5060208@enterprisedb.com
Lists: pgsql-hackers
Greg Stark wrote:
> On Wed, Mar 17, 2010 at 9:37 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> One awkward omission in the new built-in standby mode, mainly used for
>> streaming replication, is that there is no easy way to delete old
>> archived files like you do with the %r parameter to restore_command.
> 
> I'm still finding this kind of narrow-minded. I'm picturing a system
> with multiple replicas -- obviously no one replica can take it upon
> itself to delete archived log files based only on its own
> restartpoint. And besides, if you're using the archived log files for
> backups you also need to take into account the backup policy and only
> delete files that aren't needed for a consistent backup and aren't
> needed for the replica.

That's why we provide options that take any shell command you want,
rather than, e.g., a path to an archive directory that's pruned automatically.

For example, if you have multiple standbys sharing one archive, you
could do something like this:

In each standby, have a restartpoint_command along the lines of:
"echo %r > <archivedirectory>/standby1_location; archive_cleanup.sh"

where the '1' is different for every standby,

and in archive_cleanup.sh, scan through all the standbyX_location files,
take the minimum, and delete all WAL files whose names sort before that.

You'll need some care with locking etc., but the point is that the
current hooks allow you to implement complex setups like that.
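
As a rough illustration of the cleanup step described above, here is a
minimal sketch in Python rather than shell. It is not the script from this
thread. The function names are hypothetical, and it assumes that all
standbys are on the same timeline, that plain WAL segment names are 24
uppercase hex digits, and that each marker file holds exactly the %r value.
The "care with locking" is only approximated: markers are written with an
atomic rename, and pruning is skipped if any marker looks malformed.

```python
import os
import re
from pathlib import Path

# Assumption: plain WAL segment names are 24 uppercase hex digits.
# Backup history and timeline history files do not match and are kept.
WAL_SEGMENT = re.compile(r"^[0-9A-F]{24}$")


def record_restartpoint(archive_dir, standby_name, segment):
    """Write <standby_name>_location via a temp file and an atomic rename,
    so a concurrent reader never sees a half-written marker."""
    marker = Path(archive_dir) / f"{standby_name}_location"
    tmp = marker.with_name(marker.name + ".tmp")
    tmp.write_text(segment + "\n")
    os.replace(tmp, marker)


def oldest_needed_segment(archive_dir):
    """Return the smallest %r value recorded by any standby, or None if
    there are no markers or any marker is malformed."""
    locations = []
    for marker in Path(archive_dir).glob("standby*_location"):
        value = marker.read_text().strip()
        if not WAL_SEGMENT.match(value):
            return None  # be conservative: do not prune on bad input
        locations.append(value)
    return min(locations) if locations else None


def prune_archive(archive_dir):
    """Delete WAL segments that sort before every standby's restartpoint
    file. Returns the names of the removed files."""
    cutoff = oldest_needed_segment(archive_dir)
    if cutoff is None:
        return []
    removed = []
    for name in sorted(p.name for p in Path(archive_dir).iterdir()):
        if WAL_SEGMENT.match(name) and name < cutoff:
            (Path(archive_dir) / name).unlink()
            removed.append(name)
    return removed
```

With this layout, each standby's restartpoint_command would do the
equivalent of record_restartpoint(archive, "standby1", "%r") and then call
prune_archive(archive). A standby that has not yet written a marker is
invisible to this sketch, so a real script would also need a list of the
standbys it expects to see.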

> What we need is a program which can take all this information from all
> your slaves and backup labels into account and implement your backup
> policies. It probably won't exist in time for the release and in any
> case doesn't really have to ship with Postgres. There might even be
> more than one.

I guess I just described such a program :-). Yeah, I'd imagine that to
become part of toolkits like skytools.

> But do we have all the information that such a program would need? Is
> there a way to connect to a replica and ask it what the restart point
> is?

Hmm, Greg Smith opened a thread on exposing the fields in the control
file as user-defined functions. IIRC last restartpoint location was the
piece of information that triggered the discussion this time. Perhaps we
should indeed add a function to expose that in 9.0.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
