Re: Shared pg_xlog directory/partition and warm standby

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
Cc: "Devrim GUNDUZ" <devrim(at)CommandPrompt(dot)com>,<pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Shared pg_xlog directory/partition and warm standby
Date: 2006-11-27 16:35:30
Message-ID: 1164645330.3778.200.camel@silverbirch.site
Lists: pgsql-hackers
On Mon, 2006-11-27 at 14:17 +0100, Florian G. Pflug wrote:
> Devrim GUNDUZ wrote:
> > Is there anything that may prevent two PostgreSQL servers to share the
> > same pg_xlog directory; while one is using read-only and the other one
> > is using the same partition for read and write? The problem is: If we
> > share the same pg_xlog between production server and warm standby
> > server; can you see any possibility of data/xlog corruption? Of course,
> > warm standby server will mount that partition as read-only.
> 
> What happens if the standby server falls so far behind the master that
> the xlogs it wants to read are already being overwritten?
> 
> AFAIK the files in pg_xlog form a circular buffer, and are reused after 
> a while...

If the archive_command doesn't actually do anything and just leaves the
files where they are, they will automatically be moved to .done state
and will then be removed within 2 checkpoints. So it will work as long
as your standby keeps up with the primary. If it falls behind, you'll
lose the file and you'll be out of luck (no file, start from base backup
again). A large checkpoint_segments would help, but there is no way to
avoid that situation entirely.

The archiver assumes that you want to archive things oldest first, so if
the archive_command fails it will retry on that file repeatedly. Put
another way, archiving is synchronous: when an archive is requested, we
wait for the answer before attempting the next.
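
A minimal sketch of that behaviour, for illustration only. This is not
the archiver's actual code: archive_in_order, retry_delay and
max_attempts are hypothetical names, and the attempt limit exists only
so the sketch terminates. It also assumes segment names sort oldest
first, which holds within a single timeline.

```python
import time
from typing import Callable, List


def archive_in_order(ready_segments: List[str],
                     archive_command: Callable[[str], bool],
                     retry_delay: float = 0.0,
                     max_attempts: int = 5) -> List[str]:
    """Archive segments oldest first, one at a time (hypothetical sketch).

    archive_command stands in for the configured command and returns
    True on success. A failing segment is retried before any newer
    segment is attempted, so one stuck file blocks everything behind it.
    """
    done = []
    # Assumption: names sort oldest first (same timeline).
    for segment in sorted(ready_segments):
        attempts = 0
        while not archive_command(segment):
            attempts += 1
            if attempts >= max_attempts:
                # Give up for now; newer segments are never attempted.
                return done
            time.sleep(retry_delay)
        done.append(segment)
    return done
```

Note that a command which keeps failing on the oldest file means
nothing newer is ever attempted.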

I suppose we might want to have multiple archivals occurring
simultaneously by overlapping their start and stop times. That might be
useful for situations where we have a bank of slow-response tape
drives/autoloaders?

You'd need to have a second archive command to poll for completion.
Currently archive_status has 2 states: .ready and .done. We could have 3
states: .ready, .inprogress and .done. The first command,
archive_command_start, would move the state from .ready to .inprogress
if successful, while the second, archive_command_confirm, would move the
state from .inprogress to .done. (Better names please...)

With an asynchronous API, it would then be possible to fire off requests
to archive lots of files, then return later to confirm their completion.
Or, in Devrim's case, do nothing apart from waiting for them to be
applied by the standby server.
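
To make the proposal concrete, here is a rough sketch of the three-state
scheme using status files in a scratch directory. None of this exists
today: archive_start, archive_confirm and segments_in_state are
hypothetical helpers, and the two callables stand in for the proposed
archive_command_start and archive_command_confirm.

```python
import os
import tempfile


def _move_state(status_dir, segment, old, new):
    """Rename <segment>.<old> to <segment>.<new>; the rename is the state change."""
    os.rename(os.path.join(status_dir, f"{segment}.{old}"),
              os.path.join(status_dir, f"{segment}.{new}"))


def archive_start(status_dir, segment, start_command):
    """Hypothetical start step: .ready -> .inprogress if the command succeeds."""
    if start_command(segment):
        _move_state(status_dir, segment, "ready", "inprogress")
        return True
    return False


def archive_confirm(status_dir, segment, confirm_command):
    """Hypothetical confirm step: .inprogress -> .done if the command succeeds."""
    if confirm_command(segment):
        _move_state(status_dir, segment, "inprogress", "done")
        return True
    return False


def segments_in_state(status_dir, state):
    """List segment names that currently have a .<state> status file."""
    suffix = "." + state
    return sorted(name[:-len(suffix)]
                  for name in os.listdir(status_dir)
                  if name.endswith(suffix))


def demo():
    with tempfile.TemporaryDirectory() as status_dir:
        for seg in ("000000010000000000000001", "000000010000000000000002"):
            open(os.path.join(status_dir, seg + ".ready"), "w").close()

        # Fire off every request without waiting for any to finish.
        for seg in segments_in_state(status_dir, "ready"):
            archive_start(status_dir, seg, lambda s: True)

        # Come back later and poll; pretend only the first has completed.
        finished = {"000000010000000000000001"}
        for seg in segments_in_state(status_dir, "inprogress"):
            archive_confirm(status_dir, seg, lambda s: s in finished)

        return (segments_in_state(status_dir, "inprogress"),
                segments_in_state(status_dir, "done"))
```

In the shared pg_xlog case the start step could be a no-op that always
succeeds, and the confirm step would be whatever check tells you the
standby has applied the segment.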

Anybody else see the need for this?

-- 
  Simon Riggs             
  EnterpriseDB   http://www.enterprisedb.com


