Skip site navigation (1) Skip section navigation (2)

Re: Hot Standby (v9d)

From: Hannu Krosing <hannu(at)krosing(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, jd(at)commandprompt(dot)com, Gregory Stark <stark(at)enterprisedb(dot)com>, Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot Standby (v9d)
Date: 2009-02-03 16:09:27
Message-ID: 1233677367.7999.33.camel@huvostro (view raw or flat)
Thread:
Lists: pgsql-hackers
On Tue, 2009-02-03 at 14:28 +0000, Simon Riggs wrote:
> On Tue, 2009-02-03 at 08:40 -0500, Andrew Dunstan wrote:
> > 
> > Hannu Krosing wrote:
> > > Actually we came up with a solution to this - use filesystem level
> > > snapshots (like LVM2+XFS or ZFS), and redirect backends with
> > > long-running queries to use fs snapshot mounted to a different
> > > mountpoint.
> > >
> > > I don't think Simon has yet put full support for it in code, but it is
> > > clearly _the_ solution for those who want to eat the cake and have it
> > > too.
> 
> > How does that work if you're using mutiple file systems via tablespaces 
> > (e.g. indexes in a different TS)?
> 
> It's a great idea and easy to do, but I can't do everything in one go.
> 
> The main reasons not to are multiple file system difficulties and lack
> of a mainstream Linux solution, and more simply lack of time and test
> resources.

More general, but also lot harder, solution would be going back to roots
and implement what original postgres 4.2 and earlier versions were meant
to do - namely VACUUM was not meant to just discard older versions , but
rather move it to WORM storage (write once read many was all the rage
back then :) .

If we did that in a way that each relation, at least on warm standby ,
has its own "archive" fork, possibly in a separate tablespace for
cheaper storage, then we could basically apply WAL's as fast we want and
just move the old versions to "archive". It will be slower(ish),
especially for HOT updates, but may be a good solution for lots of
usecases.

And the decision to do the archiving on master and WAL-copy to slave, or
just do it on slave only could probably be left to user.

Reintroducing keeping old tuples "forever" would also allow us to bring
back time travel feature, that is 

SELECT .... AS OF 'yesterday afternoon'::timestamp;

Which was thrown out at the times we got WAL-logging.

To be really useful we should also have some way to know trx timestamps,
but that can be easily done using ticker feature from Slony -
SkyTools/pgQ, which could be run a a separate server thread similar to
what we do with background writer, autovacuum  etc. now.

> 
> So not now, maybe later.
> 


In response to

Responses

pgsql-hackers by date

Next:From: Tom LaneDate: 2009-02-03 16:21:26
Subject: Re: add_path optimization
Previous:From: Robert HaasDate: 2009-02-03 15:54:14
Subject: Re: add_path optimization

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group