Re: Hot Standby (v9d)

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Hannu Krosing <hannu(at)krosing(dot)net>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, jd(at)commandprompt(dot)com, Gregory Stark <stark(at)enterprisedb(dot)com>, Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot Standby (v9d)
Date: 2009-02-03 17:18:45
Message-ID: 1233681525.4500.185.camel@ebony.2ndQuadrant
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Tue, 2009-02-03 at 18:09 +0200, Hannu Krosing wrote:
> On Tue, 2009-02-03 at 14:28 +0000, Simon Riggs wrote:
> > On Tue, 2009-02-03 at 08:40 -0500, Andrew Dunstan wrote:
> > >
> > > Hannu Krosing wrote:
> > > > Actually we came up with a solution to this - use filesystem level
> > > > snapshots (like LVM2+XFS or ZFS), and redirect backends with
> > > > long-running queries to use fs snapshot mounted to a different
> > > > mountpoint.
> > > >
> > > > I don't think Simon has yet put full support for it in code, but it is
> > > > clearly _the_ solution for those who want to eat the cake and have it
> > > > too.
> >
> > > How does that work if you're using mutiple file systems via tablespaces
> > > (e.g. indexes in a different TS)?
> >
> > It's a great idea and easy to do, but I can't do everything in one go.
> >
> > The main reasons not to are multiple file system difficulties and lack
> > of a mainstream Linux solution, and more simply lack of time and test
> > resources.
>
> More general, but also lot harder, solution would be going back to roots
> and implement what original postgres 4.2 and earlier versions were meant
> to do - namely VACUUM was not meant to just discard older versions , but
> rather move it to WORM storage (write once read many was all the rage
> back then :) .
>
> If we did that in a way that each relation, at least on warm standby ,
> has its own "archive" fork, possibly in a separate tablespace for
> cheaper storage, then we could basically apply WAL's as fast we want and
> just move the old versions to "archive". It will be slower(ish),
> especially for HOT updates, but may be a good solution for lots of
> usecases.
>
> And the decision to do the archiving on master and WAL-copy to slave, or
> just do it on slave only could probably be left to user.
>
> Reintroducing keeping old tuples "forever" would also allow us to bring
> back time travel feature, that is
>
> SELECT .... AS OF 'yesterday afternoon'::timestamp;
>
> Which was thrown out at the times we got WAL-logging.
>
> To be really useful we should also have some way to know trx timestamps,
> but that can be easily done using ticker feature from Slony -
> SkyTools/pgQ, which could be run a a separate server thread similar to
> what we do with background writer, autovacuum etc. now.

That might be the way to do the "Named Restore Points" that is
frequently requested.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2009-02-03 17:34:31 Re: add_path optimization
Previous Message David E. Wheeler 2009-02-03 16:42:30 Re: LIMIT NULL