Re: Transaction Snapshots and Hot Standby

From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, Richard Huxton <dev(at)archonet(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshots and Hot Standby
Date: 2008-09-12 11:41:42
Message-ID: 1221219702.7026.41.camel@huvostro
Lists: pgsql-hackers

On Fri, 2008-09-12 at 12:31 +0300, Hannu Krosing wrote:
> On Fri, 2008-09-12 at 09:45 +0100, Simon Riggs wrote:
> > On Thu, 2008-09-11 at 15:42 +0300, Heikki Linnakangas wrote:
> > > Gregory Stark wrote:
> > > > b) vacuum on the server which cleans up a tuple the slave has in scope has to
> > > > block WAL replay on the slave (which I suppose defeats the purpose of having
> > > > a live standby for users concerned more with fail-over latency).
> > >
> > > One problem with this, BTW, is that if there's a continuous stream of
> > > medium-length transactions in the slave, each new snapshot taken will
> > > prevent progress in the WAL replay, so the WAL replay will advance in
> > > "baby steps", and can fall behind indefinitely. As soon as there's a
> > > moment that there's no active snapshot, it can catch up, but if the
> > > slave is seriously busy, that might never happen.
> >
> > It should be possible to do mixed mode.
> >
> > Stall WAL apply for up to X seconds, then cancel queries. Some people
> > may want X=0 or low, others might find X = very high acceptable (Merlin
> > et al).
>
> Or an even milder version.
>
> * Stall WAL apply for up to X seconds,
> * then stall new queries, let old ones run to completion (with optional
> fallback to canceling after Y sec),
> * apply WAL.
> * Repeat.

Now that I have thought a little more about delegating the keeping of old
versions to filesystem-level (ZFS, XFS+LVM) snapshots, I'd like to
propose the following:

0. run queries and apply WAL freely until WAL application would
remove old rows.

1. stall applying WAL for up to N seconds

2. stall starting new queries for up to M seconds

3. if some backends are still running long queries, then

3.1. make a filesystem-level snapshot (FS snapshot),
3.2. mount the FS snapshot somewhere (maybe as data.at.OldestXmin
in parallel to $PGDATA) and
3.3. hand this mounted FS snapshot over to those backends

4. apply WAL

5. GoTo 0.
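
To make the cycle above concrete, here is a rough sketch in plain Python.
It is not PostgreSQL code: time is simulated in whole seconds, and every
name in it (Backend, run_cycle, make_fs_snapshot) is made up for
illustration. It only shows the order of the steps and who ends up
holding an FS snapshot.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Backend:
    """Hypothetical stand-in for a standby backend running a query."""
    name: str
    seconds_left: int                  # remaining (simulated) query run time
    fs_snapshot: Optional[str] = None  # mount point handed over in step 3.3


def run_cycle(backends: List[Backend], n: int, m: int,
              make_fs_snapshot: Callable[[], str]) -> List[str]:
    """One pass through steps 1-4, entered when WAL apply would remove old rows.

    Returns a log of the actions taken; the caller then goes back to step 0.
    """
    log = []

    def wait(limit: int) -> None:
        # Let queries run for up to `limit` seconds; stop early if all finish.
        for _ in range(limit):
            if all(b.seconds_left == 0 for b in backends):
                return
            for b in backends:
                b.seconds_left = max(0, b.seconds_left - 1)

    log.append("stall WAL apply")      # step 1: up to N seconds
    wait(n)
    log.append("stall new queries")    # step 2: up to M seconds
    wait(m)

    # Step 3: only backends still running and lacking a snapshot need one.
    needy = [b for b in backends
             if b.seconds_left > 0 and b.fs_snapshot is None]
    if needy:
        mount_point = make_fs_snapshot()   # steps 3.1 and 3.2
        for b in needy:
            b.fs_snapshot = mount_point    # step 3.3
        log.append(f"snapshot {mount_point} -> {[b.name for b in needy]}")

    log.append("apply WAL")            # step 4
    return log
```

With N=3 and M=2, a query needing 2 seconds finishes during the stall,
while one needing 100 seconds is still running and gets the snapshot
before WAL is applied.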

Of course we need to take the filesystem-level snapshot in step 3 only if
the long-running queries don't already have one. Or maybe they also need a
new one if they are running in READ COMMITTED mode and have acquired a new
PG snapshot since they got their FS snapshot.
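
The rule could be written as a small predicate. This is only a sketch of
the condition as I described it, including the "maybe" part; the
BackendState fields and the counter used to order events are assumptions
for illustration, not existing PostgreSQL structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BackendState:
    """Hypothetical per-backend bookkeeping for the check below."""
    isolation: str                       # "read committed" or "serializable"
    pg_snapshot_taken_at: int            # event counter at last PG snapshot
    fs_snapshot_taken_at: Optional[int]  # event counter at FS snapshot, or None


def needs_fs_snapshot(b: BackendState) -> bool:
    # No FS snapshot yet: the backend certainly needs one.
    if b.fs_snapshot_taken_at is None:
        return True
    # The tentative extra case: a READ COMMITTED backend that has acquired
    # a newer PG snapshot than the FS snapshot it was handed.
    return (b.isolation == "read committed"
            and b.pg_snapshot_taken_at > b.fs_snapshot_taken_at)
```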

Also, snapshots need to be reference counted, so we can unmount and
destroy them once all their users have finished.
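
A minimal sketch of that reference counting, again in plain Python with
made-up names (SnapshotRegistry, and a destroy callback standing in for
whatever would actually unmount and destroy the FS snapshot):

```python
from typing import Callable, Dict


class SnapshotRegistry:
    """Hypothetical reference counter for mounted FS snapshots."""

    def __init__(self, destroy: Callable[[str], None]) -> None:
        self._refs: Dict[str, int] = {}
        self._destroy = destroy  # assumed to unmount and destroy the snapshot

    def acquire(self, mount_point: str) -> None:
        # Called when a backend is handed the mounted snapshot.
        self._refs[mount_point] = self._refs.get(mount_point, 0) + 1

    def release(self, mount_point: str) -> int:
        # Called when a backend finishes; returns the remaining user count.
        remaining = self._refs[mount_point] - 1
        if remaining == 0:
            del self._refs[mount_point]
            self._destroy(mount_point)  # last user gone: clean up
        else:
            self._refs[mount_point] = remaining
        return remaining
```

The snapshot is destroyed only when the last backend using it calls
release(), so backends sharing one snapshot can finish in any order.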

I think that enabling long-running queries this way is both low-hanging
fruit (or at least medium-height-hanging ;) ) and also consistent with the
PostgreSQL philosophy of not duplicating effort. As an example, we trust
the OS's file system cache and don't try to write our own.

----------------
Hannu
