Re: Transaction Snapshots and Hot Standby

From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Transaction Snapshots and Hot Standby
Date: 2008-09-11 08:38:21
Message-ID: 1221122301.7225.17.camel@huvostro
Lists: pgsql-hackers

On Thu, 2008-09-11 at 09:24 +0300, Heikki Linnakangas wrote:

> I like the idea of acquiring snapshots locally in the slave much more.
> As you mentioned, the options there are to defer applying WAL, or cancel
> queries.

More exotic ways to defer applying WAL include using smart
filesystems to get per-backend data snapshots, via either
copy-on-write overlay filesystems or filesystem- or disk-level
snapshots.

At least disk-level snapshots already exist in SANs, with the aim of
easing backups, though I'm not sure whether they are effective for
Hot Standby's intended use.

Using any of those requires detecting shared buffers that hold "too new"
data pages, bypassing them, and reading those pages directly from the
disk snapshot.
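
A minimal sketch of that check, in Python rather than backend C, just to
make the idea concrete. Everything here is an assumption for illustration:
the Page class, read_page(), and the plain dicts standing in for shared
buffers and for the frozen disk snapshot are hypothetical, not existing
PostgreSQL structures.

```python
from dataclasses import dataclass


@dataclass
class Page:
    lsn: int   # LSN of the last WAL record applied to this copy of the page
    data: str  # stand-in for the page contents


def read_page(page_no, snapshot_lsn, shared_buffers, disk_snapshot):
    """Return the copy of a page that a query pinned at snapshot_lsn may see.

    shared_buffers and disk_snapshot are hypothetical dicts mapping
    page number -> Page.
    """
    cached = shared_buffers.get(page_no)
    if cached is not None and cached.lsn <= snapshot_lsn:
        # The buffered copy is not newer than the query's snapshot: use it.
        return cached
    # The buffered copy is "too new" (or missing): bypass shared buffers
    # and read the page from the frozen disk snapshot instead.
    return disk_snapshot[page_no]
```

With a buffered page at LSN 120 and a query pinned at LSN 100, the sketch
falls through to the disk snapshot copy; a buffered page at LSN 90 is
returned as-is.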

> I think both options need the same ability to detect when
> you're about to remove a tuple that's still visible to some snapshot,
> just the action is different. We should probably provide a GUC to
> control which you want.

We probably need two LSNs per page to make maximal use of our
MVCC in a Hot Standby situation, so that we can distinguish an addition to a
page, which implies no data loss, from a row removal, which does. Currently only
VACUUM and HOT pruning can cause row removal.
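
Roughly what I have in mind, as a Python sketch rather than real page
header code. The field names add_lsn and remove_lsn and both functions are
made up for illustration; the only point is that an addition newer than a
query's snapshot is harmless (MVCC hides the new rows), while a removal
newer than the snapshot is not.

```python
from dataclasses import dataclass


@dataclass
class PageHeader:
    add_lsn: int = 0     # last WAL record that only added rows to the page
    remove_lsn: int = 0  # last WAL record that removed rows (VACUUM, HOT pruning)


def apply_wal(page, lsn, removes_rows):
    """Record a replayed WAL change in the matching per-page LSN."""
    if removes_rows:
        page.remove_lsn = lsn
    else:
        page.add_lsn = lsn


def page_usable_for(page, snapshot_lsn):
    """True if a query pinned at snapshot_lsn can still read this page.

    Only row removals after the snapshot can hide data the query needs;
    later additions are filtered out by normal visibility checks.
    """
    return page.remove_lsn <= snapshot_lsn
```

With a single LSN per page, any change after the snapshot would force the
query to be deferred or cancelled; with two, only the removal case does.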

> However, if we still want to provide the behavior that "as long as the
> network connection works, the master will not remove tuples still needed
> in the slave" as an option, a lot simpler implementation is to
> periodically send the slave's oldest xmin to master. Master can take
> that into account when calculating its own oldest xmin. That requires a
> lot less communication than the proposed scheme to send snapshots back
> and forth. A softer version of that is also possible, where the master
> obeys the slave's oldest xmin, but only up to a point.

That point could be statement_timeout or the (currently missing)
transaction_timeout.
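
As a sketch of the softer version, in Python for brevity: the function
name, its parameters, and the idea of tracking how long the slave's xmin
has been holding the master back are all assumptions of mine, and xid
wraparound is ignored so that plain integer comparison works.

```python
def effective_oldest_xmin(master_xmin, slave_xmin, held_seconds, max_hold_seconds):
    """Pick the xmin horizon the master uses when deciding what can be removed.

    master_xmin      -- oldest xmin computed from the master's own backends
    slave_xmin       -- oldest xmin last reported by the slave, or None
    held_seconds     -- how long the slave's xmin has been holding the master back
    max_hold_seconds -- the "point" up to which the master obeys the slave,
                        e.g. the value of statement_timeout
    """
    if slave_xmin is None or held_seconds > max_hold_seconds:
        # No report from the slave, or it has held us back for too long:
        # fall back to the master's own horizon.
        return master_xmin
    # Otherwise honour whichever horizon is older.
    return min(master_xmin, slave_xmin)
```

For example, with master_xmin 500 and slave_xmin 400, the master keeps rows
back to 400 while held_seconds stays within the limit, and goes back to 500
once the limit is exceeded.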

Also, the decision to advance xmin should probably be sent to the slave as
well, even though it is not something that is needed in the local WAL.

--------------
Hannu
