Re: Synchronous replication, network protocol

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Pavan Deolasee <pavan(dot)deolasee(at)enterprisedb(dot)com>
Subject: Re: Synchronous replication, network protocol
Date: 2008-12-30 13:54:43
Message-ID: 1230645283.4793.1284.camel@ebony.2ndQuadrant
Lists: pgsql-hackers


On Tue, 2008-12-30 at 14:40 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> >
> > That for me is beginning to sound fairly ugly: difficult to understand
> > and difficult to use. But I see some people might want that in certain
> > circumstances. So I guess we should build it. Any good ideas for the
> > control mechanism?
>
> Using functions seems overly complicated.

We agree on that.

> Since xids are system-wide, I
> don't see much value in specifying them at any finer level, or in
> allowing them for non-superusers. GUC seems like the natural choice.

Well, GUCs have security implications that I'm not happy about. I will
relent if you will vouch for that decision.

"standby_xmin_on_primary"

(boolean) - a USERSET GUC that only has meaning during standby query
execution. <name> specifies whether the current standby session's xmin
is included in the calculation of OldestXmin on the *primary* node. If
this parameter is true then the standby query will never be cancelled
because of conflicts between the activity of the primary and standby
(see discussion in chapter XXXX). The downside of using this parameter
is that standby queries can cause table bloat on the primary (see
chapter Data Maintenance for more detail).

"standby_xmin_on_primary" - new name sought. I think it should begin
with "standby_" to remind us that it only affects standby query
processing.

Implementation:

WALReceiver will send a message back to WALSender. WALSender will update
a single 4-byte value, RemoteXmin, that is read during GetSnapshotData().
Updating the value will not take a lock, just as xid is not locked when
setting a new value.
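
A minimal sketch of the primary-side idea, for illustration only: a
single 4-byte slot is written without a lock when a message arrives and
is folded into the oldest xmin when a snapshot is computed. All names
here (xid_t, INVALID_XID, remote_xmin, xid_precedes, set_remote_xmin,
apply_remote_xmin) are hypothetical and are not PostgreSQL functions.
The use of C11 atomics and of 0 as the "no value" marker are assumptions
of the sketch, not part of the proposal.

```c
#include <stdatomic.h>
#include <stdint.h>

typedef uint32_t xid_t;            /* hypothetical stand-in for a 4-byte xid */
#define INVALID_XID ((xid_t) 0)    /* assumption: 0 means "no remote xmin" */

/* Hypothetical shared slot, written by the WALSender side. */
static _Atomic xid_t remote_xmin = INVALID_XID;

/* Modular comparison, so that xid wraparound is tolerated. */
static int xid_precedes(xid_t a, xid_t b)
{
    return (int32_t) (a - b) < 0;
}

/* Called when a message from the standby arrives: a plain store, no lock. */
static void set_remote_xmin(xid_t xmin)
{
    atomic_store_explicit(&remote_xmin, xmin, memory_order_relaxed);
}

/*
 * Called while computing a snapshot: return the older of the locally
 * computed oldest xmin and the value reported by the standby, if any.
 */
static xid_t apply_remote_xmin(xid_t local_oldest_xmin)
{
    xid_t r = atomic_load_explicit(&remote_xmin, memory_order_relaxed);

    if (r != INVALID_XID && xid_precedes(r, local_oldest_xmin))
        return r;
    return local_oldest_xmin;
}
```

A 4-byte aligned store is what makes the lock-free update plausible: a
reader sees either the old value or the new one, never a mix of both.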

We add a boolean to each proc: SendRemoteXmin. When we run
GetSnapshotData(), if our own proc has SendRemoteXmin set, then we
calculate RemoteXmin as the minimum xmin of all procs with SendRemoteXmin
set. When we release our snapshot we re-calculate RemoteXmin so that the
primary node suffers as little delay as possible in receiving updates to
xmin.
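
The standby-side calculation described above could look roughly like
this. It is a sketch under stated assumptions: standby_proc,
compute_remote_xmin and xid_is_older are hypothetical names, the proc
array is a plain C array rather than shared memory, and no locking is
shown.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t xid_t;            /* hypothetical stand-in for a 4-byte xid */
#define INVALID_XID ((xid_t) 0)    /* assumption: 0 means "no xmin" */

/* Hypothetical per-session entry on the standby. */
typedef struct
{
    xid_t xmin;                /* snapshot xmin, INVALID_XID if none held */
    bool  send_remote_xmin;    /* the per-proc boolean from the proposal */
} standby_proc;

/* Modular comparison, so that xid wraparound is tolerated. */
static int xid_is_older(xid_t a, xid_t b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Return the oldest xmin among sessions that opted in, or INVALID_XID
 * if no opted-in session currently holds a snapshot.
 */
static xid_t compute_remote_xmin(const standby_proc *procs, size_t nprocs)
{
    xid_t result = INVALID_XID;

    for (size_t i = 0; i < nprocs; i++)
    {
        if (!procs[i].send_remote_xmin || procs[i].xmin == INVALID_XID)
            continue;           /* not opted in, or no snapshot held */
        if (result == INVALID_XID || xid_is_older(procs[i].xmin, result))
            result = procs[i].xmin;
    }
    return result;
}
```

Re-running the same function when a snapshot is released is what lets
the reported value advance promptly once the oldest opted-in session
finishes.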

I'll begin work on this once sync rep is committed. It's about 3-5 days'
work, but there is no point in writing it yet because the sand will shift
underneath it too much in the next few weeks.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
