Re: GDQ iimplementation

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-cluster-hackers(at)postgresql(dot)org
Subject: Re: GDQ iimplementation
Date: 2010-05-11 14:38:35
Message-ID: 4BE96BEB.8040701@Yahoo.com
Lists: pgsql-cluster-hackers

On 5/11/2010 9:19 AM, Simon Riggs wrote:
> On Tue, 2010-05-11 at 08:33 -0400, Jan Wieck wrote:
>
>> What are the advantages of anything proposed over the current
>> implementations used by Londiste and Slony?
>
> It would be good to have a core technology that provided a generic
> transport to other remote databases.
>
> We already have WALSender and WALReceiver, which uses the COPY protocol
> as a transport mechanism. It would be easy to extend that so we could
> send other forms of data.
>
> We can do that in two ways:
>
> * Alter triggers so that Slony/Londiste write directly to WAL rather
> than log tables, using a new WAL record for custom data blobs.

Londiste and Slony "consume" the log data in a different order than it
appears in the WAL. Using WAL would mean moving a lot of complexity,
which is currently handled by an MVCC-style grouping, from the log
origin to the log consumers.

> * Alter WALSender so it can read Slony/Londiste log tables for
> consumption by an architecture similar to WALreceiver/Startup. Probably
> easier.

Only if that alteration also means that

1) WALSender can be handed the from and to snapshots
2) WALSender is able to send the UNION of multiple log tables ordered
by the event/action ID

Because that is how both Londiste and Slony consume the log.
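
To make the two requirements concrete, here is a rough sketch in Python of
what "from and to snapshots" plus "UNION ordered by action ID" amounts to.
It is a toy model, not code from Slony or Londiste: the Snapshot and LogRow
types and the rows_between() function are names made up for illustration,
the visibility rule is the usual xmin/xmax/in-progress test, and it assumes
the log tables only contain rows of committed transactions and that each
table is already sorted by action ID.

```python
import heapq
from dataclasses import dataclass
from typing import FrozenSet, List, NamedTuple, Sequence


@dataclass(frozen=True)
class Snapshot:
    """Simplified snapshot: hypothetical stand-in for a txid snapshot."""
    xmin: int                                  # all txids below this were finished
    xmax: int                                  # all txids at or above this are invisible
    in_progress: FrozenSet[int] = frozenset()  # txids in [xmin, xmax) still running

    def sees(self, txid: int) -> bool:
        if txid < self.xmin:
            return True
        if txid >= self.xmax:
            return False
        return txid not in self.in_progress


class LogRow(NamedTuple):
    action_id: int   # global sequence number assigned by the capture trigger
    txid: int        # transaction that wrote the row
    payload: str     # the captured change, opaque here


def rows_between(snap_from: Snapshot, snap_to: Snapshot,
                 *log_tables: Sequence[LogRow]) -> List[LogRow]:
    """Rows that became visible between the two snapshots, in action order.

    heapq.merge plays the role of the UNION ... ORDER BY over the log
    tables; each input must already be sorted by action_id.
    """
    merged = heapq.merge(*log_tables, key=lambda row: row.action_id)
    return [row for row in merged
            if snap_to.sees(row.txid) and not snap_from.sees(row.txid)]
```

A transaction that started early but committed late ends up in a later
batch than a transaction that started after it and committed first. That
grouping by visibility rather than by write position is why the order
differs from the WAL.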

> We can also alter the WAL format itself to include the information in
> WAL that is required to do what Slony/Londiste already do, so we don't
> need to specifically write anything at all, just read WAL at other end.
> Even more efficient.
>
> The advantages of these options would be
>
> * integration of core technologies
> * greater efficiency for trigger based logging via WAL

I'm still unclear how we can ensure cross-version functionality when
using such core technology. Are you implying that a 9.3 WALReceiver will
always be able to consume the data format sent by a 9.1 WALSender?

> In other RDBMS "replication" has long meant "data transport, either for
> HA or application use". We should be looking beyond the pure HA aspects,
> as pgq does.

Slony replication has meant both of those from the beginning, too.

> I would certainly like to see a system that wrote data on master and
> then constructed the SQL on receiver-side (i.e. on slave), so the
> integration was less tight. That would allow data to be sent and for it
> to be consumed to a variety of purposes, not just HA replay.

Slony does exactly that, constructing the SQL on the receiver side, and it
is a big drawback because every single row update needs to go through a
separate SQL query that is parsed, planned and optimized. I can envision
a generic function that takes the data format recorded by the capture
trigger on the master and turns it into a simple plan. All these
single row updates/deletes are PK based, so there is no need to even think
about parsing and planning them over and over. Just replace the targetlist
to reflect whatever this log row updates and execute it. The targetlist
entries will always be a literal value from the log or the OLD value for
untouched fields. Simple enough.
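
As a very rough client-side analogue of that idea, the Python sketch below
builds the PK-based UPDATE text once per (table, changed columns, key
columns) combination and afterwards only swaps in the values. This is not
the in-backend function described above and does not touch the planner; it
only illustrates the "build the statement shape once, then just bind
values" part. The names update_statement() and apply_update(), and the
dict-based log row format, are assumptions made up for the example, and
sqlite3 stands in for the receiving database so the sketch is runnable.

```python
import sqlite3
from functools import lru_cache
from typing import Any, Dict, Tuple


@lru_cache(maxsize=None)
def update_statement(table: str, set_cols: Tuple[str, ...],
                     pk_cols: Tuple[str, ...]) -> str:
    """Build the parameterized UPDATE once per statement shape."""
    assignments = ", ".join(f'"{col}" = ?' for col in set_cols)
    where = " AND ".join(f'"{col}" = ?' for col in pk_cols)
    return f'UPDATE "{table}" SET {assignments} WHERE {where}'


def apply_update(conn: sqlite3.Connection, table: str,
                 pk: Dict[str, Any], changes: Dict[str, Any]) -> int:
    """Apply one captured single-row update; returns rows affected.

    Only the changed columns appear in the SET list. Untouched columns
    keep their old values simply by not being mentioned.
    """
    set_cols = tuple(sorted(changes))
    pk_cols = tuple(sorted(pk))
    sql = update_statement(table, set_cols, pk_cols)
    params = [changes[c] for c in set_cols] + [pk[c] for c in pk_cols]
    return conn.execute(sql, params).rowcount
```

Two log rows that update the same column of the same table reuse the
cached statement text, so only the bound values differ between them.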

The big advantage of such generic support would be that systems like
Londiste/Slony could use the existing COPY-SELECT mechanism to transport
the log in a streaming protocol, while a BEFORE INSERT trigger on the
receiver's log segments turns it into highly efficient single row
operations.

This generic single row change capture and single row update support
would allow Londiste/Slony type replication systems to eliminate most
round-trip latency, a lot of CPU usage on the replicas, and all
the libpq and SQL query assembly in the replication engine itself.

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin
