Re: logical changeset generation v6

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: logical changeset generation v6
Date: 2013-09-24 08:15:08
Message-ID: 20130924081508.GA5684@awork2.anarazel.de
Lists: pgsql-hackers

On 2013-09-23 23:12:53 -0400, Robert Haas wrote:
> What exactly is the purpose of this tool? My impression is that the
> "output" of logical replication is a series of function calls to a
> logical replication plugin, but does that plugin necessarily have to
> produce an output format that gets streamed to a client via a tool
> like this?

There needs to be a client acknowledging the reception of the data in
some form. There are currently two output methods, SQL and walstreamer,
but there could easily be more; adding one basically amounts to two
functions you have to write.

There are several reasons I think the tool is useful, starting with the
fact that it makes the initial use of the feature easier. Writing a
client for CopyBoth messages wrapping 'w' style binary messages, with the
correct select() loop isn't exactly trivial. I also think it's actually
useful in "real" scenarios where you want to ship the data to a
remote system for auditing purposes.

> For example, for replication, I'd think you might want the
> plugin to connect to a remote database and directly shove the data in;

That sounds like a bad idea to me. If you pull the data from the remote
side, you get the data in a streaming fashion and the latency sensitive
part of issuing statements to your local database is done locally.
Doing things synchronously like that also makes it way harder to use
synchronous_commit = off on the remote side, which is a tremendous
efficiency win.

If somebody needs something like this, e.g. because they want to
replicate into hundreds of shards depending on some key or such, the
open question to me is how to actually initiate the streaming: somebody
still needs to start the logical decoding.

> for materialized views, we might like to push the changes into delta
> relations within the source database.

Yes, that's not a bad use case and I think the only thing missing to use
output plugins that way is a convenient function to tell up to where
data has been received (aka synced to disk, aka applied).
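With the SQL interface as it later shipped, that roughly corresponds to
the split between peeking and consuming (a sketch; a dedicated "confirm
up to here" function did not exist at this point, and the LSN below is a
made-up placeholder):

```sql
-- Look at the pending changes without consuming them ...
SELECT * FROM pg_logical_slot_peek_changes('regression_slot', NULL, NULL);

-- ... and, once they have been applied locally, consume only up to the
-- LSN that was actually handled - exactly the "tell up to where data
-- has been received" operation discussed above.
SELECT * FROM pg_logical_slot_get_changes('regression_slot',
                                          '0/DEADBEEF', NULL);
```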

> In either case, there's no
> particular need for any sort of client at all, and in fact it would be
> much better if none were required. The existence of a tool like
pg_receivellog seems to presuppose that the goal is to spit out logical
> change records as text, but I'm not sure that's actually going to be a
> very common thing to want to do...

It doesn't really rely on anything being text - I've used it with a
binary plugin without problems. Obviously you might then not want to use
-f - (i.e. write to stdout) but an actual file instead...
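For the binary-plugin case that would look something like the following
(a hypothetical invocation; the slot name and the exact option spelling
are made up, modeled on the other pg_receive* tools, and may not match
the patch exactly):

```
$ pg_receivellog -d postgres -S audit_slot -f changes.bin --start
```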

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
