Re: Changeset Extraction Interfaces

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changeset Extraction Interfaces
Date: 2013-12-12 03:29:46
Message-ID: CA+TgmobDi9TQRY_S_Ub2Z-83K103=myqmHVcVqj1gQKWTEa7rQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 4, 2013 at 7:15 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> There's basically three major 'verbs' that can be performed on a
> stream, currently named (walsender names):
> * INIT_LOGICAL_REPLICATION "name" "output_plugin"
> * START_LOGICAL_REPLICATION "name" last_received ("option_name" value,...)
> * FREE_LOGICAL_REPLICATION "name"
>
> The SQL variant currrently has:
> * init_logical_replication(name, plugin)
> * start_logical_replication(name, stream_upto, options[])
> * stop_logical_replication(name)
>
> You might have noticed the slight inconsistency...

I think this naming is probably not the greatest. When I hear "init",
I don't think "permanently allocate a resource that will never be
released unless I explicitly throw it away", and when I hear "stop", I
don't think "free that resource". I suggest naming based around
create/drop, register/unregister, or acquire/release. Since, as
previously noted, I'm gunning for these slots to apply to ordinary
replication as well, I kind of like ACQUIRE_REPLICATION_SLOT and
RELEASE_REPLICATION_SLOT. If we're going to make them specific to
logical replication, then your suggestion of CREATE_DECODING_SLOT and
DROP_DECODING_SLOT, or maybe ACQUIRE_DECODING_SLOT and
RELEASE_DECODING_SLOT, sounds fine.

It also strikes me that just as it's possible to stream WAL without
allocating a slot first (since we don't at present have slots),
perhaps it ought also to be possible to stream logical replication
data without acquiring a slot first. You could argue that it was a
mistake not to introduce slots in the first place, but the stateless
nature of WAL streaming definitely has some benefits, and it's unclear
to me why you shouldn't be able to do the same thing with logical
decoding.

> b) Decide which of the SQL functions should be in a contrib module, and
> which in core. Currently init_logical_replication() and
> stop_logical_replication() are in core, whereas
> start_logical_replication() is in the 'test_logical_decoding'
> extension. The reasoning behind that is that init/stop ones are
> important to the DBA and the start_logical_replication() SRF isn't
> all that useful in the real world because our SRFs don't support
> streaming changes out.

Seems pretty arbitrary to me. If start_logical_replication() is
usable with any output plugin, then it ought to be in core. I think
the name isn't great, though; the actual functionality seems to be
more or less decode-from-last-position-up-to-present, which doesn't
sound much like "start".

> c) Which data-types does start_logical_replication() return. Currently
> it's OUT location text, OUT xid bigint, OUT data text. Making the 'data'
> column text has some obvious disadvantages though - there's obvious
> usecases for output plugins that return binary data. But making it bytea
> sucks, because the output is harder to read by default...

I think having two functions might be sensible. I'm not sure what
happens if the text function is used and the plugin outputs something
that's not valid in the database encoding, though. I guess you'd
better check for that and error out.

> d) is my main question, and Robert, Peter G. and I previously argued
> about it a fair bit. I know of the following alternatives:
>
> I) The output plugin that's specified in INIT_LOGICAL_REPLICATION is
> actually a library name, and we simply lookup the fixed symbol names in
> it. That's what currently implemented.
> The advantage is that it's pretty easy to implement, works on a HS
> standby without involving the primary, and doesn't have a problem if the
> library is used in shared_preload_library.
> The disadvantages are: All output plugins need to be shared libraries
> and there can only be one output plugin per shared library (although you
> could route differently, via options, but ugh).

I still dislike this.

> II) Keep the output plugin a library, but only lookup a
> _PG_init_output_plugin() which registers/returns the callbacks. Pretty
> much the same tradeoffs as I)
>
> III) Keep the output plugin a library, but simply rely on _PG_init()
> calling a function to register all callbacks. Imo it's worse than I) and
> II) because it basically prohibits using the library in
> shared_preload_libraries as well, because then it's _PG_init() doesn't
> get called when starting to stream, and another library might have
> registered other callbacks.

I don't understand the disadvantage that you're describing here. What
I'm imagining is that you have some struct that looks like this:

struct output_plugin
{
char name[64];
void (*begin)(args);
void (*change)(args);
void (*commit)(args);
};

Now you provide a function RegisterOutputPlugin(output_plugin *). If
there are any output plugins built into core, core will call
RegisterOutputPlugin once for each one. If a shared library
containing an output plugin is loaded, the libraries _PG_init function
does the same thing. When someone tries to use a plugin, they ask for
it by name. We go iterate through the data saved by all previous
calls to RegisterOutputPlugin() until we find one with a matching
name, and then we use the callbacks embedded in that struct.

It doesn't matter when _PG_init() runs, except that it had better run
before the output plugin is requested by name. It also doesn't matter
that another library may have registered other callbacks, because we
use the name to decide which output plugin to call, from among all
those we know about.

> IV) Make output plugins a SQL-level object/catalog table where a plugin
> can be registered, and the callbacks are normal pg_proc entries. It's
> more in line with other stuff, but has the disadvantage that we need to
> register plugins on the primary, even if we only stream from a
> standby. But then, we're used to that with CREATE EXTENSION et al.

I don't think I'd make every callback a pg_proc entry; I'd make a
single pg_proc entry that returns a struct of function pointers, as we
do for FDWs. But I think this has merit. One significant advantage
of it over (III) is that execution of a function in pg_proc can
trigger a library load without any extra pushups, which is nice.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2013-12-12 03:42:52 Re: pgbench with large scale factor
Previous Message Tom Lane 2013-12-12 03:08:44 Re: SSL: better default ciphersuite