Re: Built-in plugin for logical decoding output

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Gregory Brail <gregbrail(at)google(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Built-in plugin for logical decoding output
Date: 2017-09-24 14:15:33
Message-ID: CAMsr+YE0qbhmw9f0auPkoySZNEU=vps-4gHjKrLqNSHFgYo5PA@mail.gmail.com
Lists: pgsql-hackers

On 23 September 2017 at 06:28, Gregory Brail <gregbrail(at)google(dot)com> wrote:

> Would the community support the development of another plugin that is
> distributed as part of "contrib" that addresses these issues?
>

Petr Jelinek and I tried just that with pglogical. Our submission was
knocked back with the complaint that there was no in-core user of the code,
and it couldn't be evaluated usefully without an in-core consumer/receiver.

It's possible we'd make more progress if we tried again now, since we could
probably write a test suite using the TAP test framework and a small
src/test/modules consumer. But now we'd probably instead get blocked with
the complaint that the output plugin used for logical replication should be
sufficient for any reasonable need. I anticipate that we'd have some
disagreements about what a reasonable need is, but ... *shrug*.

I personally think we _should_ have such a thing, and that it should be
separate to the logical replication plugin to allow us to evolve that
without worrying about out of core dependencies etc.

There's some common functionality that needs factoring out into the logical
decoding framework, like some sort of relation metadata cache, some concept
of "replication sets" or a set of tables to include/exclude, etc. Doing
that is non-trivial work, but it's unlikely that two plugins with similar
and overlapping implementations of such things would be accepted; in that
case I'd be firmly in the "no" camp too.
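
To make "common functionality" a bit more concrete, here is a minimal
sketch in Python rather than backend C. It is not PostgreSQL code, and every
name in it (RelationInfo, RelationCache, ReplicationSet, wants) is
hypothetical. It only illustrates the two pieces mentioned above: a relation
metadata cache, and a set of tables to include/exclude that any plugin could
consult before emitting a change.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set, Tuple


@dataclass(frozen=True)
class RelationInfo:
    """Metadata a plugin needs to describe a relation to its consumer."""
    oid: int
    schema: str
    name: str
    columns: Tuple[str, ...]


class RelationCache:
    """Caches relation metadata so it is looked up once, not per change."""

    def __init__(self, loader: Callable[[int], RelationInfo]):
        self._loader = loader   # hypothetical stand-in for a catalog lookup
        self._entries: Dict[int, RelationInfo] = {}
        self.loads = 0          # counts real lookups, for illustration only

    def get(self, oid: int) -> RelationInfo:
        if oid not in self._entries:
            self._entries[oid] = self._loader(oid)
            self.loads += 1
        return self._entries[oid]

    def invalidate(self, oid: int) -> None:
        """Drop an entry, e.g. after DDL changes the relation."""
        self._entries.pop(oid, None)


@dataclass
class ReplicationSet:
    """A named set of tables to include, with optional exclusions."""
    name: str
    include: Set[Tuple[str, str]] = field(default_factory=set)
    exclude: Set[Tuple[str, str]] = field(default_factory=set)

    def wants(self, rel: RelationInfo) -> bool:
        key = (rel.schema, rel.name)
        if key in self.exclude:
            return False
        # Assumption: an empty include set means "every table not excluded".
        return not self.include or key in self.include
```

The point of the sketch is that neither class knows anything about the wire
format, which is why it would be hard to justify two plugins each carrying
their own copy.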

Code in Pg has a cost, and we do have to justify that cost when we drop
things in contrib/. It's not a free slush pile. So a solid argument does
need to be made for why having this module living in github/whatever isn't
good enough.

> I'd be happy to submit a patch, or GitHub repo, or whatever works best as
> an example. (Also, although Transicator uses protobuf, I'm happy to have it
> output a simple binary format as well.)
>

PostgreSQL tends to be very, very conservative about dependencies and
favours (not-)invented-here rather heavily. Optional dependencies are
accepted sometimes when they can be neatly isolated to one portion of the
codebase and/or abstracted away, so it's not impossible you'd get
acceptance for something like protocol buffers. But there's pretty much
zero chance you'll get it as a hard dependency; you'll need a simple text
and binary protocol too.

At which point the question will arise, why aren't these 3 separate output
plugins? The text one, the binary one for in-core and the protobuf one to
be maintained out of core.

That's a pretty sensible question. The answer is that they'll all need to
share quite a bit of common infrastructure. But if that's infrastructure
all plugins need, shouldn't it be pushed "up" into the logical decoding
layer's supporting framework? Patches welcome for the next major release
cycle.

Thus, that's where I think you should actually start. Extract (and where
necessary generalize) key parts of your code that should be provided by
postgres itself, not implemented by each plugin. And submit it so all
plugins can share it and yours can be simpler. Eventually to the point
where output plugins are often simple format wrappers.
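
As a rough illustration of what "simple format wrappers" could mean, here
is a Python sketch. It is not the output plugin API: Change, format_text and
format_binary are made-up names, and the byte layout is invented purely for
the example. The assumption is that the framework has already resolved
relation metadata and filtering, so each plugin only serializes.

```python
import struct
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Change:
    """A decoded row change, as a shared framework might hand it to a plugin."""
    action: str              # "INSERT", "UPDATE" or "DELETE"
    table: str               # schema-qualified table name
    values: Dict[str, str]   # column name -> text representation


def format_text(change: Change) -> str:
    """Text wrapper: one human-readable line per change."""
    cols = " ".join(f"{k}={v}" for k, v in change.values.items())
    return f"{change.action} {change.table}: {cols}"


def format_binary(change: Change) -> bytes:
    """Binary wrapper: length-prefixed UTF-8 fields (layout invented here)."""
    def field(s: str) -> bytes:
        raw = s.encode("utf-8")
        # 4-byte big-endian length, then the bytes themselves.
        return struct.pack("!I", len(raw)) + raw

    out = field(change.action) + field(change.table)
    out += struct.pack("!H", len(change.values))  # 2-byte column count
    for k, v in change.values.items():
        out += field(k) + field(v)
    return out
```

A protobuf wrapper would be a third function of the same shape, which is
what would let it live out of core without duplicating anything else.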

You might want to look at

* pglogical's output plugin; and
* bottled-water

for ideas about things that would benefit from shared infrastructure, and
ways to generalize it. I will be very happy to help there as time permits.

> As a side note, doing this would also help making logical decoding a
> useful feature for customers of Amazon and Google's built-in Postgres
> hosting options.
>

Colour me totally unconvinced there. Either, or both, can simply bless
out-of-tree plugins as it is; after all, they can and do patch the core
server freely too.

It'd *help* encourage them both to pick the same plugin, but that's about
it. And only if the plugin could satisfy their various constraints about no
true superuser access, etc.

I guess I'm a bit frustrated, because *I tried this*, and where was anyone
from Google or Amazon then? But now there's a new home-invented plugin that
we should adopt, ignoring any of the existing ones. Why?

> https://github.com/apigee-labs/transicator/tree/master/pgoutput
>

No README?

Why did this need to be invented, rather than using an existing plugin?

I don't mind, I mean, it's great that you're using the plugin
infrastructure and using postgres. I'm just curious what bottled-water,
pglogical, etc. lacked. What made you go your own way?

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
