Re: Re: pglogical_output - a general purpose logical decoding output plugin

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Tomasz Rybak <tomasz(dot)rybak(at)post(dot)pl>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: pglogical_output - a general purpose logical decoding output plugin
Date: 2016-01-22 03:06:03
Message-ID: CAMsr+YHZH_rApcTZ36G27agS6igyYdW_aXw9PPLbh2JCEhGq9A@mail.gmail.com
Lists: pgsql-hackers

On 22 January 2016 at 06:13, Tomasz Rybak <tomasz(dot)rybak(at)post(dot)pl> wrote:

> + data stream. The output stream is designed to be compact and fast to
> decode,
> + and the plugin supports upstream filtering of data so that only the
> required
> + information is sent.
>
> plugin supports upstream filtering of data through hooks so that ...
>
>
Ok.

> + subset of that database may be selected for replication, currently based
> on
> + table and on replication origin. Filtering by a WHERE clause can be
> supported
> + easily in future.
>
> Is this filtering by table and replication origin implemented? I haven't
> noticed it in source.
>

That's what the hooks are for.

> + other daemon is required. It's accumulated using
>
> Stream of changes is accumulated...
>

Ok.

> + [the `CREATE_REPLICATION_SLOT ... LOGICAL ...` or `START_REPLICATION
> SLOT ... LOGICAL ...` commands](
> http://www.postgresql.org/docs/current/static/logicaldecoding-walsender.html)
> to start streaming changes. (It can also be used via
> + [SQL level functions](
> http://www.postgresql.org/docs/current/static/logicaldecoding-sql.html)
> + over a non-replication connection, but this is mainly for debugging
> purposes)
>
> Replication slot can also be configured (causing output plugin to be
> loaded) via [SQL level functions]...
>

Covered in the next section. Or at least it is in the SGML docs conversion
I'm still trying to finish off.
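For reference, the SQL-level path mentioned above looks roughly like this, a sketch using the standard logical decoding functions (the slot name is illustrative):

```sql
-- Create a slot bound to the pglogical_output plugin over a normal
-- (non-replication) connection; mainly useful for debugging.
SELECT * FROM pg_create_logical_replication_slot('demo_slot', 'pglogical_output');

-- Peek at pending changes without consuming them.
SELECT * FROM pg_logical_slot_peek_changes('demo_slot', NULL, NULL);
```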

> + * Client issues `CREATE_REPLICATION_SLOT slotname LOGICAL 'pglogical'`
> if it's setting up for the first time
>
> * Client issues `CREATE_REPLICATION_SLOT slotname LOGICAL 'pglogical'` to
> setup replication if it's connecting for the first time
>

I disagree. It's entirely possible to do your slot creation/setup manually
or via something else, re-use a slot first created by another node, etc.
Slot creation is part of client setup, not so much connection.
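That separation can be sketched like this, with the slot created ahead of time by something other than the connecting client (slot name is illustrative):

```sql
-- One-time setup, possibly run by an administrator or another node:
SELECT pg_create_logical_replication_slot('node_a_slot', 'pglogical_output');

-- Later, on a replication-protocol connection, the client only needs:
--   START_REPLICATION SLOT node_a_slot LOGICAL 0/0
```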

> + Details are in the replication protocol docs.
>
> Add link to file with protocol documentation.
>

Is done in SGML conversion.

> + If your application creates its own slots on first use and hasn't
> previously
> + connected to this database on this system you'll need to create a
> replication
> + slot. This keeps track of the client's replay state even while it's
> disconnected.
>
> If your application hasn't previously connected to this database on this
> system
> it'll need to create and configure replication slot which keeps track of
> the
> client's replay state even while it's disconnected.
>

As above, I don't quite agree.

> + `pglogical`'s output plugin now sends a continuous series of `CopyData`
>
> As this is separate plugin, use 'pglogical_output' plugin now sends...
> (not only here but also in few other places).
>

Thanks. Done.

> + All hooks are called in their own memory context, which lasts for the
> duration
>
> All hooks are called in separate hook memory context, which lasts for the
> duration...
>

I don't see the difference, but OK.

Simplified:

> All hooks are called in a memory context that lasts ...

> + + switched to a longer lived memory context like TopMemoryContext.
> Memory allocated
> + + in the hook context will be automatically when the decoding session
> shuts down.
>
> ...will be automatically freed when the decoding...
>

Fixed, thanks.

> DDL for global object changes must be synchronized via some external means.
>
> Just:
> Global object changes must be synchronized via some external means.
>

Agree, done.

> + determine why an error occurs in a downstream, since you can examine a
> + json-ified representation of the xact. It's necessary to supply a minimal
>
> since you can examine a transaction in json (and not binary) format. It's
> necessary
>

Ok, done.

> + Once you've peeked the stream and know the LSN you want to discard up
> to, you
> + can use `pg_logical_slot_peek_changes`, specifying an `upto_lsn`, to
> consume
>
> Shouldn't it be pg_logical_slot_get_changes? get_changes consumes changes,
> peek_changes leaves them in the stream. Especially as example below
> points that we need to use get_changes.

Yes. It should. Whoops. Thanks.
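The peek/get distinction in that workflow, as a sketch (slot name and LSN are illustrative):

```sql
-- peek_changes leaves changes in the stream, so it can be run repeatedly
-- to find the LSN to discard up to:
SELECT lsn, data FROM pg_logical_slot_peek_changes('demo_slot', NULL, NULL);

-- get_changes actually consumes everything up to the given upto_lsn:
SELECT * FROM pg_logical_slot_get_changes('demo_slot', '0/16B2B60', NULL);
```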

> DESIGN.md:
>
> + attnos don't necessarily correspond. The column names might, and their
> ordering
> + might even be the same, but any column drop or column type change will
> result
>
> The column names and their ordering might even be the same...
>

I disagree, that has a different meaning. It's also not really user-facing
docs so I'm not too worried about being quite as readable.
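The attno mismatch being described is easy to reproduce (table and column names are illustrative):

```sql
CREATE TABLE t (a int, b int, c int);
ALTER TABLE t DROP COLUMN b;
ALTER TABLE t ADD COLUMN b int;

-- 'b' now has attnum 4, not 2, even though the visible column names
-- and types look the same as before:
SELECT attname, attnum FROM pg_attribute
WHERE attrelid = 't'::regclass AND attnum > 0 AND NOT attisdropped;
```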

README.pglogical_output_plhooks:
>
> + Note that pglogical
> + must already be installed so that its headers can be found.
>
> Note that pglogical_output must already...
>

Thanks.

> + Arguments are the oid of the affected relation, and the change type:
> 'I'nsert,
> + 'U'pdate or 'D'elete. There is no way to access the change data -
> columns changed,
> + new values, etc.
>
> Is it true (no way to access change data)? You added passing change
> to C hooks; from looking at code it looks like it's true, but I want to be
> sure.

While the change data is now passed to the C hook, there's no attempt to
expose it via PL/PgSQL. So yeah, that's still true.
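A hypothetical PL/PgSQL row-filter hook illustrating that limitation (the function name is made up, and the exact hook signature is defined by pglogical_output_plhooks, not here):

```sql
-- Hypothetical row filter: it sees only the relation oid and the change
-- type ('I'nsert, 'U'pdate, 'D'elete), never the column data itself.
CREATE FUNCTION demo_row_filter(relid oid, action "char")
RETURNS boolean AS $$
BEGIN
    -- Replicate only inserts and updates on tables in schema 'public'.
    RETURN action IN ('I', 'U')
       AND (SELECT n.nspname
            FROM pg_class c
            JOIN pg_namespace n ON n.oid = c.relnamespace
            WHERE c.oid = relid) = 'public';
END;
$$ LANGUAGE plpgsql;
```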

Thanks again for this. Sorry it's taken so long to get the SGML docs
converted. Too many other things on.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
