Re: pglogical_output - a general purpose logical decoding output plugin

From: Tomasz Rybak <tomasz(dot)rybak(at)post(dot)pl>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pglogical_output - a general purpose logical decoding output plugin
Date: 2016-01-21 22:13:37
Message-ID: 20160121221337.21436.15825.pgcf@coridan.postgresql.org
Lists: pgsql-hackers

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested

Documentation review (I haven't yet gone through the protocol documentation):

README.md

+ data stream. The output stream is designed to be compact and fast to decode,
+ and the plugin supports upstream filtering of data so that only the required
+ information is sent.

plugin supports upstream filtering of data through hooks so that ...

+ subset of that database may be selected for replication, currently based on
+ table and on replication origin. Filtering by a WHERE clause can be supported
+ easily in future.

Is this filtering by table and replication origin implemented? I haven't
noticed it in the source.

+ other daemon is required. It's accumulated using

The stream of changes is accumulated...

+ [the `CREATE_REPLICATION_SLOT ... LOGICAL ...` or `START_REPLICATION SLOT ... LOGICAL ...` commands](http://www.postgresql.org/docs/current/static/logicaldecoding-walsender.html) to start streaming changes. (It can also be used via
+ [SQL level functions](http://www.postgresql.org/docs/current/static/logicaldecoding-sql.html)
+ over a non-replication connection, but this is mainly for debugging purposes)

A replication slot can also be configured (causing the output plugin to be loaded) via [SQL level functions]...

+ The overall flow of client/server interaction is:

The overall flow of client/server interaction is as follows:

+ * Client issues `CREATE_REPLICATION_SLOT slotname LOGICAL 'pglogical'` if it's setting up for the first time

* Client issues `CREATE_REPLICATION_SLOT slotname LOGICAL 'pglogical'` to set up replication if it's connecting for the first time

+ Details are in the replication protocol docs.

Add a link to the file with the protocol documentation.

+ If your application creates its own slots on first use and hasn't previously
+ connected to this database on this system you'll need to create a replication
+ slot. This keeps track of the client's replay state even while it's disconnected.

If your application hasn't previously connected to this database on this system
it'll need to create and configure a replication slot, which keeps track of the
client's replay state even while it's disconnected.

+ `pglogical`'s output plugin now sends a continuous series of `CopyData`

As this is a separate plugin, use: `pglogical_output` plugin now sends...
(not only here but also in a few other places).

+ All hooks are called in their own memory context, which lasts for the duration

All hooks are called in a separate hook memory context, which lasts for the duration...

+ + switched to a longer lived memory context like TopMemoryContext. Memory allocated
+ + in the hook context will be automatically when the decoding session shuts down.

...will be automatically freed when the decoding...

+ DDL for global object changes must be synchronized via some external means.

Just:
Global object changes must be synchronized via some external means.

+ determine why an error occurs in a downstream, since you can examine a
+ json-ified representation of the xact. It's necessary to supply a minimal

since you can examine a transaction in JSON (rather than binary) format. It's necessary...

+ discard up to, as identifed by LSN (log sequence number). See

identified

+ Once you've peeked the stream and know the LSN you want to discard up to, you
+ can use `pg_logical_slot_peek_changes`, specifying an `upto_lsn`, to consume

Shouldn't it be `pg_logical_slot_get_changes`? get_changes consumes changes,
while peek_changes leaves them in the stream. The example further down in the
README also shows that we need to use get_changes.
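
To illustrate the distinction, here is a toy model in Python. It is not PostgreSQL code: the `ToySlot` class, its method names, and the integer LSNs are made up for illustration, and the "up to but not including" boundary is an assumption taken from the README wording quoted in the next hunk.

```python
class ToySlot:
    """Toy stand-in for a logical replication slot (hypothetical, not PostgreSQL)."""

    def __init__(self, changes):
        # changes: list of (lsn, payload) tuples, ordered by lsn
        self.pending = list(changes)

    def _before(self, upto_lsn):
        # Select pending changes strictly before upto_lsn (assumed boundary rule).
        if upto_lsn is None:
            return list(self.pending)
        return [c for c in self.pending if c[0] < upto_lsn]

    def peek_changes(self, upto_lsn=None):
        # Return the changes but leave them in the stream.
        return self._before(upto_lsn)

    def get_changes(self, upto_lsn=None):
        # Return the changes and discard them, so replay resumes after them.
        taken = self._before(upto_lsn)
        self.pending = self.pending[len(taken):]
        return taken
```

Peeking any number of times leaves `pending` unchanged; only `get_changes` moves the point at which replay resumes, which is why the README sentence about discarding changes should name get_changes.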

+ tp to but not including that point, i.e. that will be the
+ point at which replay resumes.

IMO it's better to start a new sentence:
that point. This will be the point at which replay resumes.

DESIGN.md:

+ attnos don't necessarily correspond. The column names might, and their ordering
+ might even be the same, but any column drop or column type change will result

The column names and their ordering might even be the same...

README.pglogical_output_plhooks:

+ Note that pglogical
+ must already be installed so that its headers can be found.

Note that pglogical_output must already...

+ Arguments are the oid of the affected relation, and the change type: 'I'nsert,
+ 'U'pdate or 'D'elete. There is no way to access the change data - columns changed,
+ new values, etc.

Is it true that there is no way to access the change data? You added passing
of the change to the C hooks; from looking at the code it appears to be true,
but I want to be sure.
