Logical Replication WIP

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Logical Replication WIP
Date: 2016-08-05 15:00:13
Message-ID: 37e19ad5-f667-2fe2-b95b-bba69c5b6c68@2ndquadrant.com
Lists: pgsql-hackers

Hi,

as promised, here is a WIP version of the logical replication patch.

This is by no means anywhere close to being committable, but it should
be enough for discussion on the approaches chosen. I do plan to give
this some more time before the September CF as well as during the CF
itself.

You've seen a preview of the ideas in the doc Simon posted [1]; not all
of them are implemented in this patch yet, though.

I'll start with the overview of the state of things.

What works:
- Replication of INSERT/UPDATE/DELETE operations on tables in
publication.
- Initial copy of data in publication.
- Automatic management of things like slots and origin tracking.
- Some psql support (\drp, \drs and additional info in \d for
tables); it's mainly missing ACLs, as those are not implemented yet
(see below), and tab completion.

What's missing:
- sequences, I'd like to have them in 10.0 but I don't have a good
way to implement it. PGLogical uses periodic syncing with some
buffer value but that's suboptimal. I would like to decode them
but that has proven to be complicated due to their sometimes
transactional, sometimes nontransactional nature, so I probably
won't have time to do it within 10.0 by myself.
- ACLs, I still expect to have it the way it's documented in the
logical replication docs, but currently the code just assumes
superuser/REPLICATION role. This can be probably discussed in the
design thread more [1].
- pg_dump, same as above, I want to have publications and membership
in those dumped unconditionally, and potentially also dump
subscription definitions if the user asks for it using a command-line
option. I don't think subscriptions should be dumped by default, as
automatically starting replication when somebody dumps and restores
the db goes against POLA.
- DDL, I see several approaches we could take here for 10.0: a) don't
deal with DDL at all yet, b) provide a function which pushes the DDL
into the replication queue and then executes it on the downstream
(like londiste, slony, pglogical do), c) capture the DDL query as
text and allow a user-defined function to be called with that DDL
text on the subscriber (that's what Oracle did with CDC).
- FDW support on downstream, currently only INSERTs should work
there but that should be easy to fix.
- Monitoring, I'd like to add some pg_stat_subscription view on the
downstream (the rest of monitoring is very similar to physical
streaming so that needs mostly docs).
- TRUNCATE, this is handled using triggers in BDR and pglogical but
I am not convinced that's the right way to do it in core as it
brings limitations (e.g. inability to use RESTART IDENTITY).
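
To illustrate why periodic sequence syncing with a buffer is suboptimal,
here is a minimal Python sketch. It is an assumption about the general
idea only, not pglogical's actual code; the function names are
hypothetical. At each sync the subscriber's copy of the sequence is set
ahead of the publisher's last value by a fixed buffer, which is only safe
as long as the publisher consumes fewer values than the buffer between
two syncs.

```python
def synced_value(publisher_last_value: int, buffer: int) -> int:
    """Value to set on the subscriber's copy of the sequence at sync time."""
    return publisher_last_value + buffer


def is_safe_after_failover(subscriber_value: int, publisher_last_value: int) -> bool:
    """True if the subscriber would not hand out values the publisher already used."""
    return subscriber_value >= publisher_last_value


# Hypothetical numbers: sync happens when the publisher is at 1000.
subscriber_value = synced_value(1000, buffer=500)

# Publisher advanced by 400 before the next sync: still safe.
print(is_safe_after_failover(subscriber_value, 1400))
# Publisher advanced by 600 before the next sync: the buffer was too small.
print(is_safe_after_failover(subscriber_value, 1600))
```

Decoding sequence changes from WAL would avoid having to guess a buffer
size, which is the alternative mentioned in the list above.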

The parts I am not overly happy with:
- The fact that the subscription handles slot creation/drop means we
do some automagic that might fail and the user might need to fix that
up manually. I am not saying this is necessarily a problem, as that's
how most of the publish/subscribe replication systems work, but I
wonder if there is a better way of doing this that I missed.
- The initial copy patch adds some interfaces for getting the table
list and data into the DecodingContext, and I wonder if that's a good
place for those or if we should instead create some TableSync API
that would load the plugin as well, have these two new interfaces,
and live in the tablesync module. One reason why I didn't do it is
that the interface would be almost the same and the plugin would then
have to do separate init for DecodingContext and TableSync.
- The initial copy uses the snapshot from slot creation in the
walsender. I currently just push it as active snapshot inside
snapbuilder which is probably not the right thing to do (tm). That
is mostly because I don't really know what the right thing is there.

About individual patches:
0001-Add-PUBLICATION-catalogs-and-DDL.patch: This patch defines a
Publication, which is basically the same thing as a replication set. It adds
database local catalog pg_publication which stores the publications and
DML filters, and pg_publication_rel catalog for storing membership of
relation in the publication. Adds the DDL, dependency handling and all
the necessary boilerplate around that including some basic regression
tests for the DDL.

0002-Add-SUBSCRIPTION-catalog-and-DDL.patch: Adds Subscriptions with
shared nailed (!) catalog pg_subscription which stores the individual
subscriptions for each database. The reason why this is nailed is that
it needs to be accessible without connection to database so that the
logical replication launcher can read it and start/stop workers as
necessary. This does not include regression tests as I am unsure how to
test this within the regression testing framework given that it is
supposed to start workers (those are added in later patches).

0003-Define-logical-replication-protocol-and-output-plugi.patch:
Adds the logical replication protocol (api and docs) and "standard"
output plugin for logical decoding that produces output based on that
protocol and the publication definitions.
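
As a rough illustration of how an output plugin could decide whether to
send a change based on publication definitions, here is a minimal Python
sketch. It is not code from the patch: the `Publication` class, its
field names and `should_publish` are hypothetical stand-ins for
publication membership and the per-publication DML filters described
above.

```python
from dataclasses import dataclass, field


@dataclass
class Publication:
    """Hypothetical in-memory view of one publication and its DML filters."""
    name: str
    tables: set = field(default_factory=set)   # membership of relations
    publish_insert: bool = True
    publish_update: bool = True
    publish_delete: bool = True

    def allows(self, action: str) -> bool:
        # Map the DML action name to the corresponding filter flag.
        return {
            "insert": self.publish_insert,
            "update": self.publish_update,
            "delete": self.publish_delete,
        }[action]


def should_publish(table: str, action: str, publications: list) -> bool:
    """A change is sent if any requested publication contains the table
    and has the given DML action enabled."""
    return any(table in p.tables and p.allows(action) for p in publications)


pubs = [
    Publication("all_dml", {"orders"}),
    Publication("insert_only", {"audit_log"},
                publish_update=False, publish_delete=False),
]
print(should_publish("orders", "delete", pubs))      # True
print(should_publish("audit_log", "delete", pubs))   # False
print(should_publish("unlisted", "insert", pubs))    # False
```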

0004-Make-libpqwalreceiver-reentrant.patch: Redesigns
libpqwalreceiver to be reusable outside of walreceiver by exporting
the API as a struct and an opaque connection handle. Also adds a couple
of additional functions for logical replication.

0005-Add-logical-replication-workers.patch: This patch adds the actual
logical replication workers that use all above to implement the data
change replication from publisher to subscriber. It adds two different
background workers. The first is the Launcher, which works like the autovacuum
launcher in that it gets the list of subscriptions and starts/stops the
apply workers for those subscriptions as needed. Apply workers connect
to the output plugin via streaming protocol and handle the actual data
replication. I exported the ExecUpdate/ExecInsert/ExecDelete functions
from nodeModifyTable to handle the actual database updates so that
things like triggers, etc are handled automatically without special
code. This also adds a couple of TAP tests that test basic replication
setup and also a wide variety of type support. Also the overview doc for
logical replication that Simon previously posted to the list is part
of this one.
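
The launcher's job of starting and stopping apply workers as needed can
be pictured as a set reconciliation. The following Python sketch is only
an illustration under that assumption, not the patch's implementation;
`reconcile` and the subscription names are hypothetical.

```python
def reconcile(wanted: set, running: set):
    """Compare subscriptions that should have an apply worker with the
    workers that are currently running.

    Returns (to_start, to_stop) as sorted lists of subscription names."""
    to_start = sorted(wanted - running)   # subscription exists, no worker yet
    to_stop = sorted(running - wanted)    # worker exists, subscription gone
    return to_start, to_stop


# One pass of the hypothetical launcher loop.
wanted = {"sub_a", "sub_b", "sub_c"}
running = {"sub_b", "sub_old"}
to_start, to_stop = reconcile(wanted, running)
print(to_start)   # ['sub_a', 'sub_c']
print(to_stop)    # ['sub_old']
```

Repeating this comparison periodically makes the launcher converge on
the catalog contents even if a worker exits unexpectedly.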

0006-Logical-replication-support-for-initial-data-copy.patch: PoC of
initial sync. It adds another mode into apply worker which just applies
updates for a single table and some handover logic for when the table is
considered synchronized and can be replicated normally. It also adds new
catalog pg_subscription_rel which keeps information about
synchronization status of individual tables. Note that tables added to
publications at later time are not yet synchronized, there is also no
resynchronization UI yet.
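
One way to picture the per-table synchronization status and the handover
is as a small state machine. The Python sketch below is an assumption
made for illustration: the state names, the `TableSync` class and the
LSN comparison are hypothetical and may not match what the patch stores
in pg_subscription_rel.

```python
from enum import Enum


class SyncState(Enum):
    INIT = "init"                # table known, nothing copied yet
    COPYING = "copying"          # initial data copy in progress
    CATCHING_UP = "catching_up"  # per-table worker applies changes after copy
    READY = "ready"              # handed over to the normal apply worker


class TableSync:
    """Hypothetical per-table synchronization status."""

    def __init__(self, table: str):
        self.table = table
        self.state = SyncState.INIT
        self.synced_lsn = 0

    def start_copy(self) -> None:
        self.state = SyncState.COPYING

    def finish_copy(self, snapshot_lsn: int) -> None:
        # The copy reflects the publisher's data as of snapshot_lsn.
        self.synced_lsn = snapshot_lsn
        self.state = SyncState.CATCHING_UP

    def advance(self, applied_lsn: int, apply_worker_lsn: int) -> None:
        # The per-table worker has applied this table's changes up to applied_lsn.
        self.synced_lsn = max(self.synced_lsn, applied_lsn)
        if self.synced_lsn >= apply_worker_lsn:
            self.state = SyncState.READY  # handover point reached


def main_apply_should_handle(sync: TableSync, change_lsn: int) -> bool:
    """The normal apply worker only applies changes for READY tables,
    and only those newer than what the sync already covered."""
    return sync.state is SyncState.READY and change_lsn > sync.synced_lsn


t = TableSync("orders")
t.start_copy()
t.finish_copy(snapshot_lsn=100)
t.advance(applied_lsn=200, apply_worker_lsn=200)
print(t.state.value, main_apply_should_handle(t, 201))
```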

On the upstream side it adds two new commands to the replication
protocol: one for getting the list of tables and one for streaming
existing table data. I described this part as suboptimal above, so I
won't repeat that here.

Feedback is welcome.

[1]
https://www.postgresql.org/message-id/flat/CANP8%2Bj%2BNMHP-yFvoG03tpb4_s7GdmnCriEEOJeKkXWmUu_%3D-HA%40mail(dot)gmail(dot)com#CANP8+j+NMHP-yFvoG03tpb4_s7GdmnCriEEOJeKkXWmUu_=-HA(at)mail(dot)gmail(dot)com

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment                                                       Content-Type         Size
0001-Add-PUBLICATION-catalogs-and-DDL.patch                      application/x-patch  90.9 KB
0002-Add-SUBSCRIPTION-catalog-and-DDL.patch                      application/x-patch  64.3 KB
0003-Define-logical-replication-protocol-and-output-plugi.patch  application/x-patch  53.3 KB
0004-Make-libpqwalreceiver-reentrant.patch                       application/x-patch  29.8 KB
0005-Add-logical-replication-workers.patch                       application/x-patch  115.5 KB
0006-Logical-replication-support-for-initial-data-copy.patch     application/x-patch  95.3 KB
