Re: Logical Replication WIP

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Logical Replication WIP
Date: 2016-08-11 14:43:03
Message-ID: 17b4b439-74e3-8a00-ac14-d02ee3a761e6@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 11/08/16 13:34, Stas Kelvich wrote:
>
> * max_logical_replication_workers mentioned everywhere in docs, but guc.c defines
> variable called max_logical_replication_processes for postgresql.conf
>

Ah changed it in code but not in docs, will fix.

> * Since pg_subscription already shared across the cluster, it can be also handy to
> share pg_publications too and allow publication of tables from different databases. That
> is rare scenarios but quite important for virtual hosting use case — tons of small databases
> in a single postgres cluster.

You can't decode changes from multiple databases in one slot so I don't
see the usefulness there. The pg_subscription is currently shared
because it's technical necessity (as in I don't see how to solve the
need to access the catalog from launcher in any other way) not because I
think it's great design :)

>
> * There is no way to see attachecd tables/schemas to publication through \drp
>

That's mostly intentional as publications for table are visible in \d,
but I am not against adding it to \drp.

> * As far as I understand there is no way to add table/tablespace right in CREATE
> PUBLICATION and one need explicitly do ALTER PUBLICATION right after creation.
> May be add something like WITH TABLE/TABLESPACE to CREATE?
>

Yes, as I said to Masahiko Sawada, it's just not there yet but I plan to
have that.

> * So binary protocol goes into core. Is it still possible to use it as decoding plugin for
> manually created walsender? May be also include json as it was in pglogical? While
> i’m not arguing that it should be done, i’m interested about your opinion on that.
>

Well the plugin is bit more integrated into the publication infra so if
somebody would want to use it directly they'd have to use that part as
well. OTOH the protocol itself is provided as API so it's reusable by
other plugins if needed.

JSON plugin is something that would be nice to have in core as well, but
I don't think it's part of this patch.

> * Also I’ve noted that you got rid of reserved byte (flags) in protocol comparing to
> pglogical_native. It was very handy to use it for two phase tx decoding (0 — usual
> commit, 1 — prepare, 2 — commit prepared), because both prepare and commit
> prepared generates commit record in xlog.

Hmm maybe commit message could get it back. PGLogical has them sprinkled
all around the protocol which I don't really like so I want to limit
them to the places where they are actually useful.

>
>> On 05 Aug 2016, at 18:00, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
>>
>> - DDL, I see several approaches we could do here for 10.0. a) don't
>> deal with DDL at all yet, b) provide function which pushes the DDL
>> into replication queue and then executes on downstream (like
>> londiste, slony, pglogical do), c) capture the DDL query as text
>> and allow user defined function to be called with that DDL text on
>> the subscriber
>
> * Since here DDL is mostly ALTER / CREATE / DROP TABLE (or am I wrong?) may be
> we can add something like WITH SUBSCRIBERS to statements?
>

Not sure I follow. How does that help?

> * Talking about exact mechanism of DDL replication I like you variant b), but since we
> have transactional DDL, we can do two phase commit here. That will require two phase
> decoding and some logic about catching prepare responses through logical messages. If that
> approach sounds interesting i can describe proposal in more details and create a patch.
>

I'd think that such approach is somewhat more interesting with c)
honestly. The difference between b) and c) is mostly about explicit vs
implicit. I definitely would like to see the 2PC patch updated to work
with this. But maybe it's wise to wait a while until the core of the
patch stabilizes during the discussion.

> * Also I wasn’t able actually to run replication itself =) While regression tests passes, TAP
> tests and manual run stuck. pg_subscription_rel.substate never becomes ‘r’. I’ll investigate
> that more and write again.

Interesting, please keep me posted. It's possible for tables to stay in
's' state for some time if there is nothing happening on the server, but
that should not mean anything is stuck.

>
> * As far as I understand sync starts automatically on enabling publication. May be split that
> logic into a different command with some options? Like don’t sync at all for example.
>

I think SYNC should be option of subscription creation just like
INITIALLY ENABLED/DISABLED is. And then there should be interface to
resync a table manually (like pglogical has). Not yet sure how that
interface should look like in terms of DDL though.

> * When I’m trying to create subscription to non-existent publication, CREATE SUBSRITION
> creates replication slot and do not destroys it:
>
> # create subscription sub connection 'host=127.0.0.1 dbname=postgres' publication mypub;
> NOTICE: created replication slot "sub" on provider
> ERROR: could not receive list of replicated tables from the provider: ERROR: cache lookup failed for publication 0
> CONTEXT: slot "sub", output plugin "pgoutput", in the list_tables callback
>
> after that:
>
> postgres=# drop subscription sub;
> ERROR: subscription "sub" does not exist
> postgres=# create subscription sub connection 'host=127.0.0.1 dbname=postgres' publication pub;
> ERROR: could not crate replication slot "sub": ERROR: replication slot "sub" already exists
>

See the TODO in CreateSubscription function :)

> * Also can’t drop subscription:
>
> postgres=# \drs
> List of subscriptions
> Name | Database | Enabled | Publication | Conninfo
> ------+----------+---------+-------------+--------------------------------
> sub | postgres | t | {mypub} | host=127.0.0.1 dbname=postgres
> (1 row)
>
> postgres=# drop subscription sub;
> ERROR: unrecognized object class: 6102

Yes that has been already reported.

>
> * Several time i’ve run in a situation where provider's postmaster ignores Ctrl-C until subscribed
> node is switched off.
>

Hmm I guess there is bug in signal processing code somewhere.

> * Patch with small typos fixed attached.
>
> I’ll do more testing, just want to share what i have so far.
>

Thanks for both.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tobias Bussmann 2016-08-11 14:43:58 Re: extract text from XML
Previous Message Shay Rojansky 2016-08-11 14:18:17 Re: Slowness of extended protocol