Re: Logical replication existing data copy

From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Logical replication existing data copy
Date: 2017-03-09 17:46:45
Message-ID: f1724d0e-5f00-2f6d-5535-69512b79e133@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/6/17 05:27, Petr Jelinek wrote:
> updated and rebased version of the patch attached.

Some comments on this patch:

Can we have a better explanation of different snapshot options in
CREATE_REPLICATION_SLOT. We use all these variants in different
places, but it's not fully documented why. Think about interested
users reading this.

errmsg("subscription table %u in subscription %u does not exist",

Use names instead of IDs if possible.

+ libpqrcv_table_list,
+ libpqrcv_table_info,
+ libpqrcv_table_copy,

I don't think these functions belong into the WAL receiver API, since
they are specific to this particular logical replication
implementation. I would just make an API function libpqrc_exec_sql()
that runs a query, and then put the table_*() functions as wrappers
around that into tablesync.c.

Not sure what the worker init stuff in ApplyLauncherShmemInit() is
doing. Could be commented more.

There are a lot of places that do one of

MyLogicalRepWorker->relid == InvalidOid
OidIsValid(MyLogicalRepWorker->relid)

To check whether the current worker is a sync worker. Wrap that into
a function.

Not a fan of adding CommandCounterIncrement() calls at the end of
functions. In some cases, they are not necessary at all. In cases
where they are, the CCI call should be at a higher level between two
function calls with a comment for why the call below needs to see the
changes from the call above.

The index name pg_subscription_rel_map_index/SubscriptionRelMapIndexId
doesn't seem to match existing style, since there is no "map" column.
How about pg_subscription_rel_rel_sub_index? I see we have a
similarly named index for publications.

max_sync_workers_per_subscription could be PGC_SIGHUP, I think. And
the minimum could be 0, to stop any syncing?

pg_subscription_rel.h: I'm not sure under which circumstances we need
to use BKI_ROWTYPE_OID.

Does pg_subscription_rel need an OID column? The index
SubscriptionRelOidIndexId is not used anywhere.

You removed this command from the tests:

ALTER SUBSCRIPTION testsub SET PUBLICATION testpub, testpub1;

I suppose because it causes a connection. We should have an option to
prevent that (NOCONNECT/NOREFRESH?).

Why was the test 'check replication origin was dropped on subscriber'
removed?

Attached also a small patch with some stylistic changes.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
0001-fixup-Logical-replication-support-for-initial-data-c.patch application/x-patch 18.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2017-03-09 17:48:05 Re: Logical replication existing data copy
Previous Message Robert Haas 2017-03-09 17:45:43 Re: Write Ahead Logging for Hash Indexes