Re: Logical replication existing data copy

From: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Logical replication existing data copy
Date: 2017-03-10 04:51:59
Message-ID: 7a53e28a-1f54-2e84-fe0d-427d547abd4f@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09/03/17 18:46, Peter Eisentraut wrote:
> On 3/6/17 05:27, Petr Jelinek wrote:
>> updated and rebased version of the patch attached.
>
> Some comments on this patch:
>
> Can we have a better explanation of different snapshot options in
> CREATE_REPLICATION_SLOT. We use all these variants in different
> places, but it's not fully documented why. Think about interested
> users reading this.
>

I am not quite sure what/where do you want me to expand the docs, the
protocol doc explains what the options do, are you missing reasoning for
why we use them in the calling code?

>
> errmsg("subscription table %u in subscription %u does not exist",
>
> Use names instead of IDs if possible.
>

Well, this message is more or less equal to cache lookup failure,
problem is that neither relation nor subscription are guaranteed to
exist if this fails. I can write code that spits two different messages
depending of they do, not quite sure if it's worth it there though.

>
> + libpqrcv_table_list,
> + libpqrcv_table_info,
> + libpqrcv_table_copy,
>
> I don't think these functions belong into the WAL receiver API, since
> they are specific to this particular logical replication
> implementation. I would just make an API function libpqrc_exec_sql()
> that runs a query, and then put the table_*() functions as wrappers
> around that into tablesync.c.
>

Yeah I was wondering about it as well. The reason why it's in
libpqwalreciver is that if we create some libpqrc_exec_sql as you say,
we'll need code that converts the PQresult to something consumable by
backend that has no access to libpq API and that seemed like quite a bit
of boilerplate work. I really don't want to write another libpq-like or
SPI-like interface for this. Maybe it would be acceptable to return
result as tuplestore?

>
> Not sure what the worker init stuff in ApplyLauncherShmemInit() is
> doing. Could be commented more.
>

Eh, I copy pasted comment there from different place, will fix.

>
> max_sync_workers_per_subscription could be PGC_SIGHUP, I think. And
> the minimum could be 0, to stop any syncing?
>
>
> pg_subscription_rel.h: I'm not sure under which circumstances we need
> to use BKI_ROWTYPE_OID.
>
>
> Does pg_subscription_rel need an OID column? The index
> SubscriptionRelOidIndexId is not used anywhere.
>

Ah, leftover from the time it used dependencies for tracking.

>
> You removed this command from the tests:
>
> ALTER SUBSCRIPTION testsub SET PUBLICATION testpub, testpub1;
>
> I suppose because it causes a connection. We should have an option to
> prevent that (NOCONNECT/NOREFRESH?).

NOREFRESH makes more sense I guess.

>
> Why was the test 'check replication origin was dropped on subscriber'
> removed?
>

I don't know what you mean?

>
> Attached also a small patch with some stylistic changes.
>

Thanks.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Petr Jelinek 2017-03-10 04:52:39 Re: Logical replication existing data copy
Previous Message Tom Lane 2017-03-10 04:43:23 Re: Upgrading postmaster's log messages about bind/listen errors