Re: Finalizing logical replication limitations as well as potential features

From: Alvaro Hernandez <aht(at)ongres(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Finalizing logical replication limitations as well as potential features
Date: 2018-01-09 05:25:48
Message-ID: 2ae773b9-841f-a0b3-9b97-48e8c14eff00@ongres.com
Lists: pgsql-hackers

On 05/01/18 05:35, Joshua D. Drake wrote:
> On 01/04/2018 01:26 PM, Alvaro Herrera wrote:
>> Joshua D. Drake wrote:
>>
>>> We just queue/audit the changes as they happen and sync up the changes
>>> after the initial sync completes.
>> This already happens.  There is an initial sync, and there's logical
>> decoding that queues any changes that exist "after" the sync's snapshot.
>>
>> What you seem to want is to have multiple processes doing the initial
>> COPY in parallel -- each doing one fraction of the table.  Of course,
>> they would have to use the same snapshot.  That would make sense only
>> if the COPY itself is the bottleneck and not the network, or the I/O
>> speed of the origin server.  This doesn't sound a common scenario to me.
>
> Not quite but close. My thought process is that we don't want to sync
> within a single snapshot a 100-500mil row table (or worse). Unless I
> am missing something there, that has the potential to be a very long
> running transaction especially if we are syncing more than one relation.
>
> JD
>

    That's indeed the way it works: you need to hold the snapshot,
possibly for a long time. Not doing so seems very complex, even though
it's not impossible. Changes made after the initial sync starts are
definitely recorded (via logical decoding), so that's not an issue. But
if you don't keep a snapshot of the database, the copy will also see some
or all of these changes already applied to the tables mid-way. Making the
whole table copy consistent when changes can show up both mid-copy and in
the stream recorded by logical decoding is difficult and bug-prone.
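
    To make the problem concrete, here is a toy simulation in Python. It is only a sketch under my own assumptions, not how PostgreSQL implements anything: the table is a dict, the copy proceeds range by range, one concurrent change commits after each range, and the names copy_table, apply_change and replay are made up for the illustration.

```python
def apply_change(table, op, key, value=None):
    """Apply one row-level change, failing the way a keyed table would."""
    if op == "insert":
        if key in table:
            raise KeyError(f"duplicate key {key}")
        table[key] = value
    elif op == "update":
        if key not in table:
            raise KeyError(f"no row with key {key}")
        table[key] = value
    elif op == "delete":
        del table[key]  # raises KeyError if the row is missing


def copy_table(origin, ranges, concurrent_changes, use_snapshot):
    """Copy inclusive key ranges one at a time.

    After each range, one concurrent change commits on the origin and is
    appended to the change log (standing in for logical decoding).
    """
    # With a snapshot the copier reads a frozen view; without one it reads
    # the live table and therefore sees changes committed mid-copy.
    view = dict(origin) if use_snapshot else origin
    copied, log = {}, []
    pending = list(concurrent_changes)
    for low, high in ranges:
        copied.update({k: v for k, v in view.items() if low <= k <= high})
        if pending:
            op, key, value = pending.pop(0)
            apply_change(origin, op, key, value)
            log.append((op, key, value))
    return copied, log


def replay(copied, log):
    """Apply the queued changes on top of the copied rows."""
    for op, key, value in log:
        apply_change(copied, op, key, value)
    return copied
```

    With use_snapshot=True the replayed copy ends up identical to the origin. With use_snapshot=False, an insert that commits into a range not yet copied is both picked up by the copy and present in the log, so the replay hits a duplicate key. Handling that case correctly is the kind of extra logic I mean by "bug-prone".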

    Surprisingly, this is how MongoDB replication works, as they don't
have the equivalent of a snapshot facility. But they actually need to do
really weird stuff, like re-applying changes up to 3 (why?) times, and
comments in the source code point to strange hacks to make it all
consistent. I (want to) believe they got it right, but it is hacky and
complicated, and MongoDB doesn't support FKs and other features that I'm
sure would complicate matters even more.

    I'm not a PG hacker, but all this sounds too complicated to me. I'd
keep the snapshot open, which makes things very easy. If you want to do
parallel COPY inside it, that's fine (if, as the other Álvaro said, COPY
is the limiting factor).
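
    A rough sketch of what "parallel COPY inside one snapshot" could look like, written as a Python helper that only builds the SQL each worker would run. The helper name, the table and column names, and the snapshot id are hypothetical. It assumes an integer key column and does no quoting or escaping. pg_export_snapshot() and SET TRANSACTION SNAPSHOT are existing PostgreSQL features; as far as I understand, in real logical replication the snapshot would come from creating the replication slot rather than from pg_export_snapshot().

```python
def parallel_copy_plan(table, key_column, bounds, snapshot_id):
    """Return one list of SQL statements per worker.

    `bounds` is a list of half-open (low, high) integer key ranges.
    `snapshot_id` is the value a coordinator session obtained from
    SELECT pg_export_snapshot() inside a transaction that it keeps open
    until every worker has imported the snapshot.
    """
    plans = []
    for low, high in bounds:
        plans.append([
            # The snapshot must be imported before any query runs in the
            # transaction, and needs REPEATABLE READ or SERIALIZABLE.
            "BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ",
            f"SET TRANSACTION SNAPSHOT '{snapshot_id}'",
            f"COPY (SELECT * FROM {table} "
            f"WHERE {key_column} >= {low} AND {key_column} < {high}) "
            "TO STDOUT",
            "COMMIT",
        ])
    return plans


# Hypothetical usage: two workers, each copying half of the key space.
plans = parallel_copy_plan(
    "big_table", "id", [(0, 250_000_000), (250_000_000, 500_000_000)],
    "00000003-0000001B-1",
)
```

    Every worker sees exactly the same data because they all import the same snapshot, so the fractions add up to one consistent copy. The price is the one JD pointed out: the snapshot stays open for as long as the slowest worker runs.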

    Cheers,

    Álvaro

--

Alvaro Hernandez

-----------
OnGres
