Re: Logical replication existing data copy

From: Erik Rijkers <er(at)xs4all(dot)nl>
To: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Logical replication existing data copy
Date: 2017-02-13 13:51:46
Message-ID: b03bc0134161272b6bc4577b96c59de2@xs4all.nl
Lists: pgsql-hackers

On 2017-02-11 11:16, Erik Rijkers wrote:
> On 2017-02-08 23:25, Petr Jelinek wrote:
>
>> 0001-Use-asynchronous-connect-API-in-libpqwalreceiver-v2.patch
>> 0002-Always-initialize-stringinfo-buffers-in-walsender-v2.patch
>> 0003-Fix-after-trigger-execution-in-logical-replication-v2.patch
>> 0004-Add-RENAME-support-for-PUBLICATIONs-and-SUBSCRIPTION-v2.patch
>> 0001-Logical-replication-support-for-initial-data-copy-v4.patch
>
> This often works but it also fails far too often (in my hands). I
> test whether the tables are identical by comparing an md5 from an
> ordered resultset, from both replica and master. I estimate that 1 in
> 5 tries fails; 'fail' being a somewhat different table on replica
> (compared to master), most often pgbench_accounts (typically there are
> 10-30 differing rows). No errors or warnings in either logfile. I'm
> not sure but I think testing on faster machines seems to do
> somewhat better ('better' being fewer replication errors).
>
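
The comparison described in the quoted text can be sketched roughly as
follows. This is only an illustration in Python, not the actual test
script: it assumes the rows were already fetched from each server with
an ORDER BY on the primary key, and the function name and sample rows
are hypothetical.

```python
import hashlib

def table_md5(rows):
    """Return an md5 hex digest over an already-ordered sequence of rows."""
    digest = hashlib.md5()
    for row in rows:
        # repr() of the tuple keeps NULL (None) distinct from an empty string.
        digest.update(repr(tuple(row)).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()

# Hypothetical sample data standing in for ordered query results
# from master and replica (for example rows of pgbench_accounts).
master_rows = [(1, 1, 0, ""), (2, 1, 150, ""), (3, 1, -20, "")]
replica_ok = [(1, 1, 0, ""), (2, 1, 150, ""), (3, 1, -20, "")]
replica_bad = [(1, 1, 0, ""), (2, 1, 100, ""), (3, 1, -20, "")]

tables_match = table_md5(master_rows) == table_md5(replica_ok)
tables_differ = table_md5(master_rows) != table_md5(replica_bad)
```

Hashing only works as a comparison if both sides use the same ordering
and the same row serialization; a single differing row changes the
digest, but the digest alone does not say which rows differ.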

I have noticed that when I insert a wait of a few seconds after the
create subscription (or actually: the 'enable'ing of the subscription)
the problem does not occur. Apparently (I assume) the initial snapshot
is taken at some point after the subsequent pgbench run has already
started, so that the logical replication also starts somewhere 'into'
that pgbench run. Does that make sense?
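
Instead of a fixed sleep, a test script could poll until the initial
copy is reported as finished before starting pgbench. A minimal sketch
in Python, under stated assumptions: the helper and the stand-in check
function are hypothetical, and the query in the comment refers to the
pg_subscription_rel catalog as it exists in released PostgreSQL 10 and
later; whether the patch version discussed here exposes the same state
is an assumption.

```python
import time

def wait_until(check, timeout=30.0, interval=0.5,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or timeout seconds elapse."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)

# Hypothetical stand-in for a query run on the subscriber, for example
#   SELECT count(*) = 0 FROM pg_subscription_rel WHERE srsubstate <> 'r'
# Here it simply reports "not ready" twice and then "ready".
_polls = iter([False, False, True])

def initial_sync_done():
    return next(_polls)

synced = wait_until(initial_sync_done, timeout=5.0, interval=0.0)
```

With a real check function in place of the stand-in, the pgbench run
would be started only once wait_until() returns True, which removes the
guess about how many seconds are enough.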

I don't know what to make of it. Now that I think I understand
what happens I hesitate to call it a bug. But I'd say it's still a
usability problem that the subscription is only 'valid' after some
time, even if it's only a few seconds.

(The other problem I mentioned, drop subscription hanging, still
happens every now and then.)
