Re: Logical replication existing data copy

From: Erik Rijkers <er(at)xs4all(dot)nl>
To: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Logical replication existing data copy
Date: 2017-03-09 10:06:57
Message-ID: 93d02794068482f96d31b002e0eb248d@xs4all.nl
Lists: pgsql-hackers

On 2017-03-08 10:36, Petr Jelinek wrote:
> On 07/03/17 23:30, Erik Rijkers wrote:
>> On 2017-03-06 11:27, Petr Jelinek wrote:
>>
>>> 0001-Reserve-global-xmin-for-create-slot-snasphot-export.patch +
>>> 0002-Don-t-use-on-disk-snapshots-for-snapshot-export-in-l.patch+
>>> 0003-Prevent-snapshot-builder-xmin-from-going-backwards.patch +
>>> 0004-Fix-xl_running_xacts-usage-in-snapshot-builder.patch +
>>> 0005-Skip-unnecessary-snapshot-builds.patch +
>>> 0001-Logical-replication-support-for-initial-data-copy-v6.patch
>>
>> I use three different machines (2 desktop, 1 server) to test logical
>> replication, and all three have now at least once failed to correctly
>> synchronise a pgbench session (amidst many successful runs, of course)
>
> yes waldump would be useful, the last segment should be enough, but
> possibly all segments mentioned in the log.

I've added a pg_waldump call to the program. It runs when the md5s remain
the same (on both master and replica) for 5 wait-cycles, i.e. 5 * 5
seconds plus the time it takes to run the md5s (8x) and the
select count(*), which is more than a minute on this slow disk (at least
the first time).
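
The stall check described above can be sketched roughly as follows. This is only an illustration, not the code in pgbench_derail2.sh: the class name StallDetector and its observe() method are hypothetical, and it assumes that "5 wait-cycles" means five identical observations in a row.

```python
from collections import deque


class StallDetector:
    """Hypothetical helper: flags a stall once the same md5 observation
    has been seen `cycles` times in a row (assumed meaning of "wait-cycles")."""

    def __init__(self, cycles=5):
        self.cycles = cycles
        # Keep only the most recent `cycles` observations.
        self.recent = deque(maxlen=cycles)

    def observe(self, master_md5s, replica_md5s):
        """Record one wait-cycle; return True if nothing changed on either
        side during the last `cycles` observations."""
        self.recent.append((tuple(master_md5s), tuple(replica_md5s)))
        return len(self.recent) == self.cycles and len(set(self.recent)) == 1
```

A caller would sleep 5 seconds between observe() calls and run pg_waldump the first time it returns True.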

>
> The other useful thing would be to turn on log_connections and
> log_replication_commands.

done. (so output will be in the log files.)

> pg_subscription_rel, pg_replication_origin_status on subscriber and
> pg_replication_slots on publisher

done. (added to logs)

The attached bz2 contains
- an output file from pgbench_derail2.sh (also attached, as it changes
somewhat all the time)

- the pg_waldump output from both master (file with .1. in it) and
replica (.2.).

- the 2 logfiles.

File name:
logrep.20170309_1021.1.1043.scale_25.clients_64.NOK.log

20170309_1021 is the start-time of the script
1 is master (2 is replica)
1043 is the time, 10:43, just before the pg_waldump call
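
For anyone sorting through the files, the naming scheme above could be picked apart with something like the sketch below. The function name parse_name is hypothetical. The start time, node number and dump time follow the explanation above; reading scale_25 and clients_64 as the pgbench scale and client count, and NOK as a status flag, is an assumption.

```python
import re
from datetime import datetime

# Pattern for names like
# logrep.20170309_1021.1.1043.scale_25.clients_64.NOK.log
PATTERN = re.compile(
    r"^logrep\.(?P<start>\d{8}_\d{4})\.(?P<node>[12])\.(?P<dump>\d{4})"
    r"\.scale_(?P<scale>\d+)\.clients_(?P<clients>\d+)"
    r"\.(?P<status>\w+)\.log$"
)


def parse_name(name):
    """Split a logrep file name into its fields (hypothetical helper)."""
    m = PATTERN.match(name)
    if m is None:
        raise ValueError(f"unexpected file name: {name!r}")
    return {
        # start time of the script
        "start": datetime.strptime(m["start"], "%Y%m%d_%H%M"),
        # 1 is master, 2 is replica
        "node": "master" if m["node"] == "1" else "replica",
        # time just before the pg_waldump call
        "dump_time": datetime.strptime(m["dump"], "%H%M").time(),
        # assumed: pgbench scale factor and client count
        "scale": int(m["scale"]),
        "clients": int(m["clients"]),
        # assumed: result flag of the run
        "status": m["status"],
    }
```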

The tail of the logfiles and of the output file will be a little ahead
of the pg_waldump (not more than a few minutes) because I didn't stop
the script while gathering the files manually.

HTH! Let me know if more is needed (more wal, for instance)

thanks,

Erik Rijkers

PS
The attached script now contains some idiosyncrasies of my setup, so it
will probably complain here and there when run unaltered elsewhere.

Attachment                   Content-Type           Size
sent_20170309_1044.tar.bz2   application/x-bzip2    115.6 KB
pgbench_derail2.sh           text/x-shellscript     19.8 KB
