|From:||Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>|
|To:||Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>|
|Cc:||Vaishnavi Prabakaran <vaishnaviprabakaran(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>|
|Subject:||Re: [HACKERS] Replication status in logical replication|
On Tue, Nov 14, 2017 at 6:46 AM, Thomas Munro wrote:
> On Tue, Sep 26, 2017 at 3:45 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> On Tue, Sep 26, 2017 at 10:36 AM, Vaishnavi Prabakaran
>> <vaishnaviprabakaran(at)gmail(dot)com> wrote:
>>> On Wed, Sep 13, 2017 at 9:59 AM, Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
>>>> I’m not entirely sure why this was flagged as "Waiting for Author" by the
>>>> automatic run, the patch applies for me and builds so resetting back to
>>> This patch applies and builds cleanly. I tested it with one publisher
>>> and one subscriber, and confirmed that the replication state after
>>> restarting the server is now "streaming" and not "Catchup".
>>> I don't find any issues with the code, and the patch looks ready for
>>> committer to me; I have marked it as such in the CF entry.
> Hi Sawada-san,
> My patch-testing robot doesn't like this patch. I just tried it on
> my laptop to double-check and get some more details, and saw the same
> (1) "make check" under src/test/recovery fails like this:
> t/006_logical_decoding.pl ............ 2/16 # Looks like your test
> exited with 29 just after 4.
> t/006_logical_decoding.pl ............ Dubious, test returned 29
> (wstat 7424, 0x1d00)
> Failed 12/16 subtests
> regress_log_006_logical_decoding says:
> ok 4 - got same expected output from pg_recvlogical decoding session
> pg_recvlogical timed out at
> /opt/local/lib/perl5/vendor_perl/5.24/IPC/Run.pm line 2918.
> waiting for endpos 0/1609B60 with stdout '', stderr '' at
> line 1700.
> ### Stopping node "master" using mode immediate
> # Running: pg_ctl -D
> -m immediate stop
> waiting for server to shut down.... done
> server stopped
> # No postmaster PID for node "master"
> # Looks like your test exited with 29 just after 4.
> (2) "make check" under src/test/subscription says:
> t/001_rep_changes.pl .. ok
> t/002_types.pl ........ #
> # Looks like your test exited with 60 before it could output anything.
> t/002_types.pl ........ Dubious, test returned 60 (wstat 15360, 0x3c00)
> Failed 3/3 subtests
> t/003_constraints.pl ..
> Each of those took several minutes, and I stopped it there. It may
> be going to say some more things but is taking a very long time
> (presumably timing out, but the 001 took ages and then succeeded...
> hmm). In fact I had to run this on my laptop to see that because on
> Travis CI the whole test job just gets killed after 10 minutes of
> non-output and the above output was never logged because of the way
> concurrent test jobs' output is buffered.
> I didn't try to figure out what is going wrong.
Thank you for the notification!
After investigating, I found that my previous patch went in the wrong
direction. I should have changed XLogSendLogical() so that it checks
the read LSN and sets WalSndCaughtUp = true even after reading a
record without waiting. The attached updated patch passes 'make check-world'.
Please review it.
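To illustrate the idea, here is a minimal toy model in C. It is not the attached patch and does not use the walsender's real data structures: ToySender, WalPos, and toy_send_one_record() are hypothetical names made up for this sketch. It only shows the shape of the check, assuming that "caught up" means the end of the last record read has reached the flushed WAL position.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit WAL position (LSN). */
typedef uint64_t WalPos;

/* Toy sender state; field names are illustrative only. */
typedef struct ToySender
{
    WalPos read_end;   /* end of the last record read */
    WalPos flush_pos;  /* cached flushed-WAL position */
    bool   caught_up;  /* plays the role of WalSndCaughtUp */
} ToySender;

/*
 * Model of one send step that read a record ending at record_end
 * without having to wait. current_flush is the server's latest
 * flushed position at the time of the call.
 */
static void
toy_send_one_record(ToySender *s, WalPos record_end, WalPos current_flush)
{
    s->read_end = record_end;

    /* Refresh the cached flush position only once the reader reaches it. */
    if (s->read_end >= s->flush_pos)
        s->flush_pos = current_flush;

    /* If the reader is still at or past the flush position, we caught up. */
    s->caught_up = (s->read_end >= s->flush_pos);
}
```

In this model, a sender that keeps finding records to read without ever waiting can still report that it has caught up, which is the behaviour the state column needs in order to switch to "streaming".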
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center