Re: pg_rewind tap test unstable

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Christoph Berg <myon(at)debian(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_rewind tap test unstable
Date: 2015-09-07 04:37:21
Message-ID: CAB7nPqTn2-8ATDqcOyZA+zt0K6y5xE+WNS-4fo_VwZy_JSZ5dQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 7, 2015 at 1:16 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Tue, Aug 04, 2015 at 02:21:16PM +0900, Michael Paquier wrote:
>> >> On Tue, Jul 28, 2015 at 5:57 PM, Christoph Berg <myon(at)debian(dot)org> wrote:
>> >> > for something between 10% and 20% of the devel builds for apt.postgresql.org
>> >> > (which happen every 6h if there's a git change, so it happens every few days),
>> >> > I'm seeing this:
>
>> In test case 2, the failure is that the standby did not have time,
>> before promotion, to replicate the database that had been created on
>> the master. One possible explanation for this failure is
>> that the standby has been promoted before all the WAL needed for the
>> tests has been replayed, hence we had better be sure that the
>> replay_location of the standby matches pg_current_xlog_location()
>> before promotion.
>
>> Perhaps the attached patch helps?
>
> Thanks. In light of your diagnosis, I can reliably reproduce the failure by
> injecting a sleep into XLogSendPhysical(). Your patch fixes the problem, but
> it adds wal_receiver_status_interval (= 10s) stalls, doubling
> src/bin/pg_rewind/t/001_basic.pl runtime on a fast system. (The standby
> applies the final WAL quickly, then sleeps for wal_receiver_status_interval
> before notifying the master.)

Indeed, thanks for double-checking.

> The standby will apply any written, unapplied
> WAL during promotion. Therefore, I plan to commit the attached
> performance-neutral variant of your patch.

That explains the use of write_location. This looks fine to me. Thanks again.
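
To illustrate the wait condition discussed above, here is a minimal sketch in Python, not the actual Perl test code or the committed patch. It assumes the test can read the master's pg_current_xlog_location() and the standby's write_location from pg_stat_replication as text values of the form 'hi/lo' in hexadecimal. The helper names (parse_lsn, standby_has_written, poll_until) are hypothetical, and the feed of reported locations below is made up for the example.

```python
import time


def parse_lsn(text):
    """Convert an LSN such as '0/3000060' into an integer byte position."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def standby_has_written(master_lsn, write_location):
    """True once the standby reports having written WAL up to master_lsn.

    write_location is None while the standby has not reported anything yet.
    Comparing against write_location rather than replay_location is enough
    here because, as noted in the thread, the standby applies any written,
    unapplied WAL during promotion.
    """
    if write_location is None:
        return False
    return parse_lsn(write_location) >= parse_lsn(master_lsn)


def poll_until(check, attempts=30, delay=1.0, sleep=time.sleep):
    """Call check() until it returns True or the attempts run out."""
    for _ in range(attempts):
        if check():
            return True
        sleep(delay)
    return False


if __name__ == "__main__":
    # Hypothetical sequence of write_location values seen on the master.
    reports = iter([None, "0/3000000", "0/3000060"])
    caught_up = poll_until(
        lambda: standby_has_written("0/3000060", next(reports)),
        delay=0.0,
    )
    print(caught_up)  # prints True
```

In a real test the check would run a query against the master instead of reading from a canned list, and promotion would only be triggered once the poll succeeds.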
--
Michael
