Re: Another reason why the recovery tests take a long time

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Another reason why the recovery tests take a long time
Date: 2017-06-26 18:06:56
Message-ID: 31624.1498500416@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> So this looks like a pretty obvious race condition in the postmaster,
> which should be resolved by having it set a flag on receipt of
> PMSIGNAL_START_WALRECEIVER that's cleared only when it does start a
> new walreceiver.

Concretely, I propose the attached patch. Together with reducing
wal_retrieve_retry_interval to 500ms, which I propose having
PostgresNode::init do in its standard postgresql.conf adjustments,
this takes the runtime of the recovery TAP tests down from 2m50s
(after the patches I posted yesterday) to 1m30s.
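
A minimal sketch of the flag idea described above, not the attached patch itself. It assumes the request gets lost because the signal arrives while the previous walreceiver child has not yet been reaped; that detail, and every name below (walreceiver_requested, maybe_start_walreceiver, simulate_restart_race, and so on), is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for postmaster state; not the real postmaster.c. */
static bool walreceiver_requested = false; /* set on signal, cleared only on launch */
static bool walreceiver_running = false;   /* is a walreceiver child still alive? */
static int  walreceiver_launches = 0;      /* launches counted by this simulation */

/* Launch a walreceiver if one was requested and none is running. */
static void maybe_start_walreceiver(void)
{
    if (walreceiver_requested && !walreceiver_running)
    {
        walreceiver_running = true;
        walreceiver_launches++;
        /* The flag is cleared only here, once a launch actually happens. */
        walreceiver_requested = false;
    }
}

/* Stand-in for receipt of the start-walreceiver signal. */
static void on_start_walreceiver_signal(void)
{
    walreceiver_requested = true;   /* remember the request ... */
    maybe_start_walreceiver();      /* ... even if it cannot be served yet */
}

/* Stand-in for the postmaster reaping an exited walreceiver child. */
static void on_walreceiver_exit(void)
{
    walreceiver_running = false;
    maybe_start_walreceiver();      /* serve any request that was pending */
}

/* Replay the assumed race; returns how many walreceivers were launched. */
int simulate_restart_race(void)
{
    walreceiver_requested = false;
    walreceiver_running = true;     /* old walreceiver not yet reaped */
    walreceiver_launches = 0;

    on_start_walreceiver_signal();  /* request arrives too early */
    assert(walreceiver_launches == 0);
    assert(walreceiver_requested);  /* but it is not forgotten */

    on_walreceiver_exit();          /* old child reaped: launch happens now */
    return walreceiver_launches;
}
```

Without the persistent flag, the early request in this scenario would simply be dropped, and the requester would have to wait out its retry interval before asking again.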

I think there's still gold to be mined, because "top" is still
showing pretty low CPU load over most of the run, but this is
lots better than 4m30s.

regards, tom lane

Attachment Content-Type Size
walreceiver-restart-fix.patch text/x-diff 3.5 KB
