From: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, hlinnaka <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Race condition in recovery? |
Date: | 2021-06-04 07:51:08 |
Message-ID: | CAFiTN-vJjo=i5OHi-PNQ7NwARAnVeqNvhKuco-cAi=apYD9Oxw@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Jun 4, 2021 at 2:03 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Thu, May 27, 2021 at 2:26 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > Changed as suggested.
>
> I don't think the code as written here is going to work on Windows,
> because your code doesn't duplicate enable_restoring's call to
> perl2host or its backslash-escaping logic. It would really be better
> if we could use enable_restoring directly. Also, I discovered that the
> 'return' in cp_history_files should really say 'exit', because
> otherwise it generates a complaint every time it's run. It should also
> have 'use strict' and 'use warnings' at the top.
Ok
> Here's a version of your test case patch with the 1-line code fix
> added, the above issues addressed, and a bunch of cosmetic tweaks.
> Unfortunately, it doesn't pass for me consistently. I'm not sure if
> that's because I broke something with my changes, or because the test
> contains an underlying race condition which we need to address.
> Attached also are the log files from a failed run if you want to look
> at them. The key lines seem to be:
I could not reproduce this, but I think I have found the issue: I used
the wrong target LSN in wait_for_catchup. Instead of waiting for the
standby's last "insert LSN", I was waiting for its last "replay LSN",
which was wrong. I changed it as shown below; the fix is in the attached patch.
diff --git a/src/test/recovery/t/025_stuck_on_old_timeline.pl b/src/test/recovery/t/025_stuck_on_old_timeline.pl
index 09eb3eb..ee7d78d 100644
--- a/src/test/recovery/t/025_stuck_on_old_timeline.pl
+++ b/src/test/recovery/t/025_stuck_on_old_timeline.pl
@@ -78,7 +78,7 @@ $node_standby->safe_psql('postgres', "CREATE TABLE tab_int AS SELECT 1 AS a");
 # Wait for the replication to catch up
 $node_standby->wait_for_catchup($node_cascade, 'replay',
- $node_standby->lsn('replay'));
+ $node_standby->lsn('insert'));
 # Check that cascading standby has the new content
 my $result =
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
v5-0001-Fix-corner-case-failure-of-new-standby-to-follow-.patch | text/x-patch | 5.9 KB |