Re: Crash by targetted recovery

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: masao(dot)fujii(at)oss(dot)nttdata(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Crash by targetted recovery
Date: 2020-03-10 05:59:00
Message-ID: 20200310.145900.21675815166709370.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Tue, 10 Mar 2020 10:50:52 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
> Pushed the v5-0001-Tidy-up-XLogSource-usage.patch!

Thanks!

> Regarding the remaining patch adding the regression test,

I didn't seriously intend it to be in the tree.

> +$result =
> + $node_standby->safe_psql('postgres', "SELECT pg_last_wal_replay_lsn()");
> +my ($seg, $off) = split('/', $result);
> +my $target = sprintf("$seg/%08X", (hex($off) / $segsize + 1) * $segsize);
>
> What happens if "off" part gets the upper limit and "seg" part needs
> to be incremented? What happens if pg_last_wal_replay_lsn() advances
> very much (e.g., because of autovacuum) beyond the segment boundary
> until the standby restarts? Of course, these situations very rarely
> happen, but I'd like to avoid adding such not stable test if possible.

First of all, what the test calls "seg" is actually the "fileno" (the
upper half of the LSN). Honestly, I don't think the test will ever
reach a fileno boundary, but I handled that case in the attached patch.
Since perl complains that integer arithmetic over 32 bits is
incompatible, the calculation takes a slightly odd shape to avoid it.
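
To illustrate the idea only (this is a sketch, not the code in the
attached patch), the carry into "fileno" can be handled by counting
segments instead of bytes, so no intermediate value exceeds 32 bits.
The helper name next_segment_boundary and the sample LSNs are made up
for this example, and a 16MB segment size is assumed:

```perl
use strict;
use warnings;

# Hypothetical helper: given an LSN string "fileno/offset" (both hex),
# return the LSN of the first byte of the next WAL segment.
sub next_segment_boundary
{
	my ($lsn, $segsize) = @_;
	my ($fileno_hex, $off_hex) = split('/', $lsn);
	my $fileno = hex($fileno_hex);
	my $off    = hex($off_hex);

	# Number of segments per fileno, computed without ever writing 2^32.
	my $segs_per_fileno = int(0xFFFFFFFF / $segsize) + 1;

	# Index of the next segment within this fileno.
	my $segno = int($off / $segsize) + 1;

	# Carry into fileno when the offset part would wrap around.
	if ($segno >= $segs_per_fileno)
	{
		$fileno++;
		$segno = 0;
	}

	return sprintf("%X/%08X", $fileno, $segno * $segsize);
}

my $segsize = 16 * 1024 * 1024;    # assumed default wal_segment_size
print next_segment_boundary('0/3000060',  $segsize), "\n";  # 0/04000000
print next_segment_boundary('0/FF000028', $segsize), "\n";  # 1/00000000
```

The second call shows the case asked about above: the offset part is in
the last segment of its fileno, so the result moves to the next fileno
with a zero offset.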

For the second point, which seems more likely to happen, I added a
VACUUM/pg_switch_wal() sequence and then wait for the standby to catch
up before doing the test.

Does it make sense?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v5-0001-TAP-test-for-a-crash-bug.patch text/x-patch 3.2 KB
