From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> |
Cc: | pgsql-committers(at)postgresql(dot)org |
Subject: | Re: pgsql: Follow TLI of last replayed record, not recovery target TLI, in |
Date: | 2012-12-20 23:23:04 |
Message-ID: | CAHGQGwE9LRNpZxo6m6Hfkc4Nyaw7p5xCRhsx7uKWvjdxSSkaTQ@mail.gmail.com |
Lists: | pgsql-committers |
On Thu, Dec 20, 2012 at 9:41 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)iki(dot)fi> wrote:
> Follow TLI of last replayed record, not recovery target TLI, in walsenders.
>
> Most of the time, the last replayed record comes from the recovery target
> timeline, but there is a corner case where it makes a difference. When
> the startup process scans for a new timeline, and decides to change recovery
> target timeline, there is a window where the recovery target TLI has already
> been bumped, but there are no WAL segments from the new timeline in pg_xlog
> yet. For example, if we have just replayed up to point 0/30002D8, on
> timeline 1, there is a WAL file called 000000010000000000000003 in pg_xlog
> that contains the WAL up to that point. When recovery switches recovery
> target timeline to 2, a walsender can immediately try to read WAL from
> 0/30002D8, from timeline 2, so it will try to open WAL file
> 000000020000000000000003. However, that doesn't exist yet - the startup
> process hasn't copied that file from the archive yet nor has the walreceiver
> streamed it yet, so walsender fails with error "requested WAL segment
> 000000020000000000000003 has already been removed". That's harmless, in that
> the standby will try to reconnect later and by that time the segment is
> already created, but error messages that should be ignored are not good.
>
> To fix that, have walsender track the TLI of the last replayed record,
> instead of the recovery target timeline. That way walsender will not try to
> read anything from timeline 2, until the WAL segment has been created and at
> least one record has been replayed from it. The recovery target timeline is
> now xlog.c's internal affair, it doesn't need to be exposed in shared memory
> anymore.
>
> This fixes the error reported by Thom Brown. depesz reported the same error
> message, but I'm not sure if this fixes his scenario.

You seem to have forgotten to remove the following line from xlog.h:

src/include/access/xlog.h:312:extern TimeLineID GetRecoveryTargetTLI(void);

Regards,
--
Fujii Masao
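
For readers following the file names in the quoted commit message, here is a minimal sketch of how a WAL segment file name can be derived from a timeline ID and a WAL position. It assumes the default 16 MB segment size, and `wal_file_name` is a hypothetical helper written for illustration, not a function from the PostgreSQL source tree. It only shows why the same position 0/30002D8 maps to 000000010000000000000003 on timeline 1 but to 000000020000000000000003 on timeline 2.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: default 16 MB WAL segments. */
#define WAL_SEGMENT_SIZE      UINT64_C(0x1000000)
/* Number of segments covered by one 4 GB "log id" (256 with 16 MB segments). */
#define SEGMENTS_PER_XLOG_ID  (UINT64_C(0x100000000) / WAL_SEGMENT_SIZE)

/*
 * Hypothetical helper (not PostgreSQL's API): format the 24-character WAL
 * file name for timeline `tli` and the position written as lsn_hi/lsn_lo.
 * `buf` must have room for at least 25 bytes.
 */
static void
wal_file_name(char *buf, size_t buflen,
              uint32_t tli, uint32_t lsn_hi, uint32_t lsn_lo)
{
    /* Combine the two halves of the position into one 64-bit byte offset. */
    uint64_t lsn = ((uint64_t) lsn_hi << 32) | lsn_lo;
    /* Segment number that contains this byte offset. */
    uint64_t segno = lsn / WAL_SEGMENT_SIZE;

    /* Three 8-digit hex fields: timeline, log id, segment within log id. */
    snprintf(buf, buflen, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32,
             tli,
             (uint32_t) (segno / SEGMENTS_PER_XLOG_ID),
             (uint32_t) (segno % SEGMENTS_PER_XLOG_ID));
}
```

Calling `wal_file_name(buf, sizeof buf, 2, 0x0, 0x30002D8)` yields the timeline 2 name that the walsender tried to open before that segment existed. Only the first field changes when the timeline changes, which is why a walsender that follows the recovery target TLI too early asks for a file that has not been created yet.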