Re: [bug fix] Cascaded standby cannot start after a clean shutdown

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
Cc: Postgres hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] Cascaded standby cannot start after a clean shutdown
Date: 2018-03-14 05:27:53
Message-ID: 20180314052753.GA16179@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Tue, Feb 27, 2018 at 05:15:29AM +0000, Tsunakawa, Takayuki wrote:
> That was my first thought, and I gave it up. As you say,
> XLogReadRecord() could allocate up to 1 GB of memory for a garbage.
> That allocation can fail due to memory shortage, which prevents the
> recovery from proceeding.

Even with that, the resulting patch is sort of ugly... So it seems to
me that the conclusion to this thread is that there is no clear winner,
and that the problem is so unlikely to happen that it is not worth the
performance impact to zero all the WAL pages fetched. Still, the
attached has the advantage to not cause the startup process to fail
suddendly because of the allocation failure but to fail afterwards when
validating the record's contents, which has the disadvantage to allocate
temporarily up to 1GB of memory for some of the garbage, but that
allocation is short-lived. So that would switch the failure from a
FATAL allocation failure taking down Postgres to something which will
fetching of new WAL records to be tried again.

All in all, that would be a win for the case reported on this thread.

Thoughts from anybody?
--
Michael

Attachment Content-Type Size
xlogreader-record-len-check.patch text/x-diff 1.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Claudio Freire 2018-03-14 05:28:54 Re: Faster inserts with mostly-monotonically increasing values
Previous Message Tatsuro Yamada 2018-03-14 04:55:18 Re: planner bug regarding lateral and subquery?