Re: Incorrect handling of OOM in WAL replay leading to data loss

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, ethmertz(at)amazon(dot)com, nathandbossart(at)gmail(dot)com, pgsql(at)j-davis(dot)com, sawada(dot)mshk(at)gmail(dot)com
Subject: Re: Incorrect handling of OOM in WAL replay leading to data loss
Date: 2023-10-03 07:20:45
Message-ID: ZRvAzUi9WzdE2tcp@paquier.xyz

On Tue, Sep 26, 2023 at 03:48:07PM +0900, Michael Paquier wrote:
> By the way, anything that I am proposing here cannot be backpatched
> because of the infrastructure changes required in walreader.c, so I am
> going to create a second thread with something that could be
> backpatched (yeah, likely FATALs on OOM to stop recovery from doing
> something bad)..

The patch set is rebased as a consequence of 6b18b3fe2c2f, which switched
the OOMs in xlogreader.c to fail hard. The patch set has nothing
new, except that 0001 is now a revert of 6b18b3fe2c2f, switching
xlogreader.c back to soft errors on OOMs.

If there's no interest in this patch set after the next CF, I'm OK with
dropping it. The state of HEAD is at least correct in the OOM cases now.
--
Michael

Attachment                                                       Content-Type  Size
v5-0001-Revert-Fail-hard-on-out-of-memory-failures-in-xlo.patch  text/x-diff   4.0 KB
v5-0002-Add-infrastructure-to-report-error-codes-in-WAL-r.patch  text/x-diff   41.9 KB
v5-0003-Make-WAL-replay-more-robust-on-OOM-failures.patch        text/x-diff   4.5 KB
v5-0004-Tweak-to-force-OOM-behavior-when-replaying-record.patch  text/x-diff   2.4 KB
