Re: Incorrect handling of OOM in WAL replay leading to data loss

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: michael(at)paquier(dot)xyz
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, ethmertz(at)amazon(dot)com, nathandbossart(at)gmail(dot)com, pgsql(at)j-davis(dot)com, sawada(dot)mshk(at)gmail(dot)com
Subject: Re: Incorrect handling of OOM in WAL replay leading to data loss
Date: 2023-08-01 06:28:54
Message-ID: 20230801.152854.605125182959292988.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Tue, 1 Aug 2023 14:03:36 +0900, Michael Paquier <michael(at)paquier(dot)xyz> wrote in
> On Tue, Aug 01, 2023 at 01:51:13PM +0900, Kyotaro Horiguchi wrote:
> > I believe a database server is not supposed to be executed under such
> > a memory-constrained environment.
>
> I don't really follow this argument. The backend and the frontends
> are reliable on OOM, where we generate ERRORs or even FATALs depending
> on the code path involved. A memory bounded environment is something
> that can easily happen if one's not careful enough with the sizing of

I didn't mean that OOM should not happen. I was referring to an
environment where allocation failure can happen during crash recovery.
In any case, I didn't mean that we shouldn't "fix" it.

> the instance. For example, this error can be triggered on a standby
> with read-only queries that put pressure on the host's memory.

I thought that a failure on a standby results in continuing to retry
reading the next record. However, I found that there's a case where
the startup process stops in response to OOM [1].

> > One issue on changing that behavior is that there's not a simple way
> > to detect a broken record before loading it into memory. We might be
> > able to implement a fallback mechanism for example that loads the
> > record into an already-allocated buffer (which is smaller than the
> > specified length) just to verify if it's corrupted. However, I
> > question whether it's worth the additional complexity. And I'm not
> > sure what if the first allocation failed.
>
> Perhaps we could rely more on a fallback memory, especially if it is
> possible to use that for the header validation. That seems like a
> separate thing, still.

Once a record has been read, memory of that size has already been
allocated.

Even if we don't reach agreement on this, we could establish a default
behavior where an OOM during recovery immediately triggers an ERROR.
Then, we could introduce a *GUC* that causes recovery to regard OOM as
an end-of-recovery error.
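
To make the proposal above concrete, here is a minimal sketch in C. It
is only an illustration under stated assumptions: the enum names,
decide_recovery_action() and the oom_ends_recovery flag are
hypothetical and do not correspond to existing PostgreSQL code or to
any existing GUC.

```c
#include <stdbool.h>

/* Hypothetical outcome of trying to read the next WAL record. */
typedef enum
{
    READ_OK,        /* record read and validated */
    READ_INVALID,   /* record failed validation (e.g. end of WAL) */
    READ_OOM        /* buffer for the record could not be allocated */
} ReadResult;

/* Hypothetical action taken by recovery for a given outcome. */
typedef enum
{
    ACTION_REPLAY,        /* apply the record and continue */
    ACTION_END_RECOVERY,  /* treat this as the end of recovery */
    ACTION_ERROR          /* stop with an ERROR */
} RecoveryAction;

/*
 * Decide what recovery does next.
 *
 * oom_ends_recovery stands in for the proposed setting (assumed name):
 * false is the proposed default, where OOM raises an ERROR at once;
 * true makes OOM behave like an end-of-recovery condition.
 */
RecoveryAction
decide_recovery_action(ReadResult result, bool oom_ends_recovery)
{
    switch (result)
    {
        case READ_OK:
            return ACTION_REPLAY;
        case READ_INVALID:
            return ACTION_END_RECOVERY;
        case READ_OOM:
            return oom_ends_recovery ? ACTION_END_RECOVERY : ACTION_ERROR;
    }
    return ACTION_ERROR;    /* not reached for valid enum values */
}
```

With the default (false), decide_recovery_action(READ_OOM, false)
yields ACTION_ERROR, so replay does not silently stop short of valid
WAL. The end-of-recovery result for OOM is only returned when the
setting is explicitly enabled.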

regards.

[1] https://www.postgresql.org/message-id/17928-aa92416a70ff44a2%40postgresql.org

--
Kyotaro Horiguchi
NTT Open Source Software Center
