Re: Crash during backend start when low on memory

From: Mats Kindahl <mats(at)timescale(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, daniel(at)yesql(dot)se, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Crash during backend start when low on memory
Date: 2023-02-28 20:39:01
Message-ID: CA+14424atYYrCh9rf+VYHWB_uAQoGr_JsrZjVvX2LnNerXpipQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Mon, Feb 6, 2023 at 7:41 PM Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hi,
>
> On 2023-02-06 08:37:17 +0100, Mats Kindahl wrote:
> > On Mon, Feb 6, 2023 at 12:18 AM Andres Freund <andres(at)anarazel(dot)de>
> wrote:
> > > >, but I cannot see how that can cause problems since it is "just"
> > > > a setjmp() and longjmp(). It allocates a structure on the stack that
> > > > contains the values of the registers and the signal set; it is big,
> but
> > > not
> > > > exceedingly so. If somebody could enlighten me about if there are any
> > > > problems with this, I would be very grateful.
> > >
> > > If you aren't careful you end up with PG_exception_stack in a backend
> > > still pointing to that PG_TRY. Which means that an error during backend
> > > startup would jump into postmaster code. Which would not be good.
> > >
> >
> > I did run into this case when "failing" some of the memory allocations
> > inside the backend with a debugger (see the PATCH.md).
> >
> > There are, however, already cases in the backend startup that can raise
> an
> > error (for example, the code that creates the MessageContext in
> > PostgresMain). So, either if one of these error cases is triggered, or if
> > you make a mistake and add an ereport(ERROR,...) at the wrong place
> during
> > startup, it could cause this problem. In other words, wouldn't adding a
> > PG_TRY() in the code spawning the backend *prevent* such a problem and
> > hence reduce the risk of a mistake in code causing the backend jumping
> into
> > postmaster code (of course assuming that the code is written correctly).
>
> It's fine to add ereport(ERROR)s inside the backend startup - elog.c knows
> how
> to deal with no error handlers being set up. We just promote the ERROR to a
> FATAL. That's not at all the same as having a PG_exception_stack set up at
> a
> point we don't want to return to.
>

Remember that backends are forked, so they inherit the stack that postmaster
> has set up.
>

>
> > > Particularly because this is a bugfix we need to make the fix more
> > > minimal than the patches proposed so far.
> > >
> >
> > I can remove the PG_TRY() and use ereport(LOG, ...) + ExitPostmaster if
> it
> > feels like a safer path for incorporating a bug fix, but note that there
> is
> > still a risk that the backend will tear down the postmaster. For example,
> > allocating memory for a memory context can fail, or throwing another
> error
> > inside the backend startup can also fail, so I think that there will be
> > more work to do after this.
>
> I can't really follow.
>

My apologies, I'll add my reasoning below.

However, since you wanted a patch not using PG_TRY for the bugfix I have
attached a patch that does not use it and instead uses memory context
functions and the MCXT_ALLOC_NO_OOM flag where applicable. I've tested it
manually similarly to the other patches and it passes "make check" and
"make installcheck". I have added comments to explain my reasoning for the
important allocations: what memory context it is allocated in and why.

My reasoning was that using a PG_TRY around the call to BackendStartup()
would prevent any errors generated now and in future code from killing the
postmaster, so it would be a more safe alternative. The difference in size
is not that big between the patches, but this patch does not use exception
handling.

Best wishes,
Mats Kindahl

> Greetings,
>
> Andres Freund
>

Attachment Content-Type Size
0001-Don-t-crash-on-out-of-memory-when-starting-backend.patch text/x-patch 4.0 KB

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Keyerror Smart 2023-03-01 02:01:29 Re: BUG #17812: LOCK TABLE IN ACCESS EXCLUSIVE MODE with a view returns an empty tuple set
Previous Message Tom Lane 2023-02-28 19:10:07 Re: Clause accidentally pushed down ( Possible bug in Making Vars outer-join aware)