From: | Mats Kindahl <mats(at)timescale(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | daniel(at)yesql(dot)se, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: Crash during backend start when low on memory |
Date: | 2022-12-16 08:46:39 |
Message-ID: | CA+14425E_-_nfS=PNXm3Rp7Z5wBmCWVJmOU1Ha_zBZyRytuv5A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On Thu, Dec 15, 2022 at 6:20 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
wrote:
> On 2022-Dec-14, Mats Kindahl wrote:
>
> > PostmasterContext is set quite early in the procedure so it should not
> be a
> > problem to use it. I think we can use PostmasterContext rather than
> > TopMemoryContext. The memory for the allocations are passed down to
> > PostmasterMain() and used inside InitPostgres() but as far as I can tell,
> > they are not used after the InitPostgres() call. The PostmasterContex is
> > destroyed right after the call to InitPostgres() inside PostmasterMain()
> > and then the rest is allocated in other contexts.
>
> Well, this is what PostgresMain has to say about it:
>
> /*
> * If the PostmasterContext is still around, recycle the space; we
> don't
> * need it anymore after InitPostgres completes. Note this does
> not trash
> * *MyProcPort, because ConnCreate() allocated that space with
> malloc()
> * ... else we'd need to copy the Port data first. Also,
> subsidiary data
> * such as the username isn't lost either; see
> ProcessStartupPacket().
> */
> if (PostmasterContext)
> {
> MemoryContextDelete(PostmasterContext);
>
>
Ugh. Ok. Missed that comment.
> So I think blowing away part of struct Port independently of the
> containing struct is a bad idea. Why isn't more appropriate to do what
> ConnCreate does?
>
> if (!(port = (Port *) calloc(1, sizeof(Port))))
> {
> ereport(LOG,
> (errcode(ERRCODE_OUT_OF_MEMORY),
> errmsg("out of memory")));
> ExitPostmaster(1);
> }
>
> ... ugh. I have to admit that killing postmaster due to a transient
> out-of-memory makes me a bit nervous -- do note that this
> ExitPostmaster() happens before we've forked the backend, so this code
> does bring the server down (!!). If we want to make postmaster more
> robust in the face of OOM, we should also change this code somehow, yes?
>
Seems reasonable, yes. I'll see what I can throw together.
>
> I agree with Mats' premise when starting the thread -- even if the
> server is not in great shape, it doesn't seem like taking the whole
> service down is a great response. It would be better to just refuse the
> connection with some error and let existing connections chug along,
> poorly though they can. Maybe they'll also die soon, freeing memory;
> but that'll be no good if Postmaster is gone altogether.
>
> Now, it would be better if we could test this more thoroughly. Perhaps
> it's possible to use something like the mallocfail.c lib that Heikki
> wrote a few years ago, teaching it to handle calloc
> https://postgr.es/m/547480DE.4040408@vmware.com
Thanks. Will take a look at that.
>
> --
> Álvaro Herrera Breisgau, Deutschland —
> https://www.EnterpriseDB.com/
> "Oh, great altar of passive entertainment, bestow upon me thy discordant
> images
> at such speed as to render linear thought impossible" (Calvin a la TV)
>
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2022-12-16 14:49:09 | Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue) |
Previous Message | Peter Eisentraut | 2022-12-16 08:07:36 | Re: BUG #17720: pg_dump creates a dump with primary key that cannot be restored, when specifying 'using index ...' |