Re: Anti-critical-section assertion failure in mcxt.c reached by walsender

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: Anti-critical-section assertion failure in mcxt.c reached by walsender
Date: 2021-05-07 03:50:48
Message-ID: 20210507035048.ocjxgk664lnryico@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2021-05-06 21:43:32 -0400, Tom Lane wrote:
> 2. We evidently need to put a bit more effort into this error
> reporting logic. More generally, I wonder how we could audit
> the code for similar hazards elsewhere, because I bet there are
> some. (Or ... could it be sane to run functions included in
> the ereport's arguments in ErrorContext?)

I have wondered about that before myself. It's pretty awkward to solve
these kinds of things at the caller level, and we have a perfectly good
context to do this in, that we know is going to be reset. However - we
don't reset ErrorContext for DEBUG messages, I believe. So there'd be a
noticeable increase in leaking into ErrorContext, unless we change how
we do that?

I guess I could see only switching to another memory context for >=
ERROR, but it does seem a bit odd. But for PANIC etc it's quite annoying
to lose the actual error message on the buildfarm.
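
To make the leak concern concrete, here is a rough toy model in C. It is
only a sketch and not PostgreSQL code: ToyContext, toy_psprintf,
toy_report and the TOY_* levels are all hypothetical names standing in
for a memory context, a formatting helper called from an ereport
argument list, and the elevel check. The assumption it illustrates is
the one above: argument functions allocate in a scratch error context,
and that context is only reset for levels >= ERROR.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for a memory context: a fixed-size bump arena.
 * Hypothetical; this is not PostgreSQL's MemoryContext API. */
typedef struct ToyContext
{
    char   buf[4096];
    size_t used;
} ToyContext;

/* Hypothetical severity levels, ordered like elevels. */
enum ToyLevel { TOY_DEBUG = 10, TOY_ERROR = 20, TOY_PANIC = 30 };

static void
toy_reset(ToyContext *cx)
{
    cx->used = 0;
}

/* Format a string into the arena; returns NULL when the arena is full. */
static char *
toy_psprintf(ToyContext *cx, const char *fmt, ...)
{
    va_list ap;
    size_t  avail = sizeof(cx->buf) - cx->used;
    char   *dst = cx->buf + cx->used;
    int     n;

    va_start(ap, fmt);
    n = vsnprintf(dst, avail, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t) n >= avail)
        return NULL;            /* would not fit, nothing consumed */
    cx->used += (size_t) n + 1; /* keep the terminating NUL */
    return dst;
}

/* Build the message detail in the scratch context, emit it, and reset
 * the context only for levels >= TOY_ERROR.  Returns the number of
 * bytes still held in the scratch context afterwards. */
static size_t
toy_report(ToyContext *errcx, enum ToyLevel level, int relid)
{
    char *detail = toy_psprintf(errcx, "relation %d", relid);

    fprintf(stderr, "level %d: %s\n", (int) level,
            detail ? detail : "(out of scratch space)");

    if (level >= TOY_ERROR)
        toy_reset(errcx);
    return errcx->used;
}
```

In this model, each toy_report() call at TOY_DEBUG leaves its formatted
string behind, so usage grows until something at TOY_ERROR or above
finally resets the arena. That growth is the accumulation worried about
above.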

> > Unfortunately there is no libbacktrace in that release, and for some
> > reason we don't see a core being analysed... (gdb not installed,
> > looking for wrong core file pattern, ...?)
>
> That I'm not sure about. gdb is certainly installed, and thorntail is
> visibly running the current buildfarm client and is configured with the
> correct core_file_glob, and I can report that the crash did leave a 'core'
> file in the data directory (so it's not a case of systemd commandeering
> the core dump). Seems like core-file collection should've worked
> ... unless maybe it's not covering TAP tests at all?

I suspect that is it - there's not really a good way for the buildfarm
client to even know where there could be data directories :(.

Greetings,

Andres Freund
