Re: Anti-critical-section assertion failure in mcxt.c reached by walsender

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject:	Re: Anti-critical-section assertion failure in mcxt.c reached by walsender
Date:	2021-05-07 19:49:47
Message-ID:	20210507194947.etrgj7mpcv73mxef@alap3.anarazel.de
Lists:	pgsql-hackers

Hi,

On 2021-05-07 10:29:58 -0400, Tom Lane wrote:
> I wrote:
> > 1. No wonder we could not reproduce it anywhere else. I've warned
> > the cfarm admins that their machine may be having hardware issues.
>
> I heard back from the machine's admin. The time of the crash I observed
> matches exactly to these events in the kernel log:
>
> May 07 03:31:39 gcc202 kernel: dm-0: writeback error on inode 2148294407, offset 0, sector 159239256
> May 07 03:31:39 gcc202 kernel: sunvdc: vdc_tx_trigger() failure, err=-11
> May 07 03:31:39 gcc202 kernel: blk_update_request: I/O error, dev vdiskc, sector 157618896 op 0x1:(WRITE) flags 0x4800 phys_seg 16 prio class 0
>
> So it's not a mirage. The admin seems to think it might be a kernel
> bug though.

Isn't this a good reason to have at least some tests run with fsync=on?

It makes a ton of sense for buildfarm animals to disable fsync to
achieve acceptable performance. Having something in there that
nevertheless does some light exercise of the fsync code doesn't seem
bad?

Greetings,

Andres Freund

In response to

Re: Anti-critical-section assertion failure in mcxt.c reached by walsender at 2021-05-07 14:29:58 from Tom Lane

Responses

Re: Anti-critical-section assertion failure in mcxt.c reached by walsender at 2021-05-07 20:30:00 from Tom Lane
