From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Noah Misch <noah(at)leadboat(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net> |
Subject: | Re: Anti-critical-section assertion failure in mcxt.c reached by walsender |
Date: | 2021-05-08 02:18:14 |
Message-ID: | 161637.1620440294@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2021-05-07 17:14:18 -0700, Noah Misch wrote:
>> Having a flaky buildfarm member is bad news. I'll LD_PRELOAD the attached to
>> prevent fsync from reaching the kernel. Hopefully, that will make the
>> hardware-or-kernel trouble unreachable. (Changing 008_fsm_truncation.pl
>> wouldn't avoid this, because fsync=off doesn't affect syncs outside the
>> backend.)
> Not sure how reliable that is - there's other paths that could return an
> error, I think. If the root cause is the disk responding weirdly to
> write cache flushes, you could tell the kernel that the disk has no
> write cache (e.g. echo write through > /sys/block/sda/queue/write_cache).
I seriously doubt Noah has root on that machine.
More to the point, the admin told me it's a VM (or LDOM, whatever that is)
under a Solaris host, so there's no direct hardware access going on
anyway. He didn't say in so many words, but I suspect the reason he's
suspecting kernel bugs is that there's nothing going wrong so far as the
host OS is concerned.
regards, tom lane