From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: fairywren failures |
Date: | 2019-10-03 16:17:52 |
Message-ID: | 20191003161752.ylp3ppdry2onhiua@alap3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2019-10-03 08:23:49 -0700, Andres Freund wrote:
> On 2019-10-03 08:18:42 -0700, Andres Freund wrote:
> > This is around where an error is thrown:
> > -- badly formatted interval
> > INSERT INTO INTERVAL_TBL (f1) VALUES ('badly formatted interval');
> > -ERROR: invalid input syntax for type interval: "badly formatted interval"
> > -LINE 1: INSERT INTO INTERVAL_TBL (f1) VALUES ('badly formatted inter...
> > - ^
> >
> > and the error is stack related. So I suspect that setjmp/longjmp might
> > be to blame here, and somehow don't save/restore the stack into a proper
> > state. I don't know enough about mingw/msys/windows to know whether that
> > uses a self-written setjmp or relies on the MS implementation.
> >
> > If you could gather a backtrace it might help us. It's possible that the
> > stack is "just" misaligned or something, we had problems with that
> > before (IIRC valgrind didn't always align stacks correctly for processes
> > that forked from within a signal handler, which then crashed when using
> > instructions with alignment requirements, but only sometimes, because
> > the stack could happen to be aligned).
>
> It seems we're not the only ones hitting this:
> https://rt.perl.org/Public/Bug/Display.html?id=133603
>
> Doesn't look like they've really narrowed it down that much yet.
A few notes:
* As an experiment, it could be worthwhile to try to redefine
sigsetjmp/longjmp/sigjmp_buf with what
https://gcc.gnu.org/onlinedocs/gcc/Nonlocal-Gotos.html
provides; it's apparently a separate implementation from the MS CRT one.
* Arguably
"Do not use longjmp to transfer control from a callback routine
invoked directly or indirectly by Windows code."
and
"Do not use longjmp to transfer control out of an interrupt-handling
routine unless the interrupt is caused by a floating-point
exception. In this case, a program may return from an interrupt
handler via longjmp if it first reinitializes the floating-point math
package by calling _fpreset."
from https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/longjmp?view=vs-2019
might be violated by our signal emulation on Windows. But I've
not looked into that in detail.
* Any chance you could get the pre-processed source for postgres.c or
such? I'm kinda wondering if the definition of setjmp() that we get
includes the returns_twice attribute that gcc wants to see, and
whether we're picking up the mingw version of longjmp, or the windows
one.
* It's certainly curious that the failures so far have only happened as
part of pg_upgradeCheck, rather than the plain regression tests.
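To make the first note more concrete, here is a rough sketch of what such a redefinition could look like using the GCC builtins from the Nonlocal-Gotos page. All of the exp_* names, raise_error(), run_guarded() and check_sketch() are made up for illustration and are not what postgres defines. The sketch assumes a five-pointer-word buffer, which is how I read the GCC docs, and it keeps the longjmp in a separate function from the setjmp, since the builtins require that. Untested on mingw.

```c
#include <assert.h>

/*
 * Hypothetical stand-ins for sigjmp_buf/sigsetjmp/siglongjmp, built on the
 * GCC builtins instead of the C runtime's setjmp/longjmp.
 * Assumption: the builtins want a buffer of five pointer-sized words.
 */
typedef void *exp_jmp_buf[5];

#define exp_setjmp(buf)   __builtin_setjmp(buf)
/* The second argument of __builtin_longjmp must be the constant 1. */
#define exp_longjmp(buf)  __builtin_longjmp((buf), 1)

static exp_jmp_buf exp_env;

/*
 * __builtin_longjmp may not be called from the function that called
 * __builtin_setjmp, so the "throw" lives in its own non-inlined function.
 */
__attribute__((noinline))
static void raise_error(void)
{
    exp_longjmp(exp_env);
}

/* Returns 1 if the "error" path was taken via the jump, 0 otherwise. */
static int run_guarded(int should_fail)
{
    if (exp_setjmp(exp_env) != 0)
        return 1;               /* arrived here through exp_longjmp */

    if (should_fail)
        raise_error();          /* does not return */

    return 0;
}

/* Minimal self-check of both paths. */
static void check_sketch(void)
{
    assert(run_guarded(0) == 0);
    assert(run_guarded(1) == 1);
}
```

Unlike sigsetjmp, these builtins don't save or restore the signal mask, so this would only be useful as an experiment to see whether the crashes go away.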
Greetings,
Andres Freund