Re: fairywren failures

From: Andres Freund <andres(at)anarazel(dot)de>
To: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: fairywren failures
Date: 2019-10-03 15:23:49
Message-ID: 20191003152349.b4wmzz7qyy6tnmm2@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2019-10-03 08:18:42 -0700, Andres Freund wrote:
> On 2019-10-03 10:21:13 -0400, Andrew Dunstan wrote:
> > My new msys2 animal fairywren has had 3 recent failures when checking
> > pg_upgrade. The failures have been while running the regression tests,
> > specifically the interval test, and they all look like this:
> >
> >
> > 2019-10-03 05:36:00.373 UTC [24272:43] LOG: server process (PID 23756) was terminated by exception 0xC0000028
> > 2019-10-03 05:36:00.373 UTC [24272:44] DETAIL: Failed process was running: INSERT INTO INTERVAL_TBL (f1) VALUES ('badly formatted interval');
> >
> >
> > That error is "bad stack"
>
> > The failures have been on REL_12_STABLE (twice) and master (once).
> > However, they are not consistent (REL_12_STABLE is currently green).
> >
> >
> > The interval test itself hasn't changed for more than 2 years, and I
> > haven't found any obvious recent change that might cause the problem. I
> > guess it could be a compiler bug ... this is gcc 9.2.0, which is the
> > current release.
>
> This is around where an error is thrown:
> -- badly formatted interval
> INSERT INTO INTERVAL_TBL (f1) VALUES ('badly formatted interval');
> -ERROR: invalid input syntax for type interval: "badly formatted interval"
> -LINE 1: INSERT INTO INTERVAL_TBL (f1) VALUES ('badly formatted inter...
> - ^
>
> and the error is stack related. So I suspect that setjmp/longjmp might
> be to blame here, and somehow don't save/restore the stack into a proper
> state. I don't know enough about mingw/msys/windows to know whether that
> uses a self-written setjmp or relies on the MS implementation.
>
> If you could gather a backtrace it might help us. It's possible that the
> stack is "just" misaligned or something, we had problems with that
> before (IIRC valgrind didn't always align stacks correctly for processes
> that forked from within a signal handler, which then crashed when using
> instructions with alignment requirements, but only sometimes, because
> the stack could be aligned).
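
For anyone not familiar with the pattern under suspicion, here is a
minimal sketch of setjmp/longjmp-style error recovery. It is not the
server's actual code: error_recovery_point, parse_interval_days() and
try_parse() are made-up names for illustration only, and the "parser"
is just an sscanf call. The point is only that the error path leaves
the failing function via longjmp rather than a normal return, so the
saved stack state has to be restored correctly.

```c
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical stand-in for the saved error recovery point. */
static jmp_buf error_recovery_point;

/* Hypothetical parser: jumps back to the recovery point on bad input. */
static int parse_interval_days(const char *input)
{
    int days;

    if (sscanf(input, "%d days", &days) != 1)
        longjmp(error_recovery_point, 1);   /* unwind without returning */
    return days;
}

/* Returns 0 on success, -1 if the parser raised an error. */
static int try_parse(const char *input, int *result)
{
    if (setjmp(error_recovery_point) != 0)
        return -1;      /* we arrived here via longjmp */
    *result = parse_interval_days(input);
    return 0;
}
```

With this sketch, try_parse("3 days", &n) takes the normal return path,
while try_parse("badly formatted interval", &n) goes through longjmp,
which is the kind of path the failing INSERT exercises.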

It seems we're not the only ones hitting this:
https://rt.perl.org/Public/Bug/Display.html?id=133603

Doesn't look like they've really narrowed it down that much yet.

- Andres
