Re: narwhal and PGDLLIMPORT

From: Noah Misch <noah(at)leadboat(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dave Page <dpage(at)pgadmin(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: narwhal and PGDLLIMPORT
Date: 2014-10-21 03:46:28
Message-ID: 20141021034628.GA282401@tornado.leadboat.com
Lists: pgsql-hackers

On Mon, Oct 20, 2014 at 10:24:47PM +0200, Andres Freund wrote:
> On 2014-10-20 01:03:31 -0400, Noah Misch wrote:
> > On Wed, Oct 15, 2014 at 12:53:03AM -0400, Noah Misch wrote:
> > I happened to try the same contrib/dblink test suite on PostgreSQL built with
> > modern MinGW-w64 (i686-4.9.1-release-win32-dwarf-rt_v3-rev1). That, too, gave
> > a crash-like symptom starting with commit 846e91e. Specifically, a backend
> > that LOADed any module linked to libpq (libpqwalreceiver, dblink,
> > postgres_fdw) would suffer this after calling exit(0):
> >
> > ===
> > 3056 2014-10-20 00:40:15.163 GMT LOG: disconnection: session time: 0:00:00.515 user=cyg_server database=template1 host=127.0.0.1 port=3936
> >
> > This application has requested the Runtime to terminate it in an unusual way.
> > Please contact the application's support team for more information.
> >
> > This application has requested the Runtime to terminate it in an unusual way.
> > Please contact the application's support team for more information.
> > 9300 2014-10-20 00:40:15.163 GMT LOG: server process (PID 3056) exited with exit code 3
> > ===
> >
> > The mechanism turned out to be disjoint from the mechanism behind the
> > ancient-compiler crash. Based on the functions called from exit(), my best
> > guess is that exit() encountered recursion and used something like an abort()
> > to escape.
>
> Hm.
>
> > (I can send the gdb transcript if anyone is curious to see the
> > gory details.)
>
> That would be interesting.

Attached. ("rep 100 s" calls a macro equivalent to issuing "s" 100 times.)

> > The proximate cause was commit 846e91e allowing modules to use
> > shared libgcc. A 32-bit libpq acquires 64-bit integer division from libgcc.
> > Passing -static-libgcc to the link restores the libgcc situation as it stood
> > before commit 846e91e. The main beneficiary of shared libgcc is C++/Java
> > exception handling, so PostgreSQL doesn't care. No doubt there's some deeper
> > bug in libgcc or in PostgreSQL; loading a module that links with shared libgcc
> > should not disrupt exit(). I'm content with this workaround.
>
> I'm unconvinced by this reasoning. Popular postgres extensions like
> postgis do use C++. It's imo not hard to imagine situations where
> switching to a statically linked libgcc could cause problems.

True; I was wrong to say that PostgreSQL doesn't care. MinGW builds of every
released PostgreSQL version use static libgcc. That changed as an unintended
consequence of a patch designed to remove our reliance on dlltool. Given the
lack of complaints about our historic use of static libgcc, I think it's fair
to revert to 9.3's use thereof. Supporting shared libgcc would be better
still, should someone make that effort.
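
To illustrate the libgcc dependency discussed above, here is a minimal sketch, not taken from libpq or from the attached trace. It assumes a 32-bit x86 target, where GCC normally compiles a signed 64-bit division into a call to the libgcc helper __divdi3 rather than inline instructions. The function name div64 is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical example function.  On a 32-bit x86 target, GCC typically
 * lowers the division below to a call to __divdi3, which lives in libgcc.
 * A module containing such code therefore needs libgcc at link time,
 * either as a shared library or, with -static-libgcc, copied into the
 * module itself.
 */
int64_t
div64(int64_t numerator, int64_t denominator)
{
    assert(denominator != 0);   /* division by zero is undefined in C */
    return numerator / denominator;
}
```

Linking a module built from this with something like "gcc -shared -static-libgcc -o mod.dll mod.c" (file names assumed) should leave it without a runtime dependency on a libgcc DLL; "objdump -p mod.dll" lists the DLLs a module imports and can be used to check.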

Attachment: libgcc-exit-trace.txt (text/plain, 4.0 KB)
