Re: orangutan seizes up during isolation-check

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <davec(at)postgresintl(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: orangutan seizes up during isolation-check
Date: 2014-09-15 07:11:57
Message-ID: 5416913D.6040600@vmware.com
Lists: pgsql-hackers

On 09/15/2014 07:51 AM, Noah Misch wrote:
> libintl replaces setlocale(). Its setlocale(LC_x, "") uses OS-specific APIs
> to determine the default locale when $LANG and similar environment variables
> are empty, as they are during "make check NO_LOCALE=1". On OS X, it calls[1]
> CFLocaleCopyCurrent(), which in turn spins up a thread. See the end of this
> message for the postmaster thread stacks active upon hitting a breakpoint set
> at _dispatch_mgr_thread.

Ugh. I'd call that a bug in libintl. setlocale() has no business making
the process multi-threaded.

Do we have the same problem in backends? At a quick glance, aside from
postmaster we only use PG_SETMASK(&BlockSig) in signal handlers, to
prevent another signal handler from running concurrently.

> I see two options for fixing this in pg_perm_setlocale(LC_x, ""):
>
> 1. Fork, call setlocale(LC_x, "") in the child, pass back the effective locale
> name through a pipe, and pass that name to setlocale() in the original
> process. The short-lived child will get the extra threads, and the
> postmaster will remain clean.
>
> 2. On OS X, check for relevant environment variables. Finding none, set
> LC_x=C before calling setlocale(LC_x, ""). A variation is to raise
> ereport(FATAL) if sufficient environment variables aren't in place. Either
> way ensures the libintl setlocale() will never call CFLocaleCopyCurrent().
> This is simpler than (1), but it entails a behavior change: "LANG= initdb"
> will use LANG=C or fail rather than use the OS X user account locale.
>
> I'm skeptical of the value of looking up locale information using other OS X
> facilities when the usual environment variables are inconclusive, but I see no
> clear cause to reverse that decision now. I lean toward (1).
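
For reference, option (1) could be sketched roughly as below. This is only
an illustration of the fork-and-pipe idea, not a proposed patch: the
function name default_locale_via_child() is made up, and error handling
(EINTR, long locale names) is kept minimal.

```c
#include <locale.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: resolve the default locale name for "category" in a
 * short-lived child, so that any threads started by setlocale(category, "")
 * die with the child. Returns 0 and fills buf on success, -1 on failure.
 */
int
default_locale_via_child(int category, char *buf, size_t buflen)
{
    int     fds[2];
    pid_t   pid;
    ssize_t n;
    size_t  total = 0;
    int     status;

    if (buflen == 0 || pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0)
    {
        /* Child: look up the effective locale name and report it. */
        const char *name;
        size_t      len;

        close(fds[0]);
        name = setlocale(category, "");
        if (name == NULL)
            _exit(1);
        len = strlen(name);
        if (write(fds[1], name, len) != (ssize_t) len)
            _exit(1);
        _exit(0);
    }

    /* Parent: read the name until EOF, then reap the child. */
    close(fds[1]);
    while (total < buflen - 1 &&
           (n = read(fds[0], buf + total, buflen - 1 - total)) > 0)
        total += (size_t) n;
    buf[total] = '\0';
    close(fds[0]);

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0 || total == 0)
        return -1;
    return 0;
}
```

The caller would then pass the returned name to setlocale() in the
original process instead of "".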

Both of those are horrible hacks. And who's to say that calling
setlocale(LC_x, "foo") won't also call some function that makes the
process multi-threaded? If not in any current OS X release, it might
still happen in a future one.

One idea would be to use an extra pthread mutex or similar, in addition
to PG_SETMASK(). Whenever you do PG_SETMASK(&BlockSig), also grab the
mutex, and release it when you do PG_SETMASK(&UnBlockSig).

It would be nice to stop doing non-trivial things in the signal handler
in the first place. It's pretty scary, even though it works when the
process is single-threaded. I believe the reason it's currently
implemented like that is the same set of problems that the latch code
solves with the self-pipe trick: select() is not interrupted by a signal
on all platforms, and even if it were, you would need pselect(), which is
not available (or does not work correctly even if it exists) on all
platforms. I think we could use a latch in postmaster too.

- Heikki
