Re: Solaris testers wanted for strxfrm() behavior

From: Noah Misch <noah(at)leadboat(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>
Subject: Re: Solaris testers wanted for strxfrm() behavior
Date: 2015-06-30 05:57:41
Message-ID: 20150630055741.GA774647@tornado.leadboat.com
Lists: pgsql-hackers

On Mon, Jun 29, 2015 at 11:52:26AM +1200, Thomas Munro wrote:
> On Mon, Jun 29, 2015 at 10:57 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> writes:
> >> Just by the way, I wonder if this was that bug:
> >> https://illumos.org/issues/1594
> >
> > Oooh. Might or might not be *same* bug, but it sure looks like it could
> > have the right symptom. If this is indeed inherited from old Solaris,
> > I'm afraid we are totally fooling ourselves if we guess that it's no
> > longer present in the wild.

Very interesting. Looks like the illumos strxfrm() came from FreeBSD, not
from Solaris; illumos introduced their bug independently:

https://illumos.org/issues/2
https://github.com/illumos/illumos-gate/commits/master/usr/src/lib/libc/port/locale/collate.c

> Also, here is an interesting patch that went into the Apache C++
> standard library. Maybe the problem was limited to amd64 system...
>
> https://github.com/illumos/illumos-userland/blob/master/components/stdcxx/patches/047-collate.cpp.patch

That's a useful data point. Based on Oskari Saarenmaa's report, newer Solaris
10 is not affected. The fix presumably showed up after the 05/08 release and
no later than the 01/13 release.

On Sun, Jun 28, 2015 at 07:00:14PM -0400, Tom Lane wrote:
> > On Sun, Jun 28, 2015 at 12:58 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> >> My perspective is that if both SmartOS and OmniOS pass, it's not our
> >> responsibility to support OldSolaris if they won't update libraries.

> Another idea would be to make a test during postmaster start to see
> if this bug exists, and fail if so. I'm generally on board with the
> thought that we don't need to work on systems with such a bad bug,
> but it would be a good thing if the failure was clean and produced
> a helpful error message, rather than looking like a Postgres bug.

Failing cleanly on unpatched Solaris is adequate, agreed. A check at
postmaster start isn't enough, because the postmaster might run in the C
locale while individual databases or collations use problem locales. The
safest thing is to test after every setlocale(LC_COLLATE) and
newlocale(LC_COLLATE). That means one check at backend start plus one per
backend per collation used, which is more frequent than I would like. Hmm.
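
To make the idea concrete, here is a rough sketch of what a check run after
setlocale(LC_COLLATE) might look like. It is only an illustration, not a
proposed patch: the helper names strxfrm_respects_limit() and
set_collate_checked(), the 64-byte probe buffer, the canary value, and the
probe strings and limits are all assumptions made for this example. The
technique is a plain canary test: fill a buffer with a known byte, call
strxfrm() with a small limit, and verify that nothing at or past the limit
was modified.

```c
#include <locale.h>
#include <stdbool.h>
#include <string.h>

#define CANARY 0x7F				/* arbitrary sentinel byte (assumption) */

/*
 * Hypothetical helper: return true if strxfrm(buf, src, limit) leaves every
 * byte at or beyond buf[limit] untouched.  The contents of buf[0..limit-1]
 * are not inspected, since they may legitimately be indeterminate when the
 * transformed string does not fit.
 */
static bool
strxfrm_respects_limit(const char *src, size_t limit)
{
	char		buf[64];
	size_t		i;

	if (limit == 0 || limit >= sizeof(buf))
		return false;			/* probe buffer cannot judge this limit */

	memset(buf, CANARY, sizeof(buf));
	(void) strxfrm(buf, src, limit);

	for (i = limit; i < sizeof(buf); i++)
	{
		if ((unsigned char) buf[i] != CANARY)
			return false;		/* wrote past the limit it was given */
	}
	return true;
}

/*
 * Hypothetical wrapper: switch LC_COLLATE, then probe strxfrm() with a
 * couple of small limits.  The probe inputs are illustrative guesses, not
 * a verified reproduction of any particular platform bug.
 */
static bool
set_collate_checked(const char *locale_name)
{
	if (setlocale(LC_COLLATE, locale_name) == NULL)
		return false;			/* locale not available */

	return strxfrm_respects_limit("ab", 7) &&
		strxfrm_respects_limit("a", 1);
}
```

A caller would invoke set_collate_checked() wherever it currently calls
setlocale(LC_COLLATE, ...) and report a clean error if it returns false. A
newlocale() variant would need strxfrm_l(), which this sketch does not
cover.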
