Re: Missing rows with index scan when collation is not "C" (PostgreSQL 9.5)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, Marc-Olaf Jaschke <marc-olaf(dot)jaschke(at)s24(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Missing rows with index scan when collation is not "C" (PostgreSQL 9.5)
Date: 2016-03-23 14:47:07
Message-ID: CA+TgmoahtpvHZbtE=Gapq=HvA-N0y5qgnfMJWF0hXE9ibJGXzA@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Tue, Mar 22, 2016 at 10:44 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Tue, Mar 22, 2016 at 07:19:44PM -0400, Tom Lane wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> > I was a little worried that it was too much to hope for that all libc
>> > vendors on earth would ship a strxfrm() implementation that was actually
>> > consistent with strcoll(), and here we are.
>>
>> Indeed. To try to put some scope on the problem, I made an idiot little
>> program that just generates some random UTF8 strings and sees whether
>> strcoll and strxfrm sort them alike. Attached are that program, an even
>> more idiot little shell script that runs it over all available UTF8
>> locales, and the results on my RHEL6 box. While de_DE seems to be the
>> worst-broken locale, it's far from the only one.
>>
>> Please try this on as many platforms as you can get hold of ...
>
> I, too, found MAXXFRMLEN insufficient; I raised it fourfold. Cygwin
> 2.2.1(0.289/5/3) caught fire; 10% of locales passed. (varstr_sortsupport()
> already blacklists the UTF8/native Windows case.) The test passed on Solaris
> 10, Solaris 11, HP-UX B.11.31, OpenBSD 5.0, NetBSD 5.1.2, and FreeBSD 9.0.
> See attached tryalllocales.sh outputs. I did not test AIX, because the AIX
> machines I use have no UTF8 locales installed.

Wow, thanks for the extensive testing. This suggests that, apart from
Cygwin, which apparently doesn't matter right now, the only thing that
is busted is glibc. I believe we have yet to see a single locale that
fails on any other platform. Good thing so few of our users run glibc!

Ha ha, little joke there.

So, options:

1. We could make it the user's problem to figure out whether they've
got a buggy glibc and add a GUC to shut this off, as previously
suggested.

2. We could add a blacklist (either hardcoded or a GUC) shutting this
off for locales known to be buggy anywhere.

3. We could write some test code that runs at startup time, reliably
detects all of the broken locales we've so far uncovered, and disables
this if the locale in use is one of them.

4. We could shut this off for all Linux users in all locales and tell
everybody to REINDEX. That would be pretty sad, though.
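
For illustration only, a startup-style check along the lines of option 3
might look roughly like the sketch below. It is not code from this thread
or from PostgreSQL: the function names (xfrm_agrees_with_coll,
check_locale), the return codes, and the idea of passing in a fixed list
of sample strings are all assumptions made for the example. It sizes the
strxfrm() output dynamically rather than relying on a fixed buffer, since
a fixed MAXXFRMLEN proved too small in the tests quoted above. A fixed
sample list can only catch disagreements on the strings it contains, so a
passing result does not prove a locale is safe.

```c
#include <assert.h>
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Reduce a comparison result to -1, 0, or +1. */
static int sign(int x)
{
    return (x > 0) - (x < 0);
}

/*
 * Hypothetical helper: true if strcoll() and strcmp() on the strxfrm()
 * output order a and b the same way under the current LC_COLLATE.
 * An allocation failure is reported as "false" (not verified).
 */
bool xfrm_agrees_with_coll(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    /* With n == 0, strxfrm() only reports the length it needs. */
    size_t alen = strxfrm(NULL, a, 0) + 1;
    size_t blen = strxfrm(NULL, b, 0) + 1;
    char *abuf = malloc(alen);
    char *bbuf = malloc(blen);
    bool ok = false;

    if (abuf != NULL && bbuf != NULL) {
        strxfrm(abuf, a, alen);
        strxfrm(bbuf, b, blen);
        ok = sign(strcoll(a, b)) == sign(strcmp(abuf, bbuf));
    }
    free(abuf);
    free(bbuf);
    return ok;
}

/*
 * Hypothetical check: compare every pair of sample strings under
 * locale_name. Returns 1 if all pairs agree, 0 if any pair disagrees,
 * and -1 if the locale cannot be selected. The previous LC_COLLATE
 * setting is restored before returning.
 */
int check_locale(const char *locale_name,
                 const char *const *samples, size_t nsamples)
{
    /* Copy the current setting; later setlocale() calls may overwrite it. */
    const char *cur = setlocale(LC_COLLATE, NULL);
    char *saved = NULL;
    int result = 1;

    if (cur != NULL) {
        saved = malloc(strlen(cur) + 1);
        if (saved != NULL)
            strcpy(saved, cur);
    }

    if (setlocale(LC_COLLATE, locale_name) == NULL) {
        result = -1;
    } else {
        for (size_t i = 0; i < nsamples && result == 1; i++)
            for (size_t j = 0; j < nsamples; j++)
                if (!xfrm_agrees_with_coll(samples[i], samples[j])) {
                    result = 0;
                    break;
                }
    }

    if (saved != NULL) {
        setlocale(LC_COLLATE, saved);
        free(saved);
    }
    return result;
}
```

A caller would run check_locale() once at startup for the locale in use
and fall back to plain strcoll() comparisons whenever the result is not
1. Which sample strings to use is the hard part; random strings, as in
Tom's test program, would give broader coverage than any fixed list.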

Thoughts? Other ideas?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
