Re: [BUGS] Re: Missing rows with index scan when collation is not "C" (PostgreSQL 9.5)

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Marc-Olaf Jaschke <marc-olaf(dot)jaschke(at)s24(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUGS] Re: Missing rows with index scan when collation is not "C" (PostgreSQL 9.5)
Date: 2016-08-23 00:36:31
Message-ID: CAM3SWZR8YQYP18VoHmeG-VsabGxWKkiCWO_Cc0bbHNDYyezmHA@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Wed, Mar 23, 2016 at 10:46 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Are you still in information-gathering mode, or are you going to issue
>> a recommendation on how we should proceed here, or what?
>
> If I had to make a recommendation right now, I would go for your
> option #4, ie shut 'em all down Scotty. We do not know the full extent
> of the problem but it looks pretty bad, and I think our first priority
> has to be to guarantee data integrity. I do not have a lot of faith in
> the proposition that glibc's is the only buggy implementation, either.

For the record, I have been able to determine by using amcheck on the
Heroku platform that en_US.UTF-8 cases are sometimes affected by an
inconsistency between strcoll() and strxfrm() behavior, which was
previously an open question. I saw only two instances of this across
many thousands of servers. For some reason, both cases involved
strings with code points from the Arabic alphabet, even though each
case was from a totally unrelated customer database.
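
To illustrate what an inconsistency between strcoll() and strxfrm() means,
here is a minimal Python sketch. It is not how amcheck performs its check;
the helper names (sign, find_inconsistencies) are hypothetical, and it
assumes Python's locale.strcoll() and locale.strxfrm() wrappers stand in for
the C library calls. For each pair of strings, it compares the ordering
given by strcoll() with the ordering of the strxfrm() keys and reports any
pair where the two disagree.

```python
import itertools
import locale


def sign(n):
    """Reduce a comparator result to -1, 0, or 1."""
    return (n > 0) - (n < 0)


def find_inconsistencies(strings):
    """Return pairs whose strcoll() order disagrees with their strxfrm() order.

    Uses whatever LC_COLLATE is currently active; call
    locale.setlocale(locale.LC_COLLATE, ...) first to test a specific locale.
    """
    # Transform each string once, as a sort with precomputed keys would.
    keys = {s: locale.strxfrm(s) for s in strings}
    bad = []
    for a, b in itertools.combinations(strings, 2):
        direct = sign(locale.strcoll(a, b))
        via_keys = sign((keys[a] > keys[b]) - (keys[a] < keys[b]))
        if direct != via_keys:
            bad.append((a, b))
    return bad
```

With a correct C library, the returned list should be empty for any input
under any collation. A non-empty result for a given locale would be a sign
of the kind of disagreement described above.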

I'll go update the Wiki page for this [1] now.

[1] https://wiki.postgresql.org/wiki/Abbreviated_keys_glibc_issue
--
Peter Geoghegan
