Re: Re: Missing rows with index scan when collation is not "C" (PostgreSQL 9.5)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>, Marc-Olaf Jaschke <marc-olaf(dot)jaschke(at)s24(dot)com>
Date: 2016-03-23 18:56:53
Message-ID: CA+TgmoZNe36W_FjUDFj09Js=ivxhnqLBip=1-iFsFn78j5Kn9A@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Wed, Mar 23, 2016 at 2:32 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Peter Geoghegan <pg(at)heroku(dot)com> writes:
>> On Wed, Mar 23, 2016 at 11:04 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>> That said, my main point is that I do not think the knob is something that
>>> should be tuned by the average end user. For most people, that should be
>>> left to the packagers for the platform, who can make an informed choice
>>> about if it's safe to turn it on.
>
>> I could get behind that if we really make an effort to help them make
>> an informed choice. The abbreviated keys optimization is highly
>> valuable, and I put a lot of work into it, as did Robert.
>
> I realize that, and I'm sympathetic, but I'm afraid it also means that
> your judgment in this matter is rather biased.
>
> I do not think that end users can be expected to know whether this is safe
> to turn on, and TBH I do not think that most packagers will either. My
> opinion is that our only guaranteed-safe option is to turn it off, period,
> no exceptions for platforms that we've not yet found a failure case for.
> We can consider turning it back on later, once we've done vastly more
> study and testing than has evidently been done to date. One thing I'm
> going to want to know is what was the root cause of glibc's bug, and what
> is the reason to think that other implementations are going to be any more
> reliable. At this point I'm disinclined to trust any implementation that
> can't point to a structural reason (e.g., sharing code) to believe that
> strcoll and strxfrm must yield equivalent answers.
>
> (In other words, I want an #ifdef NOT_USED, which is even less effort
> than either a GUC or a configure option ;-(. As well as being something
> that we won't need to document and support indefinitely.)

I think that something like the attached would be a reasonable
approach to the problem. If we later decide this is altogether
hopeless, we can do a more thorough job removing the code that can be
reached when collate_c && abbreviate, but let's not do that right now.
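
For anyone who wants to probe a given platform, the property Tom describes (strcoll and strxfrm yielding equivalent answers) can be spot-checked with a small helper. This is only a sketch, not part of the attached patch: the function names `cmp_sign` and `strxfrm_agrees_with_strcoll` are made up for illustration, and agreement on a handful of strings does not prove an implementation is safe.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Reduce a comparison result to -1, 0, or +1. */
int cmp_sign(int x)
{
    return (x > 0) - (x < 0);
}

/*
 * Hypothetical helper: returns 1 if strcoll() and strxfrm() order a and b
 * the same way under the current LC_COLLATE setting, 0 if they disagree,
 * and -1 if memory allocation fails.
 */
int strxfrm_agrees_with_strcoll(const char *a, const char *b)
{
    size_t alen, blen;
    char *abuf, *bbuf;
    int agree;

    assert(a != NULL && b != NULL);

    /* With size 0, strxfrm() reports the length needed, excluding the NUL. */
    alen = strxfrm(NULL, a, 0) + 1;
    blen = strxfrm(NULL, b, 0) + 1;

    abuf = malloc(alen);
    bbuf = malloc(blen);
    if (abuf == NULL || bbuf == NULL)
    {
        free(abuf);
        free(bbuf);
        return -1;
    }

    strxfrm(abuf, a, alen);
    strxfrm(bbuf, b, blen);

    /* The transformed blobs compared with strcmp() should match strcoll(). */
    agree = cmp_sign(strcoll(a, b)) == cmp_sign(strcmp(abuf, bbuf));

    free(abuf);
    free(bbuf);
    return agree;
}
```

To exercise a non-"C" collation, call `setlocale(LC_COLLATE, "")` (from `<locale.h>`) or name a specific locale before calling the helper, then feed it pairs of strings drawn from real data. A return value of 0 for any pair means abbreviated keys built from strxfrm could order rows differently from strcoll on that platform.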

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment: dont-trust-strxfrm.patch (text/x-diff, 2.3 KB)