Re: Multi-byte character case-folding

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Multi-byte character case-folding
Date: 2020-07-07 00:32:22
Message-ID: 1479731.1594081942@sss.pgh.pa.us
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> On 2020-Jul-06, Tom Lane wrote:
>> More generally, I'd be mighty hesitant to change this behavior after
>> it's stood for so many years. I suspect more people would complain
>> that we broke their application than would be happy about it.

> I think the fact that identifiers fail to follow language-specific case
> folding rules is more a known gotcha than a desired property, but on
> principle I tend to agree that Turkish people would not be happy about
> the prospect of us changing the downcasing rule in a major release -- it
> would mean having to edit any affected application code as part of a
> pg_upgrade process, which is not great.

It's not just the Turks. As near as I can tell, we'd likely break *every*
app that's using such identifiers. For example, supposing I do

test=# create table MYÉCLASS (f1 text);
CREATE TABLE
test=# \dt
          List of relations
 Schema |   Name   | Type  |  Owner
--------+----------+-------+----------
 public | myÉclass | table | postgres
(1 row)

pg_dump will render this as

CREATE TABLE public."myÉclass" (
    f1 text
);

If we start to case-fold É, then the only way to access this table will
be by double-quoting its name, which the application probably is not
expecting (else it would have double-quoted in the original CREATE TABLE).
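
To make the breakage concrete, here is a rough Python sketch of the two folding rules. It is only an illustration, not PostgreSQL's actual code; the function names fold_ascii_only, fold_unicode and resolves are made up for the example. It assumes the current rule lowercases only ASCII A-Z (which matches the \dt output above) and that a changed rule would lowercase É as well.

```python
def fold_ascii_only(identifier: str) -> str:
    # Assumed current rule: lowercase A-Z only; non-ASCII characters pass through.
    return "".join(ch.lower() if "A" <= ch <= "Z" else ch for ch in identifier)


def fold_unicode(identifier: str) -> str:
    # Hypothetical changed rule: lowercase every character that has a lowercase form.
    return identifier.lower()


def resolves(typed: str, stored: str, fold) -> bool:
    # An unquoted reference matches only if its folded form equals the stored name.
    return fold(typed) == stored


# Name as stored in the catalog by the CREATE TABLE above (\u00c9 is É).
stored_name = fold_ascii_only("MY\u00c9CLASS")

print(stored_name)                                          # name kept by pg_dump
print(resolves("MY\u00c9CLASS", stored_name, fold_ascii_only))  # lookup under the old rule
print(resolves("MY\u00c9CLASS", stored_name, fold_unicode))     # lookup under a changed rule
```

Under the changed rule the unquoted spelling folds to a name that no longer equals the one stored in the catalog, so only the double-quoted form would still reach the table.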

> Now you could say that this can be fixed by adding a GUC that preserves
> the old behavior, but generally we don't like that too much.

Yes, a GUC changing this would be a headache. It would be just as much of
a compatibility and security hazard as standard_conforming_strings (which
indeed I've been thinking of proposing that we get rid of; it's hung
around long enough).

> The counter argument is that there are more future users than there are
> current users.

Especially if we drive away the current users :-(. In practice, we've
heard very very few complaints about this, so my gut says to leave
it alone.

regards, tom lane
