
Re: CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?

From:	Peter Geoghegan <pg(at)bowt(dot)ie>
To:	Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc:	Andreas Karlsson <andreas(at)proxel(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?
Date:	2017-09-25 21:49:37
Message-ID:	CAH2-Wzmx6YFHXyjUG7Bo+h6b0FCR-oZKvbB7OB=WR-BCrycHDg@mail.gmail.com
Lists:	pgsql-hackers

On Mon, Sep 25, 2017 at 12:52 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> That must have been the real reason why you canonicalized
> pg_collation.collname (I doubt it had anything to do with how keyword
> variants used to be created during initdb, as you suggested). As Tom
> pointed out recently, we've actually always canonicalized collation
> name for libc.

On further examination, none of this really matters, because you
simply cannot store ICU locale names like "en_US" within pg_collation;
it's impossible to do that without breaking many things that have
worked for a long time. initdb already canonicalizes the available
libc collations to produce collations whose names have exactly the
same "en_US" format. There will typically be both "en_US" and
"en_US.utf8" entries within pg_collation with glibc on Linux, for example
(the former is created as a convenient alias for the latter when the
database encoding is UTF-8).
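
To make the aliasing concrete, here is a rough sketch of the idea in Python. It is not initdb's actual code: the function names are hypothetical, and the rule used to strip the encoding tag (drop a "." followed by letters, digits, or hyphens) is an assumption made for illustration.

```python
import re


def strip_encoding_suffix(locale_name):
    """Drop an encoding tag such as ".utf8" or ".UTF-8" from a libc locale name.

    Hypothetical helper: the exact stripping rule is an assumption.
    """
    return re.sub(r"\.[A-Za-z0-9-]*", "", locale_name)


def collation_names_for(locale_name, encoding_matches_database):
    """Return the collation names that would be registered for one libc locale.

    The full name is always kept. The stripped alias is added only when the
    locale's encoding matches the database encoding and stripping actually
    changed the name.
    """
    names = [locale_name]
    alias = strip_encoding_suffix(locale_name)
    if encoding_matches_database and alias != locale_name:
        names.append(alias)
    return names


if __name__ == "__main__":
    print(collation_names_for("en_US.utf8", True))
```

Under these assumptions, the alias produced for "en_US.utf8" is exactly "en_US", which is the same string an ICU collation would want to use if it were named after the ICU locale name rather than the BCP 47 tag.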

--
Peter Geoghegan

In response to

Re: CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it? at 2017-09-25 19:52:49 from Peter Geoghegan