Re: CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?

From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Peter Geoghegan <pg(at)bowt(dot)ie>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?
Date: 2017-09-19 21:19:54
Message-ID: 7824354d-cefe-8cc7-01c5-6812e7611eb5@proxel.se
Lists: pgsql-hackers

On 09/19/2017 12:46 AM, Peter Geoghegan wrote:
> At one point a couple of months back, it was understood that
> get_icu_language_tag() might not always work with (assumed) valid
> locale names -- that is at least the impression that the commit
> message of eccead9 left me with. But, that was only with ICU 4.2, and
> in any case we've since stopped creating keyword variants at initdb
> time for other reasons (see 2bfd1b1 for details of those other
> reasons). I tend to think that we should not install any language tag
> that uloc_toLanguageTag() does not accept as valid on general
> principle (so not just at initdb time, when it's actually least
> needed).
>
> Thoughts? I can write a patch for this, if that helps. It should be
> straightforward.

Hm, I like the idea but I see some issues.

Enforcing BCP 47 syntax seems like a good thing to me. I do not see any
reason to allow input with syntax errors. The issue, though, is that we
do not want to break people's databases when they upgrade to PostgreSQL
11. What if they have specified the locale in the old non-ICU format, or
they have a bogus value, and we then error out on pg_upgrade or
pg_restore?
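
As a minimal sketch of the trade-off described above (illustration only, in Python rather than PostgreSQL's C): malformed tags could be rejected for newly created collations while only producing a warning when the value arrives through an upgrade or restore. The pattern below covers just a simplified language[-script][-region] subset of BCP 47, and the names check_locale and from_restore are hypothetical, not PostgreSQL or ICU APIs. An actual patch would presumably rely on uloc_toLanguageTag() rather than a hand-written pattern.

```python
import re
import warnings

# Simplified subset of BCP 47: language[-Script][-REGION].
# Real tags also allow variants, extensions and private-use subtags,
# which this illustrative pattern deliberately ignores.
_SIMPLE_TAG = re.compile(r"[A-Za-z]{2,3}(-[A-Za-z]{4})?(-([A-Za-z]{2}|[0-9]{3}))?")


def check_locale(locale: str, from_restore: bool = False) -> bool:
    """Return True if `locale` matches the simplified tag syntax.

    Hypothetical policy: raise for new collations, but only warn
    (and return False) when the value comes from an upgrade/restore.
    """
    if _SIMPLE_TAG.fullmatch(locale):
        return True
    if from_restore:
        warnings.warn(f"locale {locale!r} is not a valid language tag")
        return False
    raise ValueError(f"invalid language tag: {locale!r}")
```

With this policy, check_locale("de_DE") raises for a new collation, while check_locale("de_DE", from_restore=True) merely warns, so an existing database with an old-style locale name would still load.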

Andreas
