Re: CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?
Date: 2017-09-25 16:06:19
Message-ID: CA+TgmoaFKyk-iDaEbfNT=xeeKu4QQ8Xnm_iARqYZ5fH==K50uQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 22, 2017 at 11:56 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> FWIW, the release is a week from Monday, not Monday. (Or if it is
> Monday, somebody else is wrapping it.)

Oops.

> We have some other embarrassingly critical things to fix, like bug #14825,
> so I can certainly sympathize with an argument that there's not enough
> committer bandwidth left to deal with this; but not with an argument that
> it's too late to change behavior period.

Really? Traditionally, you've been one of the biggest opponents of
whacking things around post-beta. Even post-beta1, let alone
post-beta5.

> The big concern I have here is that this feels a lot like something that
> we'll regret at leisure, if it's not right in the first release. I'd
> much rather be restrictive in v10 and then loosen the rules later, than
> be lax in v10 and then have to argue about whether to break backwards
> compatibility in order to gain saner behavior.

I think it's inevitable that a certain number of users are going to
have to cope with ICU version changes breaking stuff. If ICU decides
a collation is stupid or unused and drops it, or decides one is mis-defined
and redefines it to some behavior that breaks things for somebody, they
are going to have to deal with it. I don't think you can make that
problem go away by any amount of strictness introduced into v10, but
if you make the checks zealous enough, you can probably make them rule
out input that users would have preferred to have accepted.

I also think that if there's a compelling reason to bet on BCP 47 to
be a stable canonical form, I haven't heard it presented here. At the
risk of repeating myself, it's not even supported in some ICU versions
we support, so how's that going to work? And if it's been changed in
the recent past, why not again? Peter Geoghegan said that he doesn't
know of any plans to eliminate BCP 47 support, but that doesn't seem
like it's much proof of anything.

>> I simply do not buy the theory that this cannot be changed later.
>
> OK, so you're promising not to whine when we break backwards compatibility
> on this point in v11?

If somebody has a collation that appears to work on v10 but really is
doing nothing, and when they upgrade to v11 they get an error because
we diagnose that the collation definition was not valid whereas v10
was unable to make that diagnosis, I promise not to whine about the
backward compatibility break thereby introduced.

If, on the other hand, you introduce an error check that is overly
stringent and precludes people from defining collations that are legal
and useful (in their judgement, not yours), I intend to whine about
that extensively. And that applies to 10, 11, and any future versions
for which I may be around.

In short, I judge that allowing users access to *all* of the things
that ICU has now, has had in the past in versions we support, or may
have in the future is an important goal, but that preventing them from
relying on options that may go away is not a goal at all, since
barring the ability to predict the future, it's impossible anyway.

If it's possible to write a precisely-targeted check that
rules out only actually-meaningless collations and nothing else, I'm
fine with that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
