Re: Move defaults toward ICU in 16?

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Move defaults toward ICU in 16?
Date: 2023-02-15 19:31:32
Message-ID: 50453e6c25c69fbc30fc3da1ff59bd6f41953b07.camel@j-davis.com
Lists: pgsql-hackers

On Tue, 2023-02-14 at 16:27 -0500, Jonathan S. Katz wrote:
> Would it be any different than a vulnerability in OpenSSL et al?

In principle, no, but sometimes the details matter. I'm just trying to
add data to the discussion.

> It seems that
> in general, users would see performance gains switching to ICU.

That's great news, and consistent with my experience. I don't think it
should be a driving factor though. If the choice is between
platform-independent semantics (ICU) and performance, platform
independence should be the default.

> I agree with most of your points in [1]. The platform-consistent
> behavior is a good point, especially with more PG deployments running
> on different systems.

Now I think semantics are the most important driver, being consistent
across platforms and based on some kind of trusted independent
organization that we can point to.

It feels very wrong to me to explain that sort order is defined by the
operating system on which Postgres happens to run. Saying that it's
defined by ICU, which is part of the Unicode Consortium, is much better.
It doesn't eliminate versioning issues, of course, but I think it's a
better explanation for users.

Many users have other systems in their data infrastructure, running on
a variety of platforms, and could (in theory) try to synchronize around
a common ICU version to avoid subtle bugs in their data pipeline.

> Based on the available data, I think it's OK to move towards ICU as
> the default, or preferred, collation provider. I agree (for now) in not
> taking a hard dependency on ICU.

I count several favorable responses, so I'll take it that we (as a
community) are intending to change the default for build and initdb in
v16.

Robert expressed some skepticism[1], though I don't see an objection.
If I read his concerns correctly, he's mainly concerned with quality
issues like documentation, bugs, etc. I understand those concerns (I'm
the one that raised them), but they seem like the kind of issues that
one finds any time they dig into a dependency enough. "Setting our
sights very high"[1], to me, would just be ICU with a bit more rigorous
attention to quality issues.

[1]
https://www.postgresql.org/message-id/CA%2BTgmoYmeGJaW%3DPy9tAZtrnCP%2B_Q%2BzRQthv%3Dzn_HyA_nqEDM-A%40mail.gmail.com

--
Jeff Davis
PostgreSQL Contributor Team - AWS
