Re: Collation version tracking for macOS

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, "Nasby, Jim" <nasbyj(at)amazon(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Collation version tracking for macOS
Date: 2022-06-10 00:32:23
Message-ID: CAH2-Wz=G3Y67WDGE_Vv1_wxnZoYxa+ie8YYgjGi98DzFa1tHGg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jun 9, 2022 at 5:18 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> However, since you mentioned that a simple REINDEX would get you from
> one library version to another, I think we're making some completely
> different assumptions somewhere along the line, and I don't get your
> idea yet. It sounds like you don't want two different collation OIDs
> in that case?

Not completely sure about the REINDEX behavior, but it's at least an
example of the kind of thing that could be enabled. I'm proposing that
pg_collation-wise collations have the most abstract possible
definitions -- "logical collations", which are decoupled from
"physical collations" that actually describe a particular ICU collator
associated with a particular ICU version (all the information that
keeps how the on-disk structure is organized for a given relfilenode
straight). In other words, the definition of a collation is the user's
own definition. To the user, it's pretty close to (maybe even exactly)
a BCP47 string, now and forever.

You can make arguments against the REINDEX behavior. And maybe those
arguments will turn out to be good arguments. Assuming that they are,
then the solution may just be to have a special option that will make
the REINDEX use the most recent library.

The important point is to make the abstraction as high level as
possible from the point of view of users.

> You seem to have some
> other idea in mind where the system only knows about one
> "en-US-x-icu", but somehow, somewhere else (where?), keeps track of
> which indexes were built with ICU 63 and which with ICU 67, which I
> don't yet grok. Or did I misunderstand?

That's what I meant, yes -- you got it right.

Another way to put it would be to go as far as we can in the direction
of decoupling the concerns that we have as database people from the
concerns of natural language experts. Let's not step on their toes,
and let's avoid having our toes trampled on.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tobias Bussmann 2022-06-10 00:48:33 Re: Collation version tracking for macOS
Previous Message David G. Johnston 2022-06-10 00:30:27 doc: array_length produces null instead of 0