Re: Collation version tracking for macOS

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, "Nasby, Jim" <nasbyj(at)amazon(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Collation version tracking for macOS
Date: 2022-06-11 03:35:17
Message-ID: CAH2-WznSi6muvAZsocdGCGnACL+NMziyXEkjDfz0n30BYUwS9w@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 9, 2022 at 9:31 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> Perhaps that could be modeled with a pg_depend row pointing to a
> pg_icu_library row, which you'd probably need anyway, to prevent a
> registered ICU library that is needed for a live index from being
> dropped. (That's assuming that the pg_icu_library catalogue concept
> has legs... well if we're going with dlopen(), we'll need *somewhere*
> to store the shared object paths. Perhaps it's not a given that we
> really want paths in a table... I guess it might prevent certain
> cross-OS streaming rep scenarios, but mostly that'd be solvable with
> symlinks...)

Do we even need to store a version for indexes most of the time if
we're versioning ICU itself, as part of the "time travelling
collations" design? For that matter, do we even need to version
collations directly anymore?

I'm pretty sure that the value of pg_collation.collversion is always
the same in practice, or at least has a lot of redundancy, because
mostly it's just an ICU version. This is what I see on my system, at
least:

pg@regression:5432 [53302]=# select count(*), collversion from
pg_collation where collprovider = 'icu' group by 2;
 count │ collversion
───────┼─────────────
   329 │ 153.112.41
   471 │ 153.112
(2 rows)

(Not sure offhand why there are two distinct collversion values, but
generally it looks like collversion isn't terribly meaningful at the
level of individual pg_collation entries.)

If indexes and constraints with old physical collations are defined as
being the exception to the general rule (the rule being "every index
uses the current ICU version for the database as a whole"), and if
those indexes/constraints are enumerated and stored (in a new system
catalog) when a switchover of the database's ICU version is first
initiated, then there might not be any meaningful dependency to
speak of. Not for indexes, at least.

The *database as a whole* is dependent on the current version of ICU
-- it's not any one index. Very occasionally the database will also be
dependent on a single older ICU version that we're still transitioning
away from. There is a "switch-a-roo" going on, but not really at the
level of indexes -- it's a very specialized thing, that works at the
level of the whole database, and involves exactly 2 ICU versions. You
should probably be able to back out of it once it begins, but mostly
it's an inflexible process that just does what we need it to do.
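
To make the bookkeeping concrete, here is a minimal sketch of the model
described above, written as a toy in Python rather than as catalog code.
Every name in it (DatabaseIcuState, begin_switchover, mark_rebuilt, the
"pending" set standing in for the new system catalog) is hypothetical and
exists only for illustration. Backing out of a switchover is left out.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Set


@dataclass
class DatabaseIcuState:
    """Toy model: the ICU version belongs to the database, not to each index.

    All names here are hypothetical; nothing corresponds to a real catalog.
    """
    current_icu: str
    old_icu: Optional[str] = None                    # set only mid-switchover
    pending: Set[str] = field(default_factory=set)   # exceptions still on old_icu

    def begin_switchover(self, new_icu: str, dependent_objects: Iterable[str]) -> None:
        # At most two ICU versions are ever involved at the same time.
        if self.old_icu is not None:
            raise RuntimeError("a switchover is already in progress")
        self.old_icu, self.current_icu = self.current_icu, new_icu
        # Enumerate the exceptions once, when the switchover starts.
        self.pending = set(dependent_objects)
        self._finish_if_done()

    def icu_version_for(self, obj: str) -> str:
        # General rule: current version. Exception: objects still listed.
        if obj in self.pending:
            assert self.old_icu is not None
            return self.old_icu
        return self.current_icu

    def mark_rebuilt(self, obj: str) -> None:
        # A rebuilt index/constraint stops being an exception.
        self.pending.discard(obj)
        self._finish_if_done()

    def _finish_if_done(self) -> None:
        # Once no exceptions remain, the old version is no longer needed.
        if not self.pending:
            self.old_icu = None
```

In this toy, an index created after begin_switchover() is never listed
in "pending", so it simply gets the current version, and the old
version stops being a dependency as soon as the last listed object is
rebuilt.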

Does something like that seem sensible to you?

--
Peter Geoghegan
