Re: Collation version tracking for macOS

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Jeremy Schneider <schneider(at)ardentperf(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, "Nasby, Jim" <nasbyj(at)amazon(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Collation version tracking for macOS
Date: 2022-11-15 00:55:25
Message-ID: 606bd2baa6d65b38fee6eb23bba40c5da210255b.camel@j-davis.com
Lists: pgsql-hackers

I looked at v6.

* We'll need some clearer instructions on how to build/install extra
ICU versions that might not be provided by the distribution packaging.
For instance, I got a cryptic error until I used --enable-rpath, which
might not be obvious to all users.
* Can we have a better error when the library was built with
--disable-renaming? We can just search for the plain (no suffix) symbol.
* We should use dlerror() instead of %m to report dlopen() errors.
* It seems like the collation version is just there to issue WARNINGs
when a user is using the non-versioned locale syntax and the library
changes underneath them (or if there is a collation version change
within a single ICU major version)?
* How are you testing this?
* In my tests (sort, hacked so abbreviate is always false), I see a
~3% regression for ICU+UTF8. That's fine with me. I assume it's due to
the indirect function call, but that's not obvious to me from the
profile. If it's a major problem we could have a special case of
varstrfastcmp_locale() that works on the compile-time ICU version.
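
To illustrate the dlerror() and plain-symbol points above, here is a
minimal sketch. It is not taken from the patch: open_library() and
lookup_icu_symbol() are hypothetical helper names, and the error message
wording is only an example. It assumes ICU's usual renaming scheme, where
exported functions carry a major-version suffix such as ucol_open_67, and
that a library built with --disable-renaming exports the unsuffixed name.

```c
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: open a shared library and, on failure, format the
 * dlerror() text into errbuf.  dlopen() reports failures through dlerror()
 * and is not specified to set errno, so %m / strerror(errno) can be
 * misleading here.
 */
static void *
open_library(const char *path, char *errbuf, size_t errlen)
{
    void       *handle;

    (void) dlerror();           /* clear any stale error state */
    handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
    {
        const char *msg = dlerror();

        snprintf(errbuf, errlen, "could not load library \"%s\": %s",
                 path, msg ? msg : "unknown error");
    }
    return handle;
}

/*
 * Hypothetical helper: look up an ICU function by name.  Try the
 * version-suffixed symbol first (e.g. "ucol_open_67"), then fall back to
 * the plain name, which is what a --disable-renaming build would export.
 */
static void *
lookup_icu_symbol(void *handle, const char *name, int major)
{
    char        versioned[128];
    void       *sym;

    snprintf(versioned, sizeof(versioned), "%s_%d", name, major);
    sym = dlsym(handle, versioned);
    if (sym == NULL)
        sym = dlsym(handle, name);  /* plain (no suffix) symbol */
    return sym;
}
```

If the fallback lookup succeeds, the caller still cannot verify which ICU
version it got from the symbol name alone, so a distinct message (or a
check of the library's reported version) would probably be wanted there.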

I realize your patch is experimental, but when there is a better
consensus on the approach, we should consider adding declarative syntax
such as:

CREATE COLLATION (or LOCALE?) PROVIDER icu67
TYPE icu VERSION '67' AS '/path/to/icui18n.so.67';

It will offer more opportunities to catch errors early and offer better
error messages. It would also enable it to function if the library is
built with --disable-renaming (though we'd have to trust the user).

On Sat, 2022-10-22 at 14:22 +1300, Thomas Munro wrote:
> Problem 1:  Suppose you're ready to start using (say) v72.  I guess
> you'd use the REFRESH command, which would open the main linked ICU's
> collversion and stamp that into the catalogue, at which point new
> sessions would start using that, and then you'd have to rebuild all
> your indexes (with no help from PG to tell you how to find everything
> that needs to be rebuilt, as belaboured in previous reverted work).
> Aside from the possibility of getting the rebuilding job wrong (as
> belaboured elsewhere), it's not great, because there is still a
> transitional period where you can be using the wrong version for your
> data.  So this requires some careful planning and understanding from
> the administrator.

How is this related to the search-by-collversion design? It seems like
it's hard no matter what.

--
Jeff Davis
PostgreSQL Contributor Team - AWS
