Re: ICU integration

From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ICU integration
Date: 2016-09-30 14:30:28
Message-ID: 071e7c00-beba-480d-fa62-0fbcf3fa7360@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 9/23/16 2:27 AM, Thomas Munro wrote:
> This patch adds pkg-config support to our configure script, in order
> to produce the build options for ICU. That's cool, and I'm a fan of
> pkg-config, but it's an extra dependency that I just wanted to
> highlight. For example MacOSX appears to ship with ICU but has is no
> pkg-config or ICU .pc files out of the box (erm, I can't even find the
> headers, so maybe that copy of ICU is useless to everyone except
> Apple).

The Homebrew package icu4c contains this note:

OS X provides libicucore.dylib (but nothing else).

That probably explains what you are seeing.

> There is something missing from the configure script however: the
> output of pkg-config is not making it into CFLAGS or LDFLAGS, so
> building fails on FreeBSD and MacOSX where for example
> <unicode/ucnv.h> doesn't live in the default search path.

It sets ICU_CFLAGS and ICU_LIBS, but it seems I didn't put ICU_CFLAGS in
the backend makefiles. So that should be easy to fix.

> You call the built-in strcoll/strxfrm collation provider "posix", and
> although POSIX does define those functions, it only does so because it
> inherits them from ISO C90.

POSIX defines strcoll_l() and such. But I agree SYSTEM might be better.

> I notice that you use encoding -1, meaning "all".

The use of encoding -1 is existing behavior.

> I haven't fully
> grokked what that really means but it appears that we won't be able to
> create any new such collations after initdb using CREATE COLLATION, if
> for example you update your ICU installation and now have a new
> collation available. When I tried dropping some and recreating them
> they finished up with collencoding = 6. Should the initdb-created
> rows use 6 too?

Good observation. That will need some fine-tuning.

> The warning persists even after I reindex all affected tables (and
> start a new backend), because you're only recording the collation at
> pg_collation level and then only setting it at initdb time, so that
> HINT isn't much use for clearing the warning. I think you'd need to
> record a snapshot of the version used for each collation that was used
> on *each index*, and update it whenever you CREATE INDEX or REINDEX
> etc. There can of course be more than one collation and thus version
> involved, if you have columns with different collations, so I guess
> you'd need an column to hold an array of version snapshots whose order
> corresponds to pg_index.indcollation's.

Hmm, yeah, that will need more work. To be completely correct, I think,
we'd also need to record the version in each expression node, so that
check constraints of the form CHECK (x > 'abc') can be handled.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2016-09-30 14:53:56 Re: [GENERAL] pg_upgrade from 9.5 to 9.6 fails with "invalid argument"
Previous Message Tom Lane 2016-09-30 14:25:41 Re: [GENERAL] pg_upgrade from 9.5 to 9.6 fails with "invalid argument"