Collation versions on Windows (help wanted, apply within)

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Collation versions on Windows (help wanted, apply within)
Date: 2019-11-07 23:44:13
Message-ID: CA+hUKGJvqup3s+JowVTcacZADO6dOhfdBmvOPHLS3KXUJu41Jw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello hackers,

Here's a draft patch that teaches PostgreSQL how to ask for collation
versions on Windows. It receives a pair of DWORDs, which, it displays
as hex. The values probably have an internal structure that is
displayed in a user-friendly way by software like Active Directory and
SQL Server (I'm pretty sure they both track collation versions and
reindex), but I don't know. It is based on the documentation at:

https://docs.microsoft.com/en-us/windows/win32/win7appqual/nls-sorting-changes

My understanding of our OS and tool chain version strategy on that
platform is limited, but it looks like we only allow ourselves to use
Vista (and later) APIs if the compiler is Visual Studio 2015 (aka
14.0) or later. So I tested that this builds cleanly on AppVeyor
using that compiler (see attached CI patch). The regression tests
failed with Windows error code 87 before I added in the check to skip
"C" and "POSIX", so I know the new code is reached. I don't have an
environment to test it beyond that.

The reason for returning an empty string for "C" and "POSIX" is the
following comment for get_collation_actual_version():

* A particular provider must always either return a non-NULL string or return
* NULL (if it doesn't support versions). It must not return NULL for some
* collcollate and not NULL for others.

I'm not sure why, or if that really makes sense.

Do any Windows hackers want to help get it into shape? Some things to
do: test it, verify that the _WIN32_WINNT >= 0x0600 stuff makes sense
(why do we target such ancient Windows releases anyway?), see if there
is way we could use GetNLSVersion() (no "Ex") to make this work on
older Windows system, check if it makes sense to assume that
collcollate is encoded with CP_ACP ("the system default Windows ANSI
code page", used elsewhere in the PG source tree for a similar
purpose, but this seems likely to go wrong for locale names that have
non-ASCII characters, and indeed we see complaints on the lists
involving the word "Bokmål"), and recommend a better way to display
the collation version as text. I'll add this to the next commitfest
to attract some eyeballs (but note that when cfbot compiles it, it
will be using an older tool chain and Win32 target, so the new code
will be ifdef'd out and regression test success means nothing).

To test that it works, you'd need to look at the contents of
pg_collation to confirm that you see the new version strings, create
an index on a column that explicitly uses a collation that has a
version, update the pg_collation table by hand to have a to a
different value, and then open a new session and to access the index
to check that you get a warning about the version changing. The
warning can be cleared by using ALTER COLLATION ... REFRESH VERSION.

Attachment Content-Type Size
0001-Add-collation-versions-for-Windows.patch application/octet-stream 1.7 KB
0002-CI.patch application/octet-stream 2.5 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-11-07 23:46:13 Re: Should we make scary sounding, but actually routine, errors less scary?
Previous Message Andres Freund 2019-11-07 23:41:04 Re: log bind parameter values on error