Re: Better locale-specific-character-class handling for regexps

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org, Bruno Wolff III <bruno(at)wolff(dot)to>
Subject: Re: Better locale-specific-character-class handling for regexps
Date: 2016-09-05 17:20:55
Message-ID: 4b88a3a4-ffd5-7fb7-7f9d-d98138284f90@iki.fi
Lists: pgsql-hackers

On 09/05/2016 07:10 PM, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
>> On 09/04/2016 08:44 PM, Tom Lane wrote:
>>> I guess I could follow the lead of collate.linux.utf8.sql and produce
>>> a test that's only promised to pass on one platform with one encoding,
>>> but I'm not terribly excited by that. AFAIK that test file does not
>>> get run at all in the buildfarm or in the wild.
>
>> I'm not too worried if the tests don't get run regularly, but I don't
>> like the idea of a test that only works on one platform.
>
> Well, it would work on any platform that reports high Unicode letters
> as letters. The problem for putting this into the regular regression
> tests is that the generic tests don't even assume UTF8 encoding, let
> alone a Unicode-ish locale.

Ah, ok. I thought there were some more special requirements.

>> Since we're now de facto maintainers of this regexp library, and our
>> version could be used somewhere else than PostgreSQL too, it would
>> actually be nice to have a regression suite that's independent from the
>> pg_regress infrastructure, and wouldn't need a server to run.
>
> If anyone ever really picks up the challenge of making the regexp library
> a standalone project, I think one of the first orders of business would be
> to pull out the Tcl project's regexp-related regression tests. There's a
> pretty extensive set of tests written by Henry Spencer himself, and more
> that they added over the years; it's far more comprehensive than our
> tests. (I've looked at stealing that test set in toto, but it requires
> some debug APIs that we don't expose in SQL, and probably don't want to.)

Oh, interesting.

> In any case, this is getting very far afield from the current patch.
> I'm willing to add a regexp.linux.utf8.sql test file if you think it's
> important to have some canned tests that exercise this new code, but
> otherwise I don't see any near-term solution.

Ok, that'll do.

- Heikki
