From: | Alexander Farber <alexander(dot)farber(at)gmail(dot)com> |
---|---|
To: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Matching uppercased russian words (\x0410-\x042F) in UTF8 database 8.4.13 |
Date: | 2013-03-19 15:10:46 |
Message-ID: | CAADeyWjZUQU-mwN30rxZs_2A_HvBzrtWwd9kX+c+hmo5kfmN+w@mail.gmail.com |
Lists: | pgsql-general |
Hello,
I have prepared an SQL fiddle for my question:
http://sqlfiddle.com/#!11/8a494/4
And also described it in more detail at
http://stackoverflow.com/questions/15500270/string-matching-in-insert-trigger-how-to-use-in-conditionals-to-return-null
Does anybody know how to check for the UTF8 range \x0410-\x042F
(the uppercase Russian letters) in my code below?
I've tried both
new.word !~ '^[\x0410-\x042F]{2,}$'
(which fails with a syntax error) and
new.word !~ '^[\u0410-\u042F]{2,}$'
(which raises the exception even for correct words):
create table good_words (
    word varchar(64) primary key
);
create or replace function keep_clean() returns trigger as $body$
begin
    new.word := upper(new.word);
    /* next line does not compile? */
    IF new.word !~ '^[\x0410-\x042F]{2,}$' THEN
        RAISE EXCEPTION 'Not an uppercased Russian word in UTF8';
    END IF;
    IF new.word ~ '^[ЪЫЬ]' OR new.word ~ 'Ъ$' THEN
        return NULL;
    END IF;
    /* does not return NULL for 'ошибббка'? */
    IF new.word ~ '(.)\1\1' AND new.word NOT LIKE '%ШЕЕЕ%'
        AND new.word NOT LIKE '%ЗМЕЕЕ%' THEN
        return NULL;
    END IF;
    return new;
end;
$body$ language plpgsql;
Thank you
Alex
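One possible explanation, offered as a sketch rather than a confirmed diagnosis: on 8.4 standard_conforming_strings defaults to off, so backslashes inside an ordinary '...' literal are consumed by the string parser before the regular expression engine sees the pattern. Under that assumption, \x04 turns into a control character and \1 into an octal escape, which would account for both the error on the first pattern and the back-reference that never matches. The version below sidesteps the first problem by writing the range with literal letters (А is U+0410, Я is U+042F) and the second by using an E'' string with doubled backslashes. The trigger name keep_clean_trigger is hypothetical, and the statements assume the good_words table from the message above already exists.

```sql
create or replace function keep_clean() returns trigger as $body$
begin
    new.word := upper(new.word);

    -- Literal Cyrillic range: no backslashes for the string parser to alter.
    -- А-Я spans U+0410..U+042F, so Ё (U+0401) lies outside it and is rejected.
    if new.word !~ '^[А-Я]{2,}$' then
        raise exception 'Not an uppercased Russian word in UTF8';
    end if;

    if new.word ~ '^[ЪЫЬ]' or new.word ~ 'Ъ$' then
        return null;
    end if;

    -- E'' string with doubled backslashes: the regex engine receives \1 as a
    -- back-reference whatever standard_conforming_strings is set to.
    if new.word ~ E'(.)\\1\\1'
       and new.word not like '%ШЕЕЕ%'
       and new.word not like '%ЗМЕЕЕ%' then
        return null;
    end if;

    return new;
end;
$body$ language plpgsql;

-- Hypothetical trigger name; attaches the function to the table from the message.
create trigger keep_clean_trigger
    before insert or update on good_words
    for each row execute procedure keep_clean();

-- Quick checks, assuming upper() handles Cyrillic in the database locale:
insert into good_words values ('привет');    -- intended to pass all three checks
insert into good_words values ('ошибббка');  -- intended to be skipped (three identical letters)
insert into good_words values ('hello');     -- intended to raise the exception
```

Writing the letters literally keeps the pattern readable and matches how the ЪЫЬ check is already written; if escapes are preferred, the same doubling inside an E'' string should apply to them as well.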