From: | Francisco Olarte <folarte(at)peoplecall(dot)com> |
---|---|
To: | Denisa Cirstescu <Denisa(dot)Cirstescu(at)tangoe(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Regex Replace with 2 conditions |
Date: | 2018-02-05 14:26:46 |
Message-ID: | CA+bJJbz2QG8Wg037OBxr0TPg1+-7YkwK5ikpY7tAoETCXmOZKw@mail.gmail.com |
Lists: | pgsql-general |
Denisa:
On Mon, Feb 5, 2018 at 2:34 PM, Denisa Cirstescu
<Denisa(dot)Cirstescu(at)tangoe(dot)com> wrote:
> I need an SQL function that eliminates all ASCII characters from 1-255 that
> are not A-Z, a-z, 0-9, and special characters % and _ so something like:
Are you aware that ASCII is a SEVEN-bit code (0-127)?
As for the question itself, why don't you just write the negated
condition, maybe throwing a null into the class so it is not removed?
Do you have codes above 255 which you do not want replaced?
I.e., something like
SELECT regexp_replace(p_string, E'[^A-Za-z0-9%_]', '', 'g');
This will also zap \0 and, if you are using Unicode, all chars >255.
If that is not a problem, that's all there is to it.
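
To see what the negated class keeps and drops, here is a minimal sketch in Python rather than PostgreSQL. Python's `re` module is a different regex engine, so treat it only as an illustration of the character class; the helper name `strip_disallowed` and the sample string are made up for the example.

```python
import re

# Same negated class as in the SQL above: anything that is NOT
# A-Z, a-z, 0-9, % or _ matches and gets replaced by the empty string.
NOT_ALLOWED = re.compile(r"[^A-Za-z0-9%_]")


def strip_disallowed(text: str) -> str:
    """Return text with every character outside the allowed set removed."""
    return NOT_ALLOWED.sub("", text)


if __name__ == "__main__":
    # Hypothetical sample: a Latin-1 char (U+00E9), a char above 255
    # (U+20AC), a space, a NUL and punctuation mixed with allowed chars.
    sample = "a\u00e9b\u20acc 1%_\x00!"
    print(strip_disallowed(sample))
```

In this sketch the characters in 128-255, the ones above 255 and the NUL all go away, because the class only says what to keep.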
If you are using Unicode and that is a problem, you could add a null
plus a character range from 256 to the largest code point to the
negated class so they are kept, but I doubt this is useful. What is the
character set of your source data? (It can NOT be ASCII if you are
worried about 128-255, but is it a single-byte one, or is it Unicode or
something wide?)
Also, it may perform a bit faster if you throw a + after the character
class, so that a run of more than one unwanted char is replaced as a
single match.
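
A small sketch of the effect of the trailing `+`, again in Python as an analogue and not as PostgreSQL code. It only counts substitutions and makes no claim about actual timings in either engine; the function name `compare_quantifier` and the sample input are invented for the example.

```python
import re


def compare_quantifier(text: str):
    """Strip disallowed chars with and without '+', counting replacements."""
    # Without '+': every unwanted character is its own match.
    single, n_single = re.subn(r"[^A-Za-z0-9%_]", "", text)
    # With '+': each run of unwanted characters is one match.
    runs, n_runs = re.subn(r"[^A-Za-z0-9%_]+", "", text)
    return single, n_single, runs, n_runs


if __name__ == "__main__":
    print(compare_quantifier("ab!!!cd---ef"))
```

Both patterns give the same cleaned string; the version with `+` just gets there in fewer replacements when the unwanted chars come in runs.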
Francisco Olarte.