Re: Finding broken regex'es

From: "Dawid Kuroczko" <qnex42(at)gmail(dot)com>
To: weigelt(at)metux(dot)de
Cc: pgsql-sql <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Finding broken regex'es
Date: 2007-10-03 13:36:41
Message-ID: 758d5e7f0710030636m63bec244paa6cbaf04e66b3d2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On 10/2/07, Enrico Weigelt <weigelt(at)metux(dot)de> wrote:
>
> Hi folks,
>
>
> I'm looking for some way to find broken regex'es in some column
> to kick them off. For now I'm regularily fetching all regexes
> from an PHP script, try an preg_match() and so find the broken
> ones to later remove them.
>
> Is there any way to do this directly within the db ?

Of course. Exceptions is what You need!

CREATE FUNCTION regex_is_broken(r text) RETURNS boolean AS $$
BEGIN
PERFORM '' ~ r;
RETURN 'f';
EXCEPTION
WHEN INVALID_REGULAR_EXPRESSION THEN
RETURN 't';
END;
$$ LANGUAGE PLpgSQL STRICT IMMUTABLE;

...and then you could do something like:

DELETE FROM table WHERE regex_is_broken(rx_col);

You don't need PLpgSQL to prevent such invalid regexes in the
first place. You could use CHECK constraint for it:

CREATE TABLE rx_check (
rx text CHECK ('' ~ rx IN ('t','f'))
);

postgres=> INSERT INTO rx_check (rx) VALUES ('.*');
INSERT 0 1
Time: 13.660 ms
postgres=> INSERT INTO rx_check (rx) VALUES ('234234');
INSERT 0 1
Time: 2.282 ms
postgres=> INSERT INTO rx_check (rx) VALUES ('par).*');
ERROR: invalid regular expression: parentheses () not balanced

Regards,
Dawid

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Filip Rembiałkowski 2007-10-03 14:40:35 Re: Finding broken regex'es
Previous Message Tom Lane 2007-10-03 06:03:04 Re: Finding broken regex'es