Re: Future of our regular expression code

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Future of our regular expression code
Date: 2012-02-20 03:43:18
Message-ID: 18230.1329709398@sss.pgh.pa.us
Lists: pgsql-hackers

Greg Stark <stark(at)mit(dot)edu> writes:
> ... We need a library that can be used to defend
> against malicious regexes and I suspect neither Perl's nor Python's
> library will suffice for this.

Yeah. Did you read the Russ Cox papers referenced upthread? One of the
things Google wanted was provably limited resource consumption, which
motivated them to go with a pure-DFA-no-exceptions implementation.
However, they gave up backrefs to get that, which is probably a
compromise we're not willing to make.

One thing that's been bothering me for a while is that we don't have any
CHECK_FOR_INTERRUPTS or equivalent in the library's NFA search loops.
It wouldn't be hard to add one, but that'd be putting PG-specific code
into the very heart of the library, which is something I've tried to
resist. One of the issues we'll have to face if we do try to split it
out as a standalone library is how that type of requirement can be met.
(And, BTW, that's the kind of hack that we would probably not get to
make at all with any other library, so the need for it is not evidence
that getting away from Spencer's code would be a good thing.)
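
To make the standalone-library question concrete, here is a minimal sketch
of one way such a requirement might be met: the library exposes a generic
cancellation hook and polls it inside its search loop, so no host-specific
code lives in the library itself. Everything named here (rx_hooks,
rx_cancel_fn, rx_search, the RX_* codes, cancel_after_n) is hypothetical and
is not part of Spencer's code or of PostgreSQL; the loop is a trivial byte
scan standing in for an NFA search.

```c
#include <stddef.h>

/* Hypothetical result codes for the sketch. */
#define RX_OK      0
#define RX_NOMATCH 1
#define RX_CANCEL  2

/* Hypothetical hook type: the host returns nonzero to request an abort. */
typedef int (*rx_cancel_fn)(void *arg);

typedef struct {
    rx_cancel_fn cancel;     /* may be NULL if the host has no hook */
    void        *cancel_arg; /* opaque pointer passed back to the hook */
} rx_hooks;

/*
 * Stand-in for a search loop: scans for one byte and polls the host's
 * hook once per iteration. A real engine would poll in its NFA loops.
 */
static int
rx_search(const rx_hooks *hooks, const char *s, size_t len,
          char target, size_t *pos)
{
    for (size_t i = 0; i < len; i++)
    {
        if (hooks && hooks->cancel && hooks->cancel(hooks->cancel_arg))
            return RX_CANCEL;   /* unwind normally so the library can clean up */
        if (s[i] == target)
        {
            *pos = i;
            return RX_OK;
        }
    }
    return RX_NOMATCH;
}

/* Example host-side hook: allows a fixed number of polls, then cancels. */
static int
cancel_after_n(void *arg)
{
    int *remaining = arg;

    if (*remaining <= 0)
        return 1;
    (*remaining)--;
    return 0;
}
```

In this arrangement the host decides what "interrupt" means; a host like PG
would supply a hook that reports whether a cancel is pending and then raise
its own error after the library returns RX_CANCEL. Whether polling through a
function pointer on every iteration is cheap enough is a separate question
that this sketch does not try to answer.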

regards, tom lane
