Re: Future of our regular expression code

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Brendan Jurd <direvus(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Future of our regular expression code
Date: 2012-02-19 23:42:03
Message-ID: 5843.1329694923@sss.pgh.pa.us
Lists: pgsql-hackers

Brendan Jurd <direvus(at)gmail(dot)com> writes:
> Are you far enough into the backrefs bug that you'd prefer to see it
> through, or would you like me to pick it up?

Actually, what I've been doing today is a brain dump. This code is
never going to be maintainable by anybody except its original author
without some internals documentation, so I've been trying to write
some based on what I've managed to reverse-engineer so far. It's
not very complete, but I do have some words about the DFA/NFA stuff,
which I will probably revise and fill in some more as I work on the
backref fix, because that's where that bug lives. I have also got
a bunch of text about the colormap management code, which I think
is interesting right now because that is what we are going to have
to fix if we want decent performance for Unicode \w and related
classes (cf. the other current -hackers thread about regexes).
I was hoping to prevail on you to pick that part up as your first
project. I will commit what I've got in a few minutes --- look
for src/backend/regex/README in that commit. I encourage you to
add to that file as you figure stuff out. We could stand to upgrade
a lot of the code comments too, of course, but I think a narrative
description is pretty useful before diving into code.

regards, tom lane
