Re: WIP: index support for regexp search

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org, pavel(dot)stehule(at)gmail(dot)com
Subject: Re: WIP: index support for regexp search
Date: 2013-01-22 19:37:38
Message-ID: 50FEEA82.3010308@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 18.12.2012 09:04, Alexander Korotkov wrote:
> Bug is found and fixed in attached patch.

I finally got around to look at this. I like this new version, without
the path matrix, much better.

I spend quite some time honing the code and comments, trying to organize
it so that it's easier to understand. In particular, I divided the
processing more clearly into four separate stages, and added comments
indicating which functions and which fields in the structs are needed in
which state.

I understand the other stages fairly well now, but the transformation
from the source CNFA form into the transformed graph is still a black
box to me. The addKeys/addArcs functions still need more explanation.
Can you come up with some extra comments or refactoring to clarify those?

I'd like to see a few more regression test cases, to cover the various
overflow cases. In particular, I built with --enable-coverage and ran
"make installcheck", and it looks like the state merging code isn't
exercised at all. Report attached.

To visualize the graphs, I rewrote the debugging print* functions to
write the graphs in graphviz .dot format. That helped a lot. See
attached graphs, generated from the regexp '^(abc|def)(ghi|jk[lmn])$'.

There's still a lot of cleanup to do, I'm going to continue working on
this tomorrow, but wanted to shared what I have this far.

- Heikki

Attachment Content-Type Size
trgm-regexp-0.9-heikki-1.patch text/x-diff 61.8 KB
trgm-regexp-lcov-report.tar.gz application/x-gzip 68.2 KB
packed.png image/png 77.9 KB
image/png 51.1 KB
image/png 44.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jameison Martin 2013-01-22 21:00:08 Re: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Previous Message Jeff Davis 2013-01-22 19:23:35 Re: Removing PD_ALL_VISIBLE