| From: | Pavel Ajtkulov <ajtkulov(at)acm(dot)org> |
|---|---|
| To: | pgsql-patches(at)postgresql(dot)org |
| Subject: | strpos() && KMP |
| Date: | 2007-08-01 20:18:22 |
| Message-ID: | 1167412014.20070802011822@acm.org |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-patches |
Hello,
this patch allow to use Knuth-Morrison-Pratt algorithm for strpos() function
(see Cormen et al. Introduction to Algorithms, MIT Press, 2001).
It also works with multibyte wchar.
In worst case current brute force strpos() takes O(n * m) (n && m is length of strings)
time (example: 'aaa...aaab' search in 'aaa...aaa').
KMP algo always takes O(n + m) time.
To check this someone need to create a table with one text attribute, and insert several thousands
record 'aa..aa'(for example, with lenght = 1000) . After execute "select count(*) from test where
strpos(a, 'aaa....aab') > 0;" on current and modified version.
Also, I advise to use "select .. where strpos(att, 'word') > 0;" instead "select .. where attr like '%word%'"
(strpos must be faster than regex).
In general, this belongs to artificial expressions. In natural language KMP is equal (execution time)
current strpos() nearly.
----
Ajtkulov Pavel
ajtkulov(at)acm(dot)org
P. S. Sorry for prime English.
| Attachment | Content-Type | Size |
|---|---|---|
| varlena.c.patch | application/octet-stream | 5.4 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andrew Dunstan | 2007-08-01 22:41:31 | Re: strpos() && KMP |
| Previous Message | Heikki Linnakangas | 2007-08-01 20:09:06 | Re: HOT patch - version 11 |