strpos() && KMP

From: Pavel Ajtkulov <ajtkulov(at)acm(dot)org>
To: pgsql-patches(at)postgresql(dot)org
Subject: strpos() && KMP
Date: 2007-08-01 20:18:22
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches


this patch allow to use Knuth-Morrison-Pratt algorithm for strpos() function
(see Cormen et al. Introduction to Algorithms, MIT Press, 2001).

It also works with multibyte wchar.

In worst case current brute force strpos() takes O(n * m) (n && m is length of strings)
time (example: 'aaa...aaab' search in '').
KMP algo always takes O(n + m) time.
To check this someone need to create a table with one text attribute, and insert several thousands
record 'aa..aa'(for example, with lenght = 1000) . After execute "select count(*) from test where
strpos(a, 'aaa....aab') > 0;" on current and modified version.

Also, I advise to use "select .. where strpos(att, 'word') > 0;" instead "select .. where attr like '%word%'"
(strpos must be faster than regex).

In general, this belongs to artificial expressions. In natural language KMP is equal (execution time)
current strpos() nearly.

Ajtkulov Pavel

P. S. Sorry for prime English.

Attachment Content-Type Size
varlena.c.patch application/octet-stream 5.4 KB


Browse pgsql-patches by date

  From Date Subject
Next Message Andrew Dunstan 2007-08-01 22:41:31 Re: strpos() && KMP
Previous Message Heikki Linnakangas 2007-08-01 20:09:06 Re: HOT patch - version 11