Re: strpos() && KMP

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Pavel Ajtkulov <ajtkulov(at)acm(dot)org>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: strpos() && KMP
Date: 2007-09-26 08:48:11
Message-ID: 200709260848.l8Q8mBv16140@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches


Added to TODO:

> * Implement Boyer-Moore searching in strpos()
>
> http://archives.postgresql.org/pgsql-patches/2007-08/msg00012.php

---------------------------------------------------------------------------

Pavel Ajtkulov wrote:
> Hello,
>
> this patch allow to use Knuth-Morrison-Pratt algorithm for strpos() function
> (see Cormen et al. Introduction to Algorithms, MIT Press, 2001).
>
> It also works with multibyte wchar.
>
> In worst case current brute force strpos() takes O(n * m) (n && m is length of strings)
> time (example: 'aaa...aaab' search in 'aaa...aaa').
> KMP algo always takes O(n + m) time.
> To check this someone need to create a table with one text attribute, and insert several thousands
> record 'aa..aa'(for example, with lenght = 1000) . After execute "select count(*) from test where
> strpos(a, 'aaa....aab') > 0;" on current and modified version.
>
> Also, I advise to use "select .. where strpos(att, 'word') > 0;" instead "select .. where attr like '%word%'"
> (strpos must be faster than regex).
>
> In general, this belongs to artificial expressions. In natural language KMP is equal (execution time)
> current strpos() nearly.
>
>
> ----
> Ajtkulov Pavel
> ajtkulov(at)acm(dot)org
>
> P. S. Sorry for prime English.

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Browse pgsql-patches by date

  From Date Subject
Next Message Bruce Momjian 2007-09-26 08:52:35 Re: [HACKERS] Include Lists for Text Search
Previous Message Bruce Momjian 2007-09-26 08:38:54 Re: allow CSV quote in NULL