Re: strpos() && KMP

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Ajtkulov <ajtkulov(at)acm(dot)org>
Cc: decibel(at)decibel(dot)org, pgsql-patches(at)postgresql(dot)org
Subject: Re: strpos() && KMP
Date: 2007-08-07 22:02:48
Message-ID: 29290.1186524168@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

Pavel Ajtkulov <ajtkulov(at)acm(dot)org> writes:
> Patch intends for artificial language (for example DNA, or
> language with small alphabet, or regular language) only.
> In natural language, KMP(and other search algo) haven't notable
> advantages (+-5% time execution).

I wonder why you didn't propose Boyer-Moore instead, as that would have
some advantage for natural language text as well.

The difficulty with B-M is the need for a table indexed by character
code, which at first glance looks impractical for wchars. But it seems
to me that we could use "wchar % 256" as the table index, meaning that
wchars with the same trailing byte share the same table entry. That
would lose some efficiency compared to an exact implementation, but the
limited table size would outweigh that except in the most pathological
cases.

regards, tom lane

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Brendan Jurd 2007-08-08 11:36:35 Function structure in formatting.c
Previous Message Decibel! 2007-08-07 21:45:38 Re: strpos() && KMP