Skip site navigation (1) Skip section navigation (2)

Re: strpos() && KMP

From: Pavel Ajtkulov <ajtkulov(at)acm(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: strpos() && KMP
Date: 2007-08-08 18:51:13
Message-ID: 656957725.20070808235113@acm.org (view raw or flat)
Thread:
Lists: pgsql-patches
Tom Lane writes:

> I wonder why you didn't propose Boyer-Moore instead, as that would have
> some advantage for natural language text as well.

> The difficulty with B-M is the need for a table indexed by character
> code, which at first glance looks impractical for wchars.  But it seems
> to me that we could use "wchar % 256" as the table index, meaning that
> wchars with the same trailing byte share the same table entry.  That
> would lose some efficiency compared to an exact implementation, but the
> limited table size would outweigh that except in the most pathological
> cases.

hash table?

The main difficulty with BM is verification and understanding "good
suffix shift" (the second part of BM) (I don't understand it entirely).

----
Ajtkulov Pavel
ajtkulov(at)acm(dot)org



In response to

Responses

pgsql-patches by date

Next:From: Brendan JurdDate: 2007-08-08 19:02:57
Subject: Re: Default location for docbook stylesheets in Gentoo
Previous:From: Peter EisentrautDate: 2007-08-08 18:29:57
Subject: Re: Default location for docbook stylesheets in Gentoo

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group