From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
Cc: | David Rowley <dgrowley(at)gmail(dot)com>, "'Peter Eisentraut'" <peter_e(at)gmx(dot)net>, pgsql-patches(at)postgresql(dot)org |
Subject: | Re: [HACKERS] TODO item: Implement Boyer-Moore searching (First time hacker) |
Date: | 2008-09-05 02:04:57 |
Message-ID: | 11787.1220580297@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-patches |
Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> After reading the wikipedia article on Boyer-Moore search algorithm, it
> looks to me like this patch actually implements the simpler
> Boyer-Moore-Horspool algorithm that only uses one lookup table. That's
> probably fine, as it ought to be faster on small needles and haystacks
> because it requires less effort to build the lookup tables, even though
> the worst-case performance is worse. It should still be faster than what
> we have now.
Hmm. B-M-H has worst case search speed O(M*N) (where M = length of
pattern, N = length of search string); whereas full B-M is O(N).
Maybe we should build the second table when M is large? Although
realistically that is probably gilding the lily, since frankly there
haven't been many real-world complaints about the speed of these
functions anyway ...
> The skip table really should be constructed only once in
> text_position_start and stored in TextPositionState. That would make a
> big difference to the performance of those functions that call
> text_position_next repeatedly: replace_text, split_text and text_to_array.
+1. The heuristic about big a skip table to make may need some
adjustment as well, since it seems to be considering only a single
search.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2008-09-05 02:17:07 | Re: hash index improving v3 |
Previous Message | Andriy Bakay | 2008-09-05 02:01:51 | Re: SSL problems |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2008-09-05 02:17:07 | Re: hash index improving v3 |
Previous Message | Alex Hunsaker | 2008-09-05 01:54:51 | Re: hash index improving v3 |