From: | mark(at)mark(dot)mielke(dot)cc |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andrew Dunstan <andrew(at)dunslane(dot)net>, andrew(at)supernews(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: like/ilike improvements |
Date: | 2007-05-25 05:20:16 |
Message-ID: | 20070525052016.GA6825@mark.mielke.cc |
Lists: | pgsql-hackers pgsql-patches |
On Thu, May 24, 2007 at 11:20:51PM -0400, Tom Lane wrote:
> I wrote:
> > Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> >> Yes, I agree completely. However it looks to me like IsFirstByte will in
> >> fact always be true when we get to call NextChar for matching "_" for UTF8.
> > If that's true, the patch is failing to achieve its goal of treating %
> > bytewise ...
> OK, I studied it a bit more and now see what you're driving at: in this
> form of the patch, we treat % bytewise unless it is followed by _, in
> which case we treat it char-wise. That seems a good tradeoff,
> considering that such a pattern is probably pretty uncommon --- we
> should be willing to handle it a bit slower to simplify other cases.
Is it worth the effort to pre-process the pattern?
For example:
%% -> %
%_ -> _%
If applied recursively, this would automatically cover:
%_% -> _%
_%_ -> __%
The 'benefit' would be that the pattern matching code would no longer
need an inner if statement.
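
A minimal sketch of such a pre-processing step, to make the rewrite rules
concrete. The function name normalize_like_pattern is hypothetical and is
not taken from the patch. It assumes backslash is the escape character and
that the caller supplies an output buffer at least as large as the input.
Scanning bytewise should be safe for UTF-8 because '%', '_' and '\' are
ASCII and never occur inside a multibyte sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch under discussion).
 * Rewrites every run of adjacent wildcards as "all the '_' first,
 * then at most one '%'", which is what applying  %% -> %  and
 * %_ -> _%  repeatedly converges to.
 *
 * Assumptions: backslash is the escape character; out has room for
 * strlen(pat) + 1 bytes (the result is never longer than the input).
 */
static void
normalize_like_pattern(const char *pat, char *out)
{
    size_t i = 0;
    size_t o = 0;

    assert(pat != NULL && out != NULL);

    while (pat[i] != '\0')
    {
        if (pat[i] == '%' || pat[i] == '_')
        {
            size_t underscores = 0;
            size_t k;
            int    saw_percent = 0;

            /* Consume the whole run of adjacent wildcards. */
            while (pat[i] == '%' || pat[i] == '_')
            {
                if (pat[i] == '_')
                    underscores++;
                else
                    saw_percent = 1;
                i++;
            }

            /* Emit the '_' first, then a single '%' if one was seen. */
            for (k = 0; k < underscores; k++)
                out[o++] = '_';
            if (saw_percent)
                out[o++] = '%';
        }
        else
        {
            /* Copy an escape and the byte it protects verbatim. */
            if (pat[i] == '\\' && pat[i + 1] != '\0')
                out[o++] = pat[i++];
            out[o++] = pat[i++];
        }
    }
    out[o] = '\0';
}
```

With this, the matcher would only ever see a '%' that is followed by a
literal or by the end of the pattern, at the cost of one extra pass over
the pattern per query.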
Also, I didn't see a response to my question about treating UTF-8
matching as a two-pass match. The first pass treats the string as bytes.
Only if the first pass matches does the second pass do a full
character-wise analysis. In the case of low selectivity, this will be a
win, as the primary filter would be the full-speed byte-based matching.
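
A rough sketch of the two-pass idea, under stated assumptions. The names
match and like_two_pass are hypothetical, there is no escape handling, and
the input is assumed to be valid NUL-terminated UTF-8. One wrinkle: a byte
pass in which '_' consumes exactly one byte would wrongly reject a pattern
such as '_' against a two-byte character, so this sketch relaxes '_' to
behave like '%' in the first pass. That way the first pass can only
over-accept, never reject a real match.

```c
#include <assert.h>
#include <stddef.h>

/* True if b is a UTF-8 continuation byte (bit pattern 10xxxxxx). */
static int
is_cont(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/*
 * Hypothetical recursive matcher, for illustration only.
 * strict == 0: byte pass; '_' is relaxed to act like '%'.
 * strict == 1: full pass; '_' consumes exactly one whole character
 *              and '%' advances one whole character at a time.
 * Worst-case running time is exponential in the number of wildcards.
 */
static int
match(const char *t, const char *p, int strict)
{
    for (; *p != '\0'; p++)
    {
        if (*p == '%' || (*p == '_' && !strict))
        {
            /* Try the rest of the pattern at every later position. */
            for (;;)
            {
                if (match(t, p + 1, strict))
                    return 1;
                if (*t == '\0')
                    return 0;
                t++;
                while (strict && is_cont((unsigned char) *t))
                    t++;
            }
        }
        else if (*p == '_')
        {
            /* Strict pass: skip one character, including its tail bytes. */
            if (*t == '\0')
                return 0;
            t++;
            while (is_cont((unsigned char) *t))
                t++;
        }
        else
        {
            /* Literal byte must match exactly. */
            if (*t != *p)
                return 0;
            t++;
        }
    }
    return *t == '\0';
}

/* Cheap byte pass first; the full pass runs only on survivors. */
static int
like_two_pass(const char *text, const char *pattern)
{
    if (!match(text, pattern, 0))
        return 0;
    return match(text, pattern, 1);
}
```

Whether this wins depends on how many rows survive the first pass: every
survivor is matched twice, so the gain comes only from rows the byte pass
rejects.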
I had also asked why the focus would be on high selectivity. Why would
the primary filter criterion of a properly designed SELECT statement be
a LIKE with high selectivity? The only time I have ever used LIKE is
when I expect low selectivity. Is there a reasonable case I am missing?
Cheers,
mark
--
mark(at)mielke(dot)cc / markm(at)ncf(dot)ca / markm(at)nortel(dot)com __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada
One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...