From: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
---|---|
To: | PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Possible solution for LIKE optimization |
Date: | 2001-08-05 22:28:20 |
Message-ID: | Pine.LNX.4.30.0108052253470.10846-100000@peter.localdomain |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I have had an idea how the LIKE optimization problem could be solved.
The problem was that given the expression
A LIKE (P || '%')
where A is a column reference and P is a constant not containing
wildcards, we wanted to find X and Y such that
X <= A and A <= Y
where X and Y are calculated from P and the bound is usefully tight.
Note that X <= A is really strcoll(X, A) <= 0 and strcoll(X, A) is really
strcmp(strxfrm(X), strxfrm(A)) (the prototype of strxfrm() is different in
C, of course).
Let $<=$ be a non-locale based comparison (i.e., plain strcmp()).
The we can say that
strxfrm(P) $<=$ strxfrm(A) and
strxfrm(A) $<=$ (strxfrm(P) + 1)
where "+ 1" means adding 1 to the last character and accounting for
overflow (like considering the string a base-256 number). Basically, this
is the funny-collation-safe equivalent to A LIKE 'foo%' yielding 'foo' <=
A <= 'fop'.
We'd need to implement the strxfrm() function in SQL and the $<=$
operator, both of which are trivial. The index would have to be in terms
of strxfrm(). There might be other issues, but they could be solved
algorithmically, I suppose.
What do you think?
--
Peter Eisentraut peter_e(at)gmx(dot)net http://funkturm.homeip.net/~peter
From | Date | Subject | |
---|---|---|---|
Next Message | Digital Wokan | 2001-08-05 22:44:19 | Sorry about that unsubscribe |
Previous Message | Peter Eisentraut | 2001-08-05 20:37:18 | Re: Patch for Improved Syntax Error Reporting |