non-static LIKE patterns

From: patrick keshishian <pkeshish(at)gmail(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: non-static LIKE patterns
Date: 2012-04-11 22:08:34
Message-ID: CAN0yQBqy3m7Cko62=wtR3kTJR=+WKZZRCLxH5oqnA4jTmn5Rsg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi all,

I'm sure this has been discussed before, but I am not too sure what
key search-terms to use to find any potentially relevant discussions.

Issue: I have two tables, each has a column that contains a directory
path. First table contains a starting path and the second holds
sub-paths (retaining full path names from root directory). In short,
first table entries are sub-strings of those found in the second
table.

I need to match entries in second table to the first, so I use the
following in my WHERE clause:

... WHERE second.path LIKE first.path||'%'

This seemed to work at first, but it fails if the paths use
back-slashes (like Windows paths).

Following is a simple test-case to illustrate what I described.

PostgreSQL 9.1.1 (similar results with much older version)

$ psql -d db -e < testcase.sql
CREATE TEMPORARY TABLE foo (id INTEGER, a TEXT);
CREATE TABLE
CREATE TEMPORARY TABLE bar (id INTEGER, b TEXT);
CREATE TABLE
INSERT INTO foo VALUES (0, '/root/a/b');
INSERT 8030228 1
INSERT INTO foo VALUES (1, '\root\a\b');
INSERT 8030229 1
INSERT INTO bar VALUES (0, '/root/a/b/c/*nix');
INSERT 8030230 1
INSERT INTO bar VALUES (1, '\root\a\b\c\Windows');
INSERT 8030231 1
SELECT * FROM foo;
id | a
----+-----------
0 | /root/a/b
1 | \root\a\b
(2 rows)

SELECT * FROM bar;
id | b
----+---------------------
0 | /root/a/b/c/*nix
1 | \root\a\b\c\Windows
(2 rows)

SELECT a,b, b LIKE a||'%' FROM foo JOIN bar USING (id);
a | b | ?column?
-----------+---------------------+----------
/root/a/b | /root/a/b/c/*nix | t
\root\a\b | \root\a\b\c\Windows | f
(2 rows)

Hmm... just tried these two cases as well which seem interesting:

SELECT '\root\a\b\c\Windows' LIKE '\root\a\b'||'%';
?column?
----------
f
(1 row)

mod=# SELECT '\root\a\b\c\Windows' LIKE '\\root\\a\\b'||'%';
?column?
----------
t
(1 row)

Is this a bug in the SQL statement, or a bug in PostgreSQL? If the
former, what is the correct way to do this? If the latter, is there a
work-around?

I realize the same thing can be done with the following statement, but
it is harder to read and might be slightly more expensive to run on a
large data set.

SELECT a,b,substr(b,1,length(a)), substr(b,1,length(a)) = a FROM foo
JOIN bar USING (id);
a | b | substr | ?column?
-----------+---------------------+-----------+----------
/root/a/b | /root/a/b/c/*nix | /root/a/b | t
\root\a\b | \root\a\b\c\Windows | \root\a\b | t
(2 rows)

Thanks,
--patrick

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2012-04-11 23:07:28 Re: non-static LIKE patterns
Previous Message Michael Nolan 2012-04-11 19:09:53 Fwd: [HACKERS] [streaming replication] 9.1.3 streaming replication bug ?