LIKE with pattern containing backslash

From: Jack Orenstein <jack(dot)orenstein(at)hds(dot)com>
To: pgsql-general(at)postgresql(dot)org
Cc: Matt McDonald <matt(dot)mcdonald(at)hds(dot)com>
Subject: LIKE with pattern containing backslash
Date: 2009-02-03 16:35:50
Message-ID: 49887266.3090905@hds.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Suppose I have this table:

create table test(id int, x varchar)

And I want to find rows whose x contains at least one backslash. The backslash
has to be escaped according to section 9.7.1 of the postgres docs.

select *
from test
where x like E'%\\%'

I'll skip the results of my psql experimentation to avoid having to think about
escaping backslashes from the command-line, inside psql.

My test data set has three rows:

0 a\b
1 a\\b
2 a\\\b

I wrote a JDBC test program, with two variants.

1) Searching with literals, escaping the backslash, e.g.

statement.executeQuery("select id, x from test where x like E'%\\%'")

This turns up all three rows. But this:

statement.executeQuery("select id, x from test where x like E'%\\\\%'")

doesn't retrieve any rows. From the docs, I would expect the second query to
retrieve rows with ids 1 and 2.

2) Avoiding literals completely, I created a PreparedStatement, and bound
variables containing the patterns, e.g.

PreparedStatement find = connection.prepareStatement("select id, x from
test where x like ?");
String oneBackslash = new String(new byte[]{'%', '\\', '%'});
find.setString(1, oneBackslash);
ResultSet resultSet = find.executeQuery();

Now, searching for %\% turns up nothing, while searching for %\\% turns up all
three rows.

BOTH behaviors seem wrong to me. In the first test (pattern specified as a
literal), it looks like %\\% is not matching strings that do contain two
backslashes. In the second test (pattern specified as a bound variable), it
looks like the first slash in each pattern is interpreted as an escape. Which I
didn't expect for a bound variable. Section 9.7.1 says "Note that the backslash
already has a special meaning in string literals ..." This seems to apply to
literals only, not to bound variables. And the need to have escapes in a bound
variable escapes me (so to speak).

Can someone help in my understanding of what is correct behavior (in the literal
and bound variable cases)? Is there a bug in the driver? in postgres? in the
docs? Or in my understanding?

Jack Orenstein

P.S. If you want to play with this, I can send you my test programs for the
cases described above.

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alex Neth 2009-02-03 16:41:50 Re: Full text index not being used
Previous Message rhubbell 2009-02-03 16:32:15 Re: calculating elapsed times between timestamps