Re: LIKE with pattern containing backslash

From: Jack Orenstein <jack(dot)orenstein(at)hds(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, daniel(at)manitou-mail(dot)org
Cc: pgsql-general(at)postgresql(dot)org, Matt McDonald <matt(dot)mcdonald(at)hds(dot)com>
Subject: Re: LIKE with pattern containing backslash
Date: 2009-02-03 18:53:38
Message-ID: 498892B2.8090307@hds.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Tom Lane wrote:

> My Java is pretty weak, but doesn't it think that backslash is special
> in string literals? If I assume that each pair of \'s in your examples
> went to the database as one \, then the results are what I'd expect.
>
> regards, tom lane

Daniel Verite wrote:

> LIKE E'%\\%' will match a string that ends with a percent sign, not a string
> that contains a backslash. That's because the backslash acts additionally as
> the default escape character for LIKE patterns. You can add ESCAPE '' after
> the LIKE > statement to avoid that.
> Otherwise you get really two levels of different backslash interpretation
> here, one for the string parser and one for the LIKE operator, and that
> doesn't take into account any additional level needed if you embed the query
> into a source code string.

OK, let's skip the first variant of the test, which relies on escapes in java
strings. In the second variant, there are no literal strings involved at all,
because I create a String from a char[] in which the characters are specified
individually:

PreparedStatement find = connection.prepareStatement
("select id, x from test where x like ?");
String oneBackslash = new String(new char[]{'%', '\\', '%'});
find.setString(1, oneBackslash);
ResultSet resultSet = find.executeQuery());

In this code, The string passed to JDBC is %\% (percent, one backslash,
percent), and no rows are returned. It appears as if the one backslash is being
treated as an escape for the %. And if I add a row (3, a\%) then that row DOES
get returned.

So back to the documented behavior of LIKE: "Note that the backslash already has
a special meaning in string literals, so to write a pattern constant that
contains a backslash you must write two backslashes in an SQL statement". Is
this statement meant to apply to the *value* of the RHS operand of LIKE? Or to
strings expressed as literals in general?

By referring to "string literals" and "pattern constant", the doc sounds like it
is describing how escapes are handled in strings. My example above does not rely
on a string literal for the RHS of LIKE. If escape processing is supposed to
occur for the RHS of LIKE, regardless of how the pattern is expressed, then I
believe the doc is confusing. (It confused me, anyway.) Another possibility is
that the doc is correct, and that the driver is getting things wrong. For now,
I'm trying to understand what the correct behavior is.

Jack

Here is my entire test program. It runs standalone - just provide the database
name, username, password on the command line.

import java.sql.*;

public class LikeVsBackslash_varchar
{
public static void main(String[] args) throws Exception
{
String database = args[0];
String username = args[1];
String password = args[2];
Class.forName("org.postgresql.Driver").newInstance();
Connection connection =
DriverManager.getConnection(String.format("jdbc:postgresql:%s", database),
username, password);
Statement statement = connection.createStatement();
statement.execute("drop table if exists test");
statement.execute("create table test(id int, x varchar)");
PreparedStatement insert = connection.prepareStatement("insert into
test values(?, ?)");
PreparedStatement find = connection.prepareStatement("select id, x from
test where x like ?");
// Insert "a\b"
insert.setInt(1, 0);
insert.setString(2, new String(new char[]{'a', '\\', 'b'}));
insert.executeUpdate();
// Insert "a\\b"
insert.setInt(1, 1);
insert.setString(2, new String(new char[]{'a', '\\', '\\', 'b'}));
insert.executeUpdate();
// Insert "a\\\b"
insert.setInt(1, 2);
insert.setString(2, new String(new char[]{'a', '\\', '\\', '\\', 'b'}));
insert.executeUpdate();
// Insert "a\%"
insert.setInt(1, 3);
insert.setString(2, new String(new char[]{'a', '\\', '%'}));
insert.executeUpdate();
// Find rows with at least one backslash
String oneBackslash = new String(new char[]{'%', '\\', '%'});
find.setString(1, oneBackslash);
report(String.format("Rows matching %s", oneBackslash),
find.executeQuery());
// Find rows with at least two backslashes
String twoBackslashes = new String(new char[]{'%', '\\', '\\', '%'});
find.setString(1, twoBackslashes);
report(String.format("Rows matching %s", twoBackslashes),
find.executeQuery());
}

private static void report(String label, ResultSet resultSet) throws
SQLException
{
System.out.println(String.format("%s: {", label));
while (resultSet.next()) {
int id = resultSet.getInt(1);
String x = resultSet.getString(2);
System.out.println(String.format("%s: %s (%s backslashes)", id, x,
backslashes(x)));
}
System.out.println("}");
}

private static String toString(byte[] x)
{
StringBuffer buffer = new StringBuffer();
for (byte b : x) {
buffer.append((char)b);
}
return buffer.toString();
}

private static int backslashes(String x)
{
int count = 0;
for (int i = 0; i < x.length(); i++) {
if (x.charAt(i) == '\\') {
count++;
}
}
return count;
}
}

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Phoenix Kiula 2009-02-03 19:02:39 Re: Fastest way to drop an index?
Previous Message Mohamed 2009-02-03 18:50:46 Re: Adding Arabic dictionary for TSearch2.. to_tsvector('arabic'...) doesn't work..