Re: "_" in a serach pattern

From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Jessica Richard <rjessil(at)yahoo(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: "_" in a serach pattern
Date: 2007-07-22 04:41:34
Message-ID: 20070722044134.GA55486@winnie.fuhr.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Fri, Jul 20, 2007 at 09:06:53AM -0700, Jessica Richard wrote:
> I have a huge table that has a char(80) name column (indexed).

Do you really nead char(N) instead of varchar(N) or text?

> The name pattern is like
>
> A_B_C-D.123.456.pdf
> A_B_C-D.123.333.doc.2
>
> When I select name from table where name like 'A_B%', it was doing a
> table scan;

What does EXPLAIN ANALYZE show? When you created the database,
what locale did you use? What do "show lc_ctype" and "show
server_encoding" return?

> I guess the underscore "_" was treated like a special character,
> instead of a character "_".

Yes -- see the Pattern Matching section of the documentation:

http://www.postgresql.org/docs/8.2/interactive/functions-matching.html#FUNCTIONS-LIKE

> When I use select name from table where name like 'A\\_B%', it was doing
> an index scan with my result returned very fast,

What does EXPLAIN ANALYZE show for this query?

> but I got a warning about the \\:
>
> WARNING: nonstandard use of \\ in a string literal
> LINE 1: ...elect name from table where file_name like 'A\\_B...
> ^
> HINT: Use the escape string syntax for backslashes, e.g., E'\\'.

As the hint suggests, you can avoid the warning by using E'A\\_B%'.

> By the way, the wild card works if it is at the end of the query, it
> does not seem to work if it is in the middle:
>
> select name from table where name like 'A\\_B%' -- fine;
>
> select name from table were name like 'A\\_B%.pdf'; -- returns nothing.

This is probably due to the column type being char(N) instead of
varchar(N) or text:

test=> select 'abc'::char(3) like 'a%c';
?column?
----------
t
(1 row)

test=> select 'abc'::char(4) like 'a%c';
?column?
----------
f
(1 row)

test=> select 'abc'::char(4) like 'a%c%';
?column?
----------
t
(1 row)

test=> select 'abc'::char(4) like 'a%c ';
?column?
----------
t
(1 row)

I had thought that char(N)'s padding spaces were supposed to be
semantically insignificant but I don't know if that applies to
pattern matching.

> questions:
>
> 1. How do I get rid of the nonstandard warning, but still using the index search?

Do what the HINT suggests: use E'pattern' instead of 'pattern', or
use dollar quotes.

http://www.postgresql.org/docs/8.2/interactive/sql-syntax-lexical.html#SQL-SYNTAX-DOLLAR-QUOTING

If you're using a non-C locale then you'll need to create an index
using one of the pattern_ops operator classes to get pattern matches
to use an index.

http://www.postgresql.org/docs/8.2/interactive/indexes-opclass.html

> 2. How do I search with a wild card % in the middle of the pattern? Would
> varchar(80) or char(80) make a difference about the wild card search
> (% in the middle)?

See the above examples. I'd recommend using varchar(N) or text
unless you have a good reason to use char(N).

--
Michael Fuhr

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Peter Koczan 2007-07-22 05:54:04 Re: "_" in a serach pattern
Previous Message Simon Riggs 2007-07-21 11:26:50 Re: row level locking