Skip site navigation (1) Skip section navigation (2)

Re: "_" in a serach pattern

From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Jessica Richard <rjessil(at)yahoo(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: "_" in a serach pattern
Date: 2007-07-22 04:41:34
Message-ID: 20070722044134.GA55486@winnie.fuhr.org (view raw or flat)
Thread:
Lists: pgsql-admin
On Fri, Jul 20, 2007 at 09:06:53AM -0700, Jessica Richard wrote:
> I have a huge table that has a char(80) name column (indexed).

Do you really nead char(N) instead of varchar(N) or text?

> The name pattern is like 
> 
> A_B_C-D.123.456.pdf
> A_B_C-D.123.333.doc.2
> 
> When I select name from table where name like 'A_B%', it was doing a
> table scan;

What does EXPLAIN ANALYZE show?  When you created the database,
what locale did you use?  What do "show lc_ctype" and "show
server_encoding" return?

> I guess the underscore "_" was treated like a special character,
> instead of a character "_".

Yes -- see the Pattern Matching section of the documentation:

http://www.postgresql.org/docs/8.2/interactive/functions-matching.html#FUNCTIONS-LIKE

> When I use select name from table where name like 'A\\_B%', it was doing
> an index scan with my result returned very fast,

What does EXPLAIN ANALYZE show for this query?

> but I got a warning about the \\:
> 
> WARNING:  nonstandard use of \\ in a string literal
> LINE 1: ...elect name from table where file_name like 'A\\_B...
>                                                              ^
> HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.

As the hint suggests, you can avoid the warning by using E'A\\_B%'.

> By the way, the wild card works if it is at the end of the query, it
> does not seem to work if it is in the middle:
> 
> select name from table where name like 'A\\_B%'  -- fine;
> 
> select name from table were name like 'A\\_B%.pdf'; -- returns nothing.

This is probably due to the column type being char(N) instead of
varchar(N) or text:

test=> select 'abc'::char(3) like 'a%c';
 ?column? 
----------
 t
(1 row)

test=> select 'abc'::char(4) like 'a%c';
 ?column? 
----------
 f
(1 row)

test=> select 'abc'::char(4) like 'a%c%';
 ?column? 
----------
 t
(1 row)

test=> select 'abc'::char(4) like 'a%c ';
 ?column? 
----------
 t
(1 row)

I had thought that char(N)'s padding spaces were supposed to be
semantically insignificant but I don't know if that applies to
pattern matching.

> questions:
> 
> 1. How do I get rid of the nonstandard warning, but still using the index search?

Do what the HINT suggests: use E'pattern' instead of 'pattern', or
use dollar quotes.

http://www.postgresql.org/docs/8.2/interactive/sql-syntax-lexical.html#SQL-SYNTAX-DOLLAR-QUOTING

If you're using a non-C locale then you'll need to create an index
using one of the pattern_ops operator classes to get pattern matches
to use an index.

http://www.postgresql.org/docs/8.2/interactive/indexes-opclass.html

> 2. How do I search with a wild card % in the middle of the pattern? Would
>  varchar(80) or char(80) make a difference about the wild card search
>  (% in the middle)?

See the above examples.  I'd recommend using varchar(N) or text
unless you have a good reason to use char(N).

-- 
Michael Fuhr

In response to

pgsql-admin by date

Next:From: Peter KoczanDate: 2007-07-22 05:54:04
Subject: Re: "_" in a serach pattern
Previous:From: Simon RiggsDate: 2007-07-21 11:26:50
Subject: Re: row level locking

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group