Skip site navigation (1) Skip section navigation (2)

LIKE, CHAR(), and trailing spaces

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: LIKE, CHAR(), and trailing spaces
Date: 2011-02-02 23:54:42
Message-ID: 201102022354.p12NsgR06340@momjian.us (view raw or flat)
Thread:
Lists: pgsql-hackers
I found a little LIKE/CHAR() surprise --- below is a table and query
against a CHAR(10) field:

	test=> CREATE TABLE test (x char(10));
	CREATE TABLE

	test=> INSERT INTO test values ('hi');
	INSERT 0 1

	test=> SELECT * FROM test WHERE x = 'hi';
	     x
	------------
	 hi
	(1 row)

The above works because both sides are converted to 'bpchar';  explain
shows that:

	test=> EXPLAIN SELECT * FROM test WHERE x = 'hi';
	                      QUERY PLAN
	------------------------------------------------------
	 Seq Scan on test  (cost=0.00..33.12 rows=9 width=14)
	   Filter: (x = 'hi'::bpchar)
	                      ^^^^^^
	(2 rows)

The following does not work:
	
	test=> SELECT * FROM test WHERE x LIKE 'hi';
	 x
	---
	(0 rows)


It seems LIKE is considering the trailing CHAR(10) field spaces as
significant, even though our documentations says:

    Values of type <type>character</type> are physically padded
    with spaces to the specified width <replaceable>n</>, and are
    stored and displayed that way.  However, the padding spaces are
    treated as semantically insignificant.  Trailing spaces are
--> disregarded when comparing two values of type <type>character</type>,
    and they will be removed when converting a <type>character</type> value
    to one of the other string types.  Note that trailing spaces
    <emphasis>are</> semantically significant in
    <type>character varying</type> and <type>text</type> values.

It says trailing spaces are not significant for character comparisons
--- the real question is whether LIKE is a comparison.  Obvioiusly '='
is a comparison, but the system does not treat LIKE as a comparison in
terms of trailing spaces.  Is that desired behavior?

I did an EXPLAIN on the query and found '~~' was being used and 'hi' was
being converted to text:

	test=> explain select * from test where x like 'hi';
	                      QUERY PLAN
	------------------------------------------------------
	 Seq Scan on test  (cost=0.00..33.12 rows=9 width=14)
	   Filter: (x ~~ 'hi'::text)
                      ^^       ^^^^
	(2 rows)

so I then checked psql \do to see what operators there were for ~~:

	test=> \do ~~
	                                     List of operators
	   Schema   | Name | Left arg type | Right arg type | Result type |       Description
	------------+------+---------------+----------------+-------------+-------------------------
	 pg_catalog | ~~   | bytea         | bytea          | boolean     | matches LIKE expression
-->	 pg_catalog | ~~   | character     | text           | boolean     | matches LIKE expression
	 pg_catalog | ~~   | name          | text           | boolean     | matches LIKE expression
	 pg_catalog | ~~   | text          | text           | boolean     | matches LIKE expression
	(4 rows)

The one marked matches the arguments so it seems the comparison being
done is not character and character, but character and text.

I realize trim() could be used to get the desired behavior, but is our
behavior consistent?

-- 
  Bruce Momjian  <bruce(at)momjian(dot)us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Responses

pgsql-hackers by date

Next:From: Greg SmithDate: 2011-02-03 00:03:06
Subject: Re: [HACKERS] Slow count(*) again...
Previous:From: Tom LaneDate: 2011-02-02 23:42:56
Subject: Re: ALTER EXTENSION UPGRADE, v3

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group