Skip site navigation (1) Skip section navigation (2)

Comparison semantics of CHAR data type

From: Thomas Fanghaenel <tfanghaenel(at)salesforce(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Comparison semantics of CHAR data type
Date: 2013-09-23 00:51:26
Message-ID: CAK+WP1xdmyswEehMuetNztM4H199Z1w9KWRHVMKzyyFM+hV=zA@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-hackerspgsql-sql
I was wondering about the proper semantics of CHAR comparisons in some
corner cases that involve control characters with values that are less than
0x20 (space).

Consider the following testcase:

===
create table t (a int, b char(10));

insert into t values (1, 'foo');
insert into t values (2, 'foo ');
insert into t values (3, E'foo\t');
insert into t values (4, E'foo\n');
insert into t values (5, E'foo \n');
insert into t values (6, 'foobar');

select * from t order by b;
===

What's the proper order of these string values in the CHAR domain?  The way
I interpret the SQL Standard (and assuming that \t and \n collate lower
than a space), it's supposed to be this:

  (3) < (4) < (5) < (1) <= (2) < (6)

Postgres comes up with this:

  (1) <= (2) < (3) < (4) < (5) < (6)

The reason is that the bpchar functions that implement the relative
comparison operators for CHAR(n) effectively strip trailing whitespaces
before doing the comparison.  One might argue that doing this is not
correct.  The standard seems to mandate that all CHAR(n) values are
actually considered to be of width n, and that trailing spaces are indeed
relevant for comparison.  In other words, stripping them would only be
possible if it can be guaranteed that there are no characters in the
character set that collate lower than a space.

Any thoughts on this?  I searched the mailing list archives, but couldn't
find any relevant discussion.  There were plenty of threads that argue
whether or not it's semantically correct to strip trailing spaces from
CHAR(n) values, but the issue of characters collating below a space does
not seem to have brought up in any of those discussions before.

Cheers,

-- Thomas

Responses

pgsql-hackers by date

Next:From: Pavel StehuleDate: 2013-09-23 04:40:17
Subject: Re: Assertions in PL/PgSQL
Previous:From: Stas KelvichDate: 2013-09-22 23:38:48
Subject: Cube extension kNN support

pgsql-sql by date

Next:From: Luca FerrariDate: 2013-09-23 06:45:51
Subject: Re: the value of OLD on an initial row insert
Previous:From: James SharrettDate: 2013-09-20 16:43:47
Subject: the value of OLD on an initial row insert

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group