Comparison semantics of CHAR data type

From: Thomas Fanghaenel <tfanghaenel(at)salesforce(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Comparison semantics of CHAR data type
Date: 2013-09-23 00:51:26
Message-ID: CAK+WP1xdmyswEehMuetNztM4H199Z1w9KWRHVMKzyyFM+hV=zA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-sql

I was wondering about the proper semantics of CHAR comparisons in some
corner cases that involve control characters with values that are less than
0x20 (space).

Consider the following testcase:

===
create table t (a int, b char(10));

insert into t values (1, 'foo');
insert into t values (2, 'foo ');
insert into t values (3, E'foo\t');
insert into t values (4, E'foo\n');
insert into t values (5, E'foo \n');
insert into t values (6, 'foobar');

select * from t order by b;
===

What's the proper order of these string values in the CHAR domain? The way
I interpret the SQL Standard (and assuming that \t and \n collate lower
than a space), it's supposed to be this:

(3) < (4) < (5) < (1) <= (2) < (6)

Postgres comes up with this:

(1) <= (2) < (3) < (4) < (5) < (6)

The reason is that the bpchar functions that implement the relative
comparison operators for CHAR(n) effectively strip trailing whitespaces
before doing the comparison. One might argue that doing this is not
correct. The standard seems to mandate that all CHAR(n) values are
actually considered to be of width n, and that trailing spaces are indeed
relevant for comparison. In other words, stripping them would only be
possible if it can be guaranteed that there are no characters in the
character set that collate lower than a space.

Any thoughts on this? I searched the mailing list archives, but couldn't
find any relevant discussion. There were plenty of threads that argue
whether or not it's semantically correct to strip trailing spaces from
CHAR(n) values, but the issue of characters collating below a space does
not seem to have brought up in any of those discussions before.

Cheers,

-- Thomas

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2013-09-23 04:40:17 Re: Assertions in PL/PgSQL
Previous Message Stas Kelvich 2013-09-22 23:38:48 Cube extension kNN support

Browse pgsql-sql by date

  From Date Subject
Next Message Luca Ferrari 2013-09-23 06:45:51 Re: the value of OLD on an initial row insert
Previous Message James Sharrett 2013-09-20 16:43:47 the value of OLD on an initial row insert