Re: [GENERAL] 'a' == 'a '

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Dann Corbit <DCorbit(at)connx(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Richard_D_Levine(at)raytheon(dot)com, general(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [GENERAL] 'a' == 'a '
Date: 2005-10-25 02:43:16
Message-ID: 200510250243.j9P2hGJ07683@candle.pha.pa.us
Lists: pgsql-hackers

Dann Corbit wrote:
> > But isn't collating sequence related to ordering? How does this relate
> > to padding?
>
> Right. Collating sequence is how ordering is defined. But when you
> compare two character types, they are supposed to pad according to the
> collating sequence. So whether you blank fill or pad with some special
> character when performing a comparison is defined by the collating
> sequence and not by the character type. Since we see (for instance)
> that bpchar(n) and varchar(n) pad differently when performing a
> comparison, we must assume that they have a different collating
> sequence. So the question is "what is it?"
>
> It is always possible that I have misread the standard.

OK, I understand now. It is tempting to think that the difference
between char() and varchar() is that internally they use different
collating sequences, but that isn't the case. If it were, space would
be ignored during comparisons anywhere in the string, when in fact it
is only trailing space that char() ignores, e.g.:

test=> SELECT 'a '::CHAR(10) = 'a'::CHAR(10);
?column?
----------
t
(1 row)

test=> SELECT 'a '::VARCHAR(10) = 'a'::VARCHAR(10);
?column?
----------
f
(1 row)

test=> SELECT 'a'::CHAR(10) = ' a'::CHAR(10);
?column?
----------
f
(1 row)

test=> SELECT 'a'::VARCHAR(10) = ' a'::VARCHAR(10);
?column?
----------
f
(1 row)
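
A further check, sketched here and not part of the session above, puts
the space in the middle of the string. Assuming char() comparison strips
only trailing blanks, the first column should come back false and the
second true; a collating sequence that ignored spaces everywhere would
make both true. The column aliases are illustrative only.

```sql
-- Embedded vs. trailing space under char(): only the trailing case
-- is expected to compare equal.
SELECT 'a b'::CHAR(10) = 'ab'::CHAR(10) AS embedded_space,
       'ab '::CHAR(10) = 'ab'::CHAR(10) AS trailing_space;
```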

Our docs already have:

http://candle.pha.pa.us/main/writings/pgsql/sgml/datatype-character.html

Values of type character are physically padded with spaces to the
specified width n, and are stored and displayed that way. However, the
padding spaces are treated as semantically insignificant. Trailing
spaces are disregarded when comparing two values of type character, and
they will be removed when converting a character value to one of the
other string types. Note that trailing spaces are semantically
significant in character varying and text values.
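
The conversion rule in that paragraph can be tried directly. This is a
sketch, not output from a real session: if the trailing space is dropped
when the char() value is cast to varchar(), the comparison below should
be true, even though the same comparison on plain varchar() values was
false above. The column alias is illustrative only.

```sql
-- Cast a char(10) value to varchar(10); per the quoted docs the pad
-- spaces should be removed by the conversion, leaving just 'a'.
SELECT 'a '::CHAR(10)::VARCHAR(10) = 'a'::VARCHAR(10) AS equal_after_cast;
```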

What additional documentation is needed?

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
