Re: Bug in concat operator for Char? -- More Info

From: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: josh(at)agliodbs(dot)com, pgsql-bugs(at)postgresql(dot)org, glenniii(at)mail(dot)utexas(dot)edu
Subject: Re: Bug in concat operator for Char? -- More Info
Date: 2004-07-21 15:49:38
Message-ID: 20040721082410.F62747@megazone.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs


On Wed, 21 Jul 2004, Tom Lane wrote:

> Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com> writes:
> > On Wed, 21 Jul 2004, Tom Lane wrote:
> >> Yes. We've been moving more and more steadily towards the notion that
> >> trailing spaces in char(n) values are insignificant noise. If you think
> >> that trailing spaces are significant, you shouldn't be using char(n)
> >> to store them.
>
> > Well, the problem here is that technically we're returning the wrong type.
> > We should be returning a char(l1+l2) rather than a text for a char
> > concatenate, but similarly to the recent complaint about numerics, we
> > don't really have a fully proper way to do that and it seems non-trivial.
>
> Well, it'd be trivial to implement a char || char yielding char
> operator; it could just point to the existing textcat function and
> you'd get what you want. (It would come out as char(-1), ie unspecified
> length, but I'm not buying into doing the kind of analysis it would take
> to predict the length.) The real question in my mind is whether that

The reason that to do it completely means knowing the length comes from
case and union afaics. Both of these need to do something consistent with
the lengths.

case when <blah> then 'f'::char(2) || 'g'::char(2) else
'f'::char(3) || 'g'::char(3) end
should return a consistent length char no matter which branch is taken on
any given row.

This was the basic complaint with numeric in the -sql thread, we return
the "correct" actual numeric values with proper seeming precision and
scale, but if you then case two of these that gave different precision and
scale, you'd get inconsistent scale in the case output.

> Food for thought: in 7.4,
>
> regression=# select ('X '::char) = ('X'::char);
> ?column?
> ----------
> t
> (1 row)
>
> regression=# select ('Y '::char) = ('Y'::char);
> ?column?
> ----------
> t
> (1 row)
>
> regression=# select ('X '::char || 'Y '::char) = ('X'::char || 'Y'::char);
> ?column?
> ----------
> t
> (1 row)
>
> If we change || as is proposed in this thread, then the last case would
> yield 'false', because the first concatenation would yield 'X Y '
> which is not equal to 'XY' no matter what you think about trailing
> spaces. I find it a bit disturbing that the concatenation of equal
> values would yield unequal values.

That is somewhat bad, yeah.

> IMHO the bottom line here is that the SQL-spec behavior of type char(N)
> is completely brain-dead. Practically all of the questions in this area
> would go away if people used varchar(N) or text to store their data.

It is fairly wierd, yes. I'm not sure if the spec lets you, but a NO PAD
default character set probably would have made this simpler, by not
requiring that 'Y'::char(4) is equal to 'Y'::char(2), but it's really too
late to change that now in any case.

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Andreas Pflug 2004-07-21 16:13:32 Re: Bug in concat operator for Char? -- More Info
Previous Message Tom Lane 2004-07-21 15:22:26 Re: Bug in concat operator for Char? -- More Info