Re: Array access to type "name"

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Array access to type "name"
Date: 2003-04-27 17:26:22
Message-ID: Pine.LNX.4.44.0304271823240.2298-100000@peter.localdomain
Lists: pgsql-hackers

Tom Lane writes:

> I'm not having any luck duplicating that here, but in any case what the
> above suggests to me is lack of robustness in the output conversion
> chain for type "char". Or do you want to legislate that byte values
> corresponding to the first bytes of multibyte character sequences are
> illegal values for type "char"? I'd have a problem with that ...

I think it comes down to defining what we really want. Clearly, "char" is
a byte, not a character, much like in C. Perhaps we should adopt the
bytea escape mechanism for "char" values above 127. Otherwise, what gets
stored and what gets printed out both depend on character set conversion
issues, which seems yucky.
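
To make the escape idea concrete, here is a minimal sketch in Python (not PostgreSQL source). The function name escape_char_byte is hypothetical. It assumes the output rule would mirror bytea's escape format: printable ASCII is shown as-is, a backslash is doubled, and every other byte, including everything above 127, becomes a three-digit octal escape. The proposal above only covers values above 127; the handling of control bytes and backslash is an assumption borrowed from bytea.

```python
def escape_char_byte(value: int) -> str:
    """Render a single byte in a bytea-style escape form (assumed rule set)."""
    if not 0 <= value <= 255:
        raise ValueError("a byte must be in the range 0..255")
    if value == 0x5C:
        # A literal backslash is doubled so that it cannot be mistaken
        # for the start of an octal escape.
        return "\\\\"
    if 0x20 <= value <= 0x7E:
        # Printable ASCII needs no escaping.
        return chr(value)
    # Everything else (control bytes and bytes above 127) is written as
    # a backslash followed by three octal digits.
    return "\\%03o" % value


if __name__ == "__main__":
    # 0xC3 is the first byte of the UTF-8 encoding of U+00E5.
    for b in (ord("a"), 0x5C, 0xC3):
        print(hex(b), "->", escape_char_byte(b))
```

With a rule like this, the printed form of a "char" value would depend only on the byte itself and not on the client or server encoding.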

Now you can define name[x] to be the x'th *byte* of name, but that seems
contrived and inconsistent with the original purpose, because whether you
get useful or garbage values depends on the character set encoding. If
you want to select the x'th character, use substring(); if you want access
to bytes, use bytea. The character set encoding is an internal matter
that should not be accessible to users.
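
The difference between the x'th byte and the x'th character can be illustrated outside the database. The following is a small Python sketch, not PostgreSQL code; the helper names nth_byte and nth_char are hypothetical, and the sample string is the table name used in the example further down.

```python
NAME = "\u00e5land"  # the identifier "åland", written with an escape for portability


def nth_byte(data: bytes, n: int) -> int:
    """Return the n'th byte (1-based) of an already encoded string."""
    return data[n - 1]


def nth_char(text: str, n: int) -> str:
    """Return the n'th character (1-based), independent of any encoding."""
    return text[n - 1]


def is_valid_utf8(data: bytes) -> bool:
    """Report whether the bytes form a complete, valid UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    utf8 = NAME.encode("utf-8")          # two bytes for the first character
    latin1 = NAME.encode("iso-8859-1")   # one byte for the first character

    # The same index yields different values depending on the encoding.
    print(hex(nth_byte(utf8, 1)), hex(nth_byte(latin1, 1)))

    # Character access gives the same answer regardless of encoding.
    print(nth_char(NAME, 1))

    # The first UTF-8 byte alone is not a convertible character.
    print(is_valid_utf8(bytes([nth_byte(utf8, 1)])))
```

Under UTF-8 the first byte is only the lead byte of a two-byte sequence, so it cannot be converted to another character set on its own, whereas under ISO8859-1 the first byte is the whole character.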

Btw., the issue is even a bit more serious than the example I posted:

$ dropdb test
$ createdb -E UNICODE test
$ psql test
=> create table åland (a int);
=> \d
ERROR: Could not convert UTF-8 to ISO8859-1

(Latest sources.)

--
Peter Eisentraut peter_e(at)gmx(dot)net
