encoding aliases

From: Vivek Khera <vivek(at)khera(dot)org>
To: Postgresql-General General <pgsql-general(at)postgresql(dot)org>
Subject: encoding aliases
Date: 2006-03-15 16:33:25
Message-ID: 079E28C2-3ED4-4514-AF95-8A341D7F6AF6@khera.org
Lists: pgsql-general

We're developing a DB that will be storing email messages. The clear
winner for the DB encoding is UTF8. However, I will need to set the
proper client encoding based on the encoding as defined in the email
message.

Looking at the docs
(http://www.postgresql.org/docs/8.1/static/multibyte.html), there are
many encodings that I can use for the client. However, they do not
match the canonical names used in email. For example, WINDOWS-1252
is accepted, presumably as an alias
for WIN1252, though it is not listed as an alias. The commentary in
utils/mb/encnames.c indicates that the dashes are irrelevant, so we
know ISO-8859-1 and ISO88591 are equivalent.
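
The name matching described by that commentary can be sketched roughly as
follows. This is a minimal illustration in Python, not the server's code:
the function name normalize_encoding_name is hypothetical, and the sketch
assumes the comparison ignores case and every non-alphanumeric character,
not just dashes.

```python
def normalize_encoding_name(name: str) -> str:
    """Return a comparison key for an encoding name.

    Assumption: only ASCII letters and digits matter, and case is ignored.
    """
    return "".join(ch.lower() for ch in name if ch.isascii() and ch.isalnum())


if __name__ == "__main__":
    # Names that differ only in punctuation or case share one key.
    print(normalize_encoding_name("ISO-8859-1") == normalize_encoding_name("ISO88591"))
    # WINDOWS-1252 and WIN1252 do not share a key under this rule.
    print(normalize_encoding_name("WINDOWS-1252") == normalize_encoding_name("WIN1252"))
```

Under this sketch WINDOWS-1252 reduces to windows1252, which is not the
same key as win1252, so dash stripping alone would not explain why that
name is accepted; it would need its own alias entry.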

I've only tried a handful of encoding values found in email so far,
but the only one that is not accepted is US-ASCII.

My only concern is whether names like WINDOWS-1252 really are aliases
for WIN1252. What would make this 100% clear is if "SHOW
client_encoding" reported the canonical name rather than the name
passed to it. The source shows that it is an alias, but the docs do not.

So, is it fair to assume that the longer form names are safe to use
(i.e., should I submit a doc patch)?

And does it make sense to make US-ASCII an alias for SQL-ASCII?
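
As long as US-ASCII is rejected, the client can apply the same mapping
itself before setting the client encoding. A minimal sketch in Python,
assuming a hypothetical override table (CHARSET_OVERRIDES) and a
hypothetical helper (client_encoding_for_charset); the single entry
mirrors the alias proposed above and is an assumption, not documented
server behavior.

```python
# Hypothetical client-side overrides: email charset -> name sent to the server.
# The only entry mirrors the proposed US-ASCII alias; it is an assumption.
CHARSET_OVERRIDES = {"US-ASCII": "SQL_ASCII"}


def client_encoding_for_charset(charset: str) -> str:
    """Return the encoding name to use for a charset taken from an email.

    Names without an override are passed through with whitespace trimmed,
    leaving it to the server to accept or reject them.
    """
    trimmed = charset.strip()
    return CHARSET_OVERRIDES.get(trimmed.upper(), trimmed)
```

Every charset other than US-ASCII passes through untouched, so the
server's own alias handling still decides whether a name is valid.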
