Automatic detection of client encoding

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Automatic detection of client encoding
Date: 2003-05-28 21:56:07
Message-ID: Pine.LNX.4.44.0305281722190.2023-100000@peter.localdomain
Lists: pgsql-hackers

It is a common problem that a server uses a nontrivial character set
encoding (e.g., Unicode) but users forget to set an appropriate
client-side encoding. They then get garbled output for non-ASCII
characters because their client isn't actually prepared for Unicode.

There is a standard interface (SUSv2), nl_langinfo(CODESET), for detecting
the character set based on the locale settings. I suggest we use this (if
available) in applications like psql and pg_dump by default, unless it is
overridden by the usual mechanisms. If the character set name obtained
this way is not recognized by PostgreSQL, we fall back to SQL_ASCII.

Here's a piece of code that shows how this would work:

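#include <stdio.h>
#include <locale.h>
#include <langinfo.h>

int
main(int argc, char *argv[])
{
    /* Adopt the locale settings from the user's environment. */
    setlocale(LC_ALL, "");

    /* Print the character set name implied by that locale. */
    printf("%s\n", nl_langinfo(CODESET));
    return 0;
}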

(LC_CTYPE is the governing category for this.)
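
To make the proposed fallback concrete, here is a minimal sketch of how a client might choose its encoding. The function names and the mapping table are hypothetical and only illustrative, not PostgreSQL's actual code or its full list of encodings, and PGCLIENTENCODING is assumed to stand in for "the usual mechanisms":

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <strings.h>
#include <locale.h>
#include <langinfo.h>

/* Hypothetical, deliberately short mapping; a real table would be longer. */
struct codeset_map
{
    const char *codeset;      /* name as reported by nl_langinfo(CODESET) */
    const char *pg_encoding;  /* assumed PostgreSQL encoding name */
};

static const struct codeset_map codeset_table[] = {
    {"UTF-8", "UTF8"},
    {"ISO-8859-1", "LATIN1"},
    {"ISO-8859-2", "LATIN2"},
    {"EUC-JP", "EUC_JP"},
    {NULL, NULL}              /* sentinel */
};

/* Map a locale codeset name to an encoding name; unknown names give SQL_ASCII. */
const char *
encoding_from_codeset(const char *codeset)
{
    const struct codeset_map *m;

    assert(codeset != NULL);
    for (m = codeset_table; m->codeset != NULL; m++)
    {
        /* Compare case-insensitively, since spelling varies across systems. */
        if (strcasecmp(codeset, m->codeset) == 0)
            return m->pg_encoding;
    }
    return "SQL_ASCII";
}

/* Pick the client encoding: explicit override first, then the locale. */
const char *
detect_client_encoding(void)
{
    const char *override = getenv("PGCLIENTENCODING");

    if (override != NULL && override[0] != '\0')
        return override;

    /* Only LC_CTYPE is needed to determine the codeset. */
    setlocale(LC_CTYPE, "");
    return encoding_from_codeset(nl_langinfo(CODESET));
}
```

A client would call detect_client_encoding() once at startup and request the returned encoding for the session. Under these assumptions, a locale whose codeset is not in the table simply yields SQL_ASCII.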

Comments?

--
Peter Eisentraut peter_e(at)gmx(dot)net
