Re: Client Messages

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Client Messages
Date: 2012-01-26 18:58:58
Message-ID: 4F21A272.3000703@enterprisedb.com
Lists: pgsql-hackers

On 26.01.2012 17:31, Tom Lane wrote:
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>> The thing is, there's currently no encoding conversion happening, so if
>> you have one database in LATIN1 encoding and another in UTF-8, for
>> example, whatever you put in your postgresql.conf is going to be wrong
>> for one database. I'm happy to just document the issue for per-database
>> messages, "ALTER DATABASE ... SET welcome_message", the encoding used
>> there needs to match the encoding of the database, or it's displayed as
>> garbage. But what about per-user messages, when the user has access to
>> several databases, or postgresql.conf?
>
> I've not looked at the patch, but what exactly will happen if the string
> has the wrong encoding?

You get an incorrectly encoded string, i.e. garbage, in your console
when you log in with psql.

You can also use current_setting() to copy the incorrectly-encoded
string elsewhere in the system. If you insert it into a table and run
pg_dump, I think the dump might not be restorable. That's a bit of a
stretch, perhaps, but it would be nice to avoid that.

BTW, you can already do that if you set e.g. default_text_search_config
to something non-ASCII in postgresql.conf. Or if you do it with
search_path, you get a warning at login. For example, I did "ALTER USER
foouser set search_path ='kääk';" in a LATIN1 database, and then
connected to a UTF-8 database and got:

$ ~/pgsql.master/bin/psql postgres foouser
WARNING: invalid value for parameter "search_path": ""k��k""
DETAIL: schema "k��k" does not exist
psql (9.2devel)
Type "help" for help.

(In case that didn't get across right: I set the search_path to a string
containing two a-with-umlauts, and in the warning they got replaced
with question marks in inverse colors, which is apparently the character
that the console uses to display bytes that are not valid UTF-8.)
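
To make the failure concrete, here is a minimal Python sketch (not
PostgreSQL code; the helper name is_valid_utf8 is made up for
illustration) of what happens to the two 0xE4 bytes that LATIN1 uses for
a-with-umlaut when something expecting UTF-8 tries to display them:

```python
# Illustration only: reproduce the garbling seen in the psql warning.
value = "kääk"

# The bytes as a LATIN1 session would store them: each 'ä' is the single byte 0xE4.
latin1_bytes = value.encode("latin-1")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# What a UTF-8 consumer ends up showing: each invalid byte becomes U+FFFD.
shown = latin1_bytes.decode("utf-8", errors="replace")

print(latin1_bytes)                 # b'k\xe4\xe4k'
print(is_valid_utf8(latin1_bytes))  # False
print(shown == "k\ufffd\ufffdk")    # True
```

U+FFFD is the Unicode replacement character, which many terminals draw
as a question mark in inverse colors.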

The problem with welcome_message would look just like that. No-one is
likely to run into that with search_path, but it's quite reasonable and
expected to use your native language in a welcome message.

> The idea that occurs to me is to have the code that uses the GUC do a
> verify_mbstr(noerror) on it, and silently ignore it if it doesn't pass
> (maybe with a LOG message). This would have to be documented of course,
> but it seems better than the potential consequences of trying to send a
> wrongly-encoded string.

Hmm, fine with me. It would be nice to plug the hole that these bogus
characters can leak elsewhere into the system through current_setting,
though. Perhaps we could put the verify_mbstr() call somewhere in guc.c,
to forbid incorrectly encoded characters from being stored in the GUC
variable in the first place.
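
The difference between the two options could be sketched roughly like
this. It is Python rather than backend C, and every name in it
(is_valid_in, message_to_send, the Settings class) is hypothetical and
only stands in for the idea; it says nothing about how guc.c is actually
structured:

```python
import logging
from typing import Dict, Optional

log = logging.getLogger("welcome_message_sketch")


def is_valid_in(raw: bytes, encoding: str) -> bool:
    """Return True if raw is a valid byte sequence in the given encoding.

    Note that in Python every byte string decodes as latin-1, so this
    check only rejects anything for multibyte encodings such as UTF-8.
    """
    try:
        raw.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


def message_to_send(raw: bytes, db_encoding: str) -> Optional[bytes]:
    """Option 1, check at use: skip the message (and log) if it is invalid."""
    if is_valid_in(raw, db_encoding):
        return raw
    log.warning("welcome message is not valid %s, ignoring it", db_encoding)
    return None


class Settings:
    """Option 2, check at assignment: bad bytes never get stored."""

    def __init__(self, db_encoding: str) -> None:
        self.db_encoding = db_encoding
        self._values: Dict[str, bytes] = {}

    def set(self, name: str, raw: bytes) -> None:
        if not is_valid_in(raw, self.db_encoding):
            raise ValueError(f"invalid {self.db_encoding} value for {name!r}")
        self._values[name] = raw

    def get(self, name: str) -> bytes:
        # Anything returned here was validated in set(), so it cannot
        # carry wrongly-encoded bytes to other callers.
        return self._values[name]
```

With the first option the bad bytes are still stored and still reachable
by other readers of the setting; with the second they are rejected
before they are stored, so nothing downstream ever sees them.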

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
