Re: BUG #7493: Postmaster messages unreadable in a Windows console

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Alexander Law <exclusion(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: BUG #7493: Postmaster messages unreadable in a Windows console
Date: 2013-02-10 23:47:30
Message-ID: 16160.1360540050@sss.pgh.pa.us
Lists: pgsql-bugs, pgsql-general, pgsql-hackers

Noah Misch <noah(at)leadboat(dot)com> writes:
> Following some actual testing, I see that we treat postgresql.conf values as
> byte sequences; any reinterpretation as encoded text happens later.  Hence,
> contrary to my earlier suspicion, your patch does not make that situation
> worse.  The present situation is bad; among other things, current_setting() is
> a vector for injecting invalid text data.  But unconditionally validating
> postgresql.conf values in the platform encoding would not be an improvement.
> Suppose you have a UTF-8 platform encoding and KOI8R databases.  You may wish
> to put KOI8R strings in a GUC, say search_path.  That's possible today; if we
> required that postgresql.conf conform to the platform encoding and no other,
> it would become impossible.  This area warrants improvement, but doing so will
> entail careful design.

The key problem, ISTM, is that it's not at all clear what encoding to
expect the incoming data to be in.  I'm concerned about trying to fix
that by assuming it's in some "platform encoding" --- for one thing,
while that might be a well-defined concept on Windows, I don't believe
it is anywhere else.

If we knew that postgresql.conf was stored in, say, UTF8, then it would
probably be possible to perform encoding conversion to get string
variables into the database encoding.  Perhaps we should allow some
magic syntax to tell us the encoding of a config file?

	file_encoding = 'utf8'	# must precede any non-ASCII in the file

There would still be a lot of practical problems to solve, like what to
do if we fail to convert some string into the database encoding.  But at
least the problems would be somewhat well-defined.
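
A minimal sketch of how such a marker could be processed, written in Python for illustration only. file_encoding is the hypothetical setting proposed above, not an existing PostgreSQL parameter. The function name load_settings, the simplified "name = 'value'" line syntax, and the choice to raise an error on conversion failure are all assumptions made for the example. It also assumes the file encoding is ASCII-compatible, so that lines can be split before decoding.

```python
import re

# Hypothetical directive from the proposal; not a real PostgreSQL setting.
_DIRECTIVE = re.compile(rb"^\s*file_encoding\s*=\s*'([^']*)'")
# Simplified setting syntax for this sketch: name = 'value'
_SETTING = re.compile(rb"^\s*([A-Za-z_][A-Za-z0-9_.]*)\s*=\s*'([^']*)'")


def load_settings(raw: bytes, database_encoding: str) -> dict:
    """Return {name: value bytes}, with values converted to database_encoding."""
    file_encoding = None
    settings = {}
    for lineno, line in enumerate(raw.splitlines(), start=1):
        # The marker must precede any non-ASCII byte in the file.
        if file_encoding is None and not line.isascii():
            raise ValueError(f"line {lineno}: non-ASCII data before file_encoding")
        m = _DIRECTIVE.match(line)
        if m:
            file_encoding = m.group(1).decode("ascii")
            continue
        m = _SETTING.match(line)
        if not m:
            continue  # blank line, comment, or syntax this sketch ignores
        name = m.group(1).decode("ascii")
        try:
            text = m.group(2).decode(file_encoding or "ascii")
            settings[name] = text.encode(database_encoding)
        except (UnicodeError, LookupError) as exc:
            # One possible policy for the open question: reject the value.
            raise ValueError(f"line {lineno}: cannot convert {name}") from exc
    return settings


conf = "file_encoding = 'utf8'\nsearch_path = 'схема'\n".encode("utf-8")
converted = load_settings(conf, "koi8_r")
```

Raising an error is only one way to handle a string that does not fit the database encoding; the sketch does not settle which policy is right.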

While we're thinking about this, it'd be nice to fix our handling (or
rather lack of handling) of encoding considerations for database names,
user names, and passwords.  I could imagine adding some sort of encoding
marker to connection request packets, which could fix the don't-know-
the-encoding problem as far as incoming data is concerned.  But how
shall we deal with storing the strings in shared catalogs, which have to
be readable from multiple databases possibly of different encodings?
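
To make the shared-catalog problem concrete, here is a small Python illustration (not PostgreSQL code). The role name and the helper read_as are made up for the example. It shows the same stored bytes being read under three different encodings, which is the situation a shared catalog faces when databases disagree about encoding.

```python
from typing import Optional

name = "роль"                    # example role name sent by a KOI8R client
stored = name.encode("koi8_r")   # the raw bytes, stored without an encoding marker


def read_as(raw: bytes, encoding: str) -> Optional[str]:
    """Decode raw under the given encoding, or return None if it is invalid."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


as_koi8r = read_as(stored, "koi8_r")    # round-trips to the original name
as_utf8 = read_as(stored, "utf-8")      # invalid byte sequence in UTF-8
as_latin1 = read_as(stored, "latin-1")  # decodes, but to different characters
```

An encoding marker on the connection request would say how to interpret the incoming bytes, but the question of which single representation to store remains open.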

			regards, tom lane

