
Re: Unicode UTF-8 table formatting for psql text output

From: Roger Leigh <rleigh(at)codelibre(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, "Brad T(dot) Sliger" <brad(at)sliger(dot)org>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Selena Deckelmann <selenamarie(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Roger Leigh <rleigh(at)debian(dot)org>
Subject: Re: Unicode UTF-8 table formatting for psql text output
Date: 2009-09-30 14:41:11
Lists: pgsql-hackers
On Tue, Sep 29, 2009 at 04:28:57PM -0400, Tom Lane wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> > On Tue, 2009-09-29 at 12:01 -0400, Tom Lane wrote:
> >> The bigger question is exactly how we expect this stuff to interact with
> >> pg_regress' --no-locale switch.  We already do clear all these variables
> >> when --no-locale is specified.  I am wondering just what --locale is
> >> supposed to do, and whether selectively lobotomizing the LC stuff has
> >> any real use at all.
> > We should do the LANG or LC_CTYPE thing only on the client,
> > unconditionally.  The --no-locale/--locale options should primarily
> > determine what the temporary server uses.
> Well, that seems fairly reasonable, but it's going to require some
> refactoring of pg_regress.  The initialize_environment function
> determines what happens in both the client and the temp server.

Two possible approaches to fix the tests are as follows:

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index f2f9603..74cdaa2 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -711,8 +711,7 @@ initialize_environment(void)
 	 * is actually called.)
-	unsetenv("LC_ALL");
-	putenv("LC_MESSAGES=C");
+	putenv("LC_ALL=C");
 	 * Set multibyte as requested

Here we just force the locale to C.  This has the disadvantage that
--no-locale becomes redundant, and any tests that depend on the
locale (if there are any) will be run in the C locale.

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index f2f9603..65fb49a 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -712,6 +712,7 @@ initialize_environment(void)
+	putenv("LC_CTYPE=C");

Here we set LC_CTYPE to C in addition to LC_MESSAGES (and for much the
same reasons).  However, if you test in non-C locales to check for
issues with other locale codesets, those tests will all be forced to
use ASCII.  Is this a problem in practice?
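
To show why the second diff depends on the existing unsetenv("LC_ALL"),
here is a minimal sketch of the usual POSIX lookup order for the LC_CTYPE
category (LC_ALL first, then LC_CTYPE, then LANG).  The helper name
effective_ctype is hypothetical and is not part of pg_regress; the "C"
fallback is an assumption, since POSIX leaves the default
implementation-defined.

```c
#include <stdlib.h>

/*
 * Hypothetical helper (not in pg_regress): return the locale name that
 * the environment selects for LC_CTYPE, following the usual POSIX
 * precedence LC_ALL > LC_CTYPE > LANG.  Empty values count as unset.
 */
static const char *
effective_ctype(void)
{
	static const char *const vars[] = {"LC_ALL", "LC_CTYPE", "LANG"};
	size_t		i;

	for (i = 0; i < sizeof(vars) / sizeof(vars[0]); i++)
	{
		const char *val = getenv(vars[i]);

		if (val != NULL && val[0] != '\0')
			return val;
	}

	/* Assumption: fall back to "C" when nothing is set. */
	return "C";
}
```

With LC_ALL unset, LC_CTYPE=C wins over a UTF-8 LANG; if LC_ALL were
still set, it would override the LC_CTYPE=C assignment.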

From the POV of my patch, it's working as designed: if the locale
codeset is UTF-8 it's outputting UTF-8.  But, due to the way the
test machinery is looking at the output, this is breaking the tests.
I'm not sure what I can do with my patch to make it behave differently
in a way that is both compatible with its intended use and does not
break the tests; this is really just breaking an assumption in the
testing code that the test output will always be ASCII.
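
For illustration, a minimal sketch of a "locale codeset is UTF-8" check,
assuming it is done with setlocale() and nl_langinfo(CODESET); the
helper name locale_codeset_is_utf8 is hypothetical and the actual patch
may detect the codeset differently.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: report whether the LC_CTYPE locale taken from
 * the environment uses the UTF-8 codeset.
 */
static bool
locale_codeset_is_utf8(void)
{
	/* Adopt LC_ALL / LC_CTYPE / LANG from the environment. */
	if (setlocale(LC_CTYPE, "") == NULL)
		return false;			/* unusable locale: assume not UTF-8 */

	return strcmp(nl_langinfo(CODESET), "UTF-8") == 0;
}
```

When the environment selects the C locale for LC_CTYPE, the reported
codeset is an ASCII variant rather than UTF-8, so a check of this kind
returns false.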

Forcing the LC_CTYPE to C will force ASCII output and work around this
problem with the tests.  Another approach would be to let psql know
it's being run in a test environment with a PG_TEST or some other
environment variable which we can check for and use to turn off UTF-8
output if set.  Would that be better?
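
As a rough sketch of the environment-variable idea, assuming the
variable is called PG_TEST and that any non-empty value means "running
under the test harness" (both are assumptions, as is the helper name
use_utf8_output):

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: decide whether UTF-8 table formatting should be
 * used.  codeset_is_utf8 is the result of whatever codeset check the
 * caller already performs.
 */
static bool
use_utf8_output(bool codeset_is_utf8)
{
	const char *in_test = getenv("PG_TEST");	/* assumed variable name */

	/* Under the test harness, always fall back to ASCII output. */
	if (in_test != NULL && in_test[0] != '\0')
		return false;

	return codeset_is_utf8;
}
```

The trade-off compared with forcing LC_CTYPE=C is that only the table
formatting changes under test, while the rest of the locale-dependent
behaviour is left as the tester configured it.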


  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux   
 `. `'   Printing on GNU/Linux?
   `-    GPG Public Key: 0x25BFB848   Please GPG sign your mail.
