Re: buildfarm / handling (undefined) locales

From: Christoph Berg <cb(at)df7cb(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: buildfarm / handling (undefined) locales
Date: 2014-05-13 20:40:13
Message-ID: 20140513204013.GA21331@msgid.df7cb.de
Lists: pgsql-hackers

Re: Tom Lane 2014-05-13 <27525(dot)1400012096(at)sss(dot)pgh(dot)pa(dot)us>
> Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> writes:
> > On 05/13/2014 09:58 PM, Tom Lane wrote:
> >> ... If so the issue is presumably
> >> that the environment variable(s) were set to incorrect values. While
> >> we *could* abort in that situation, I've never heard of any program
> >> that did; the normal response is to silently ignore the environment
> >> variables and use C locale. We're not being exactly silent about it
> >> but I think the outcome is the expected one.
>
> > Initdb isn't like most programs. The locale given to initdb is memorized
> > in the data directory, and if you later notice that it was wrong, you'll
> > have to dump and reload. There is a strong argument for initdb to be
> > more strict than, say, your average text editor.
>
> Hm, well, if that's the behavior we want then it's certainly an easy
> change.

It should definitely fail. If you have some LC_* variables set, you
want that locale (and its charset) stored in your database. If the DB
silently ends up using C, that's not helpful. (It's probably even
worse, as the resulting SQL_ASCII encoding will accept binary garbage
without checking anything, so you'll only notice when it's too late.)

Bad locales are the #1 reason for initdb problems at install time for
Debian packages - while pg_createcluster catches some of these itself
before invoking initdb, making the process more deterministic would be
a good thing.

> But independently of whether it's a fatal error or not: when there's
> no relevant command-line argument then we print the
>
> invalid locale name ""
>
> message which is surely pretty unhelpful. It'd be better if we could
> finger the incorrect environment setting. Unfortunately, we don't know
> for sure which environment variable(s) setlocale was looking at --- I
> believe it's somewhat platform specific. We could probably print
> something like this instead:
>
> environment locale settings are invalid

Definitely a good plan. The current behavior is just not helpful:

$ LANG=de_DE.utf-9 /usr/lib/postgresql/9.4/bin/initdb -D /tmp/bar
The files belonging to this database system will be owned by user "cbe".
This user must also own the server process.

initdb: invalid locale name ""
initdb: invalid locale name ""
initdb: invalid locale name ""
initdb: invalid locale name ""
initdb: invalid locale name ""
initdb: invalid locale name ""
The database cluster will be initialized with locale "C".
The default database encoding has accordingly been set to "SQL_ASCII".
The default text search configuration will be set to "english".
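
To illustrate the kind of strict check and message being asked for, here
is a rough sketch in Python. It is not initdb code, and it does not claim
to know which variables setlocale() consults on a given platform. It
simply tries each of the usual candidates on its own. The names
LOCALE_VARS, locale_is_valid and check_environment_locale are made up for
this example, and the message wording is only a suggestion.

```python
import locale
import os

# Assumption: the usual POSIX locale variables. Which of these setlocale()
# actually looks at is platform specific, so this list is only a guess.
LOCALE_VARS = ("LC_ALL", "LC_COLLATE", "LC_CTYPE", "LC_MESSAGES",
               "LC_MONETARY", "LC_NUMERIC", "LC_TIME", "LANG")


def locale_is_valid(name):
    """Return True if setlocale() accepts `name` for LC_CTYPE.

    The previous LC_CTYPE setting is restored before returning.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query only, no change
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)


def check_environment_locale(environ=os.environ, is_valid=locale_is_valid):
    """Hypothetical strict check: name every set variable that is rejected.

    Returns a list of messages; an empty list means nothing was rejected.
    """
    problems = []
    for var in LOCALE_VARS:
        value = environ.get(var)
        if value and not is_valid(value):
            problems.append(
                f'invalid locale name "{value}" '
                f'(from environment variable {var})'
            )
    return problems


if __name__ == "__main__":
    errors = check_environment_locale()
    for line in errors:
        print(f"initdb: {line}")
    # A strict initdb would stop here instead of falling back to "C".
    raise SystemExit(1 if errors else 0)
```

Run with LANG=de_DE.utf-9 on a system whose libc rejects that name, this
would print a single line naming LANG instead of six copies of
invalid locale name "", and exit non-zero. Some C libraries accept
almost any locale name, so the result depends on the platform.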

Christoph
--
cb(at)df7cb(dot)de | http://www.df7cb.de/
