Re: Unicode UTF-8 table formatting for psql text output

From: "Brad T(dot) Sliger" <brad(at)sliger(dot)org>
To: Roger Leigh <rleigh(at)codelibre(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Selena Deckelmann <selenamarie(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Roger Leigh <rleigh(at)debian(dot)org>
Subject: Re: Unicode UTF-8 table formatting for psql text output
Date: 2009-10-03 00:34:16
Message-ID: 200910021734.17242.brad@sliger.org
Lists: pgsql-hackers

On Friday 02 October 2009 04:21:35 Roger Leigh wrote:
> On Wed, Sep 30, 2009 at 06:50:46PM -0400, Tom Lane wrote:
> > Roger Leigh <rleigh(at)codelibre(dot)net> writes:
> > >> On Wed, 2009-09-30 at 11:03 -0400, Andrew Dunstan wrote:
> > >>> Thinking about this some more, ISTM a much better way of approaching
> > >>> it would be to provide a flag for psql to turn off the fancy
> > >>> formatting, and have pg_regress use that flag.
> > >
> > > The attached patch implements this feature. It adds a
> > > > --no-pretty-formatting/-G option to psql (naming isn't my forte,
> > > so feel free to change it!). This is also documented in the
> > > SGML docs, and help text. Lastly, this option is used when invoking
> > > psql by pg_regress, which results in a working testsuite in a UTF-8
> > > locale.
> >
> > It would be a good idea to tie this to a psql magic variable (like
> > ON_ERROR_STOP) so that it could conveniently be set in ~/.psqlrc.
> > I'm not actually sure that we need a dedicated command line switch
> > for it, since you could use "-v varname" instead.
>
> I have attached a patch which implements the feature as a pset
> variable. This also slightly simplifies some of the patch since
> the table style is passed to functions directly in printTableContent
> rather than separately. The psql option '-P tablestyle=ascii' is
> passed to psql in pg_regress_main.c which means the testsuite doesn't
> fail any more. The option is documented in the psql docs, and is
> also tab-completed. Users can just put '\pset tablestyle ascii' in
> their .psqlrc if they want the old format in a UTF-8 locale.
>
> To follow up on the comments about the problems of defaulting to
> UTF-8. There are just two potential problems with defaulting, both of
> which are problems with the user's mis-configuration of their system
> and (IMHO) not something that postgresql needs to care about.
> 1) The user lacks a font containing the line-drawing characters.
> It's very rare for a fixed-width terminal font to not contain
> these characters, and the patch as provided sticks to the most
> basic range from the VT100 set which are most commonly provided.
> 2) The user's terminal emulator is not configured to accept UTF-8
> input. If you're using a UTF-8 locale, then this is necessary
> to display non-ASCII characters, and is done automatically by
> almost every terminal emulator out there (on Linux, they default
> to using nl_langinfo(CODESET) unless configured otherwise, which
> is a very simple change if required). On any current GNU/Linux
> system, you have to go out of your way to break the defaults.
>
> The patch currently switches to UTF-8 automatically /when available/.
> IMO this is the correct behaviour since it will work for all but the
> handful of users who misconfigured their system, and provides an
> immediate benefit. We spent years making UTF-8 work out of the box on
> Linux and Unix systems, and it seems a trifle unfair to penalise all
> users for the sake of the few who just didn't set up their terminal
> emulator correctly (their setup is already broken, since non-ASCII
> characters returned by queries are /already/ going to be displayed
> incorrectly).
>
>
> Regards,
> Roger

I looked at psql-utf8-table-5.patch.

Lint(1) says there is an extra trailing ',' in 'typedef enum printTextRule' in src/bin/psql/print.h. The addition to
src/bin/psql/command.c could use a comment, like the adjacent code.
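
For context on the lint warning: a comma after the last enumerator is accepted by C99 but not by C89, so stricter compilers reject it. A minimal sketch of the corrected shape follows. The enumerator names are made up for illustration and are not copied from the patch.

```c
/* Illustrative only: enumerator names are hypothetical, not from the patch. */
typedef enum printTextRule
{
    RULE_TOP,
    RULE_MIDDLE,
    RULE_BOTTOM     /* no comma after the last enumerator: valid in C89 and C99 */
} printTextRule;
```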

'ASCII' and 'UTF8' may need <acronym></acronym> tags in doc/src/sgml/ref/psql-ref.sgml, like the adjacent
markup. I'm not sure someone who hasn't seen this patch in action would immediately know what it does from the
documentation. `gmake html` works without the patch, but fails with the patch:

openjade:ref/psql-ref.sgml:1692:15:E: document type does not allow element "TERM" here; assuming missing "VARLISTENTRY" start-tag

After the patch, `\pset format wrapped` produces '\pset: unknown option: format'. I saw this both in interactive psql
and from .psqlrc. My guess is that the new 'if' starts a separate chain, so the final 'else' is reached even after
'format' has matched. I think this can be fixed by changing the addition to src/bin/psql/command.c from an 'if'
clause to an 'else if' clause.
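
To illustrate the suspected 'if' versus 'else if' problem, here is a minimal sketch. It is not the real code from command.c: the function names, the option list, and the returned strings are all hypothetical, and the patch itself may be structured differently.

```c
#include <string.h>

/* Hypothetical stand-in for the \pset option dispatch; not the psql source. */

/* Buggy shape: the new test is a plain 'if', so it starts a second chain
 * and the trailing 'else' runs even when "format" matched earlier. */
static const char *
pset_buggy(const char *param)
{
    const char *result = "unknown option";

    if (strcmp(param, "format") == 0)
        result = "format";

    if (strcmp(param, "tablestyle") == 0)   /* new code added as 'if' */
        result = "tablestyle";
    else if (strcmp(param, "border") == 0)
        result = "border";
    else
        result = "unknown option";          /* also reached for "format" */

    return result;
}

/* Fixed shape: the new test joins the existing chain as 'else if'. */
static const char *
pset_fixed(const char *param)
{
    if (strcmp(param, "format") == 0)
        return "format";
    else if (strcmp(param, "tablestyle") == 0)
        return "tablestyle";
    else if (strcmp(param, "border") == 0)
        return "border";
    else
        return "unknown option";
}
```

With this sketch, pset_buggy("format") falls through to "unknown option", which matches the symptom reported above, while pset_fixed("format") returns "format".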

Otherwise, the patch applied, built and installed. The `gmake check` tests all passed with LANG and/or LC_ALL
set. The various tablestyle options seem to work. The default behavior with respect to various LANG and LC_ALL
values seems reasonable and can be overridden.
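
For readers unfamiliar with how a locale-dependent default like this can be chosen, the following is a rough sketch based on nl_langinfo(CODESET), which Roger mentions above. It is an assumption about the general approach, not the patch's code: the helper names are hypothetical, and "utf8" as a style name is a guess (only "ascii" is confirmed in this thread).

```c
#include <langinfo.h>
#include <locale.h>
#include <strings.h>

/* Returns 1 if the codeset name denotes UTF-8, else 0.
 * Accepts the spellings "UTF-8" and "UTF8", case-insensitively. */
static int
codeset_is_utf8(const char *codeset)
{
    return codeset != NULL &&
        (strcasecmp(codeset, "UTF-8") == 0 ||
         strcasecmp(codeset, "UTF8") == 0);
}

/* Hypothetical helper: picks a default table style from the environment.
 * setlocale(LC_CTYPE, "") consults LC_ALL, then LC_CTYPE, then LANG. */
static const char *
default_tablestyle(void)
{
    setlocale(LC_CTYPE, "");
    return codeset_is_utf8(nl_langinfo(CODESET)) ? "utf8" : "ascii";
}
```

An explicit '\pset tablestyle ascii' in .psqlrc would simply take precedence over whatever such a default returns.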

Thanks,

--bts
