Re: Unicode UTF-8 table formatting for psql text output

From: Roger Leigh <rleigh(at)codelibre(dot)net>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: "Brad T(dot) Sliger" <brad(at)sliger(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Selena Deckelmann <selenamarie(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Roger Leigh <rleigh(at)debian(dot)org>
Subject: Re: Unicode UTF-8 table formatting for psql text output
Date: 2009-10-05 19:39:42
Message-ID: 20091005193942.GB12531@codelibre.net
Lists: pgsql-hackers

On Sun, Oct 04, 2009 at 11:22:27PM +0300, Peter Eisentraut wrote:
> I have a comment on this bit:
>
> > @@ -125,6 +128,17 @@ main(int argc, char *argv[])
> >
> >      /* We rely on unmentioned fields of pset.popt to start out 0/false/NULL */
> >      pset.popt.topt.format = PRINT_ALIGNED;
> > +
> > +    /* Default table style to plain ASCII */
> > +    pset.popt.topt.table_style = &asciiformat;
> > +#if (defined(HAVE_LANGINFO_H) && defined(CODESET))
> > +    /* If a UTF-8 locale is available, switch to UTF-8 box drawing characters */
> > +    if (pg_strcasecmp(nl_langinfo(CODESET), "UTF-8") == 0 ||
> > +        pg_strcasecmp(nl_langinfo(CODESET), "utf8") == 0 ||
> > +        pg_strcasecmp(nl_langinfo(CODESET), "CP65001") == 0)
> > +        pset.popt.topt.table_style = &utf8format;
> > +#endif
> > +
> >      pset.popt.topt.border = 1;
> >      pset.popt.topt.pager = 1;
> >      pset.popt.topt.start_table = true;
>
> Elsewhere in the psql code, notably in mbprint.c, we make the decision
> on whether to apply certain Unicode-aware processing based on whether
> the client encoding is UTF8. The same should be done here.
>
> There is a patch somewhere in the pipeline that would automatically set
> the psql client encoding to whatever the locale says, but until that is
> done, the client encoding should be the sole setting that rules what
> kind of character set processing is done on the client side.

OK, that makes sense to a certain extent. However, the characters
used to draw the table lines are not really that related to the
client encoding for data sent from the database (IMHO).

I think that (as you said) making the client encoding match the
locale character set in the future would clear up this discrepancy.
Until then, keying off the client encoding gives no guarantee that
the client locale/terminal can actually display UTF-8 when the client
encoding is UTF-8.
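
To make the difference concrete, here is a rough standalone sketch of
the two checks side by side. It is not taken from the patch: the
LineStyle enum and the is_utf8_name(), style_from_locale() and
style_from_client_encoding() helpers are names made up for
illustration, and POSIX strcasecmp() stands in for pg_strcasecmp().
It also assumes the caller has already run setlocale(LC_CTYPE, "") and
passes in the client encoding name (e.g. "UTF8") as a string.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX), standing in for pg_strcasecmp */
#include <langinfo.h>  /* nl_langinfo, CODESET */

/* Hypothetical stand-in for the patch's asciiformat/utf8format choice. */
typedef enum { LINESTYLE_ASCII, LINESTYLE_UTF8 } LineStyle;

/* Hypothetical helper: is this one of the UTF-8 spellings the patch tests? */
static int
is_utf8_name(const char *name)
{
    return name != NULL &&
        (strcasecmp(name, "UTF-8") == 0 ||
         strcasecmp(name, "UTF8") == 0 ||
         strcasecmp(name, "CP65001") == 0);
}

/* Original approach: ask the locale what the terminal is expected to use. */
static LineStyle
style_from_locale(void)
{
    return is_utf8_name(nl_langinfo(CODESET)) ? LINESTYLE_UTF8 : LINESTYLE_ASCII;
}

/* Suggested approach: let the client encoding alone decide. */
static LineStyle
style_from_client_encoding(const char *client_encoding)
{
    return is_utf8_name(client_encoding) ? LINESTYLE_UTF8 : LINESTYLE_ASCII;
}
```

With the locale set to "C" and a client encoding of "UTF8", the first
function would pick ASCII and the second UTF-8, which is exactly the
mismatch described above.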

I have attached an updated patch which implements your suggested
behaviour. It also renames the option to "linestyle" rather than
"tablestyle" which I think represents its purpose a bit more clearly.

Thanks,
Roger

--
.''`. Roger Leigh
: :' : Debian GNU/Linux http://people.debian.org/~rleigh/
`. `' Printing on GNU/Linux? http://gutenprint.sourceforge.net/
`- GPG Public Key: 0x25BFB848 Please GPG sign your mail.

Attachment Content-Type Size
psql-utf8-table-7.patch text/x-diff 20.7 KB
