Re: Unicode UTF-8 table formatting for psql text output

From: Roger Leigh <rleigh(at)codelibre(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Unicode UTF-8 table formatting for psql text output
Date: 2009-10-26 23:33:40
Message-ID: 20091026233339.GC11903@codelibre.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 26, 2009 at 07:19:24PM -0400, Tom Lane wrote:
> Roger Leigh <rleigh(at)codelibre(dot)net> writes:
> > On Mon, Oct 26, 2009 at 01:33:19PM -0400, Tom Lane wrote:
> >> Yeah. We can do what we like with the UTF8 format but I'm considerably
> >> more worried about the aspect of making random changes to the
> >> plain-ASCII output.
>
> > I checked (using strace)
> > gnumeric (via libgda and gnome-database-properties)
> > openoffice (oobase)
>
> Even if that were the entire universe of programs we cared about,
> whether their internal ODBC logic goes through psql isn't really
> the point here. What I'm worried about is somebody piping the
> text output of psql into another program.
>
> > On a related note, there's something odd with the pager code.
>
> Hm, what platform are you testing that on?

Debian GNU/Linux (unstable)
linux 2.6.30
eglibc 2.10.1
libreadline6 6.0.5
libncurses5 5.7
gcc 4.3.4

This is the trace of the broken write:

16206 write(1, " Name \342\224\202 Owner \342\224"..., 102) = 102
16206 write(1, "\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342
\224"..., 256) = 256
16206 write(1, "\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\n", 18) = 18

I'll attach the whole thing for reference. What's clear is that the
first write was *exactly* 256 bytes, which is what was requested,
presumably by libc stdio buffering (which shouldn't by itself be a
problem). Since we use 3-byte UTF-8 and 256/3 is 85 + 1 remainder,
this is where the wierd 85 char forced newline comes from. Since it
only happens when the terminal window is >85 chars, that's where I'm
assuming some odd termios influence comes from (though it might just
be the source of the window size and be completely innocent). The
fact that libc did the two separate writes kind of rules out termios
mangling the output post-write().

Regards,
Roger

--
.''`. Roger Leigh
: :' : Debian GNU/Linux http://people.debian.org/~rleigh/
`. `' Printing on GNU/Linux? http://gutenprint.sourceforge.net/
`- GPG Public Key: 0x25BFB848 Please GPG sign your mail.

Attachment Content-Type Size
trace text/plain 50.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Stark 2009-10-26 23:35:24 Re: per-tablespace random_page_cost/seq_page_cost
Previous Message Andrew Dunstan 2009-10-26 23:21:14 Re: Anonymous Code Blocks as Lambdas?