Unicode UTF-8 table formatting for psql text output

From: Roger Leigh <rleigh(at)debian(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Unicode UTF-8 table formatting for psql text output
Date: 2009-08-22 15:59:44
Message-ID: 1250956790-18404-1-git-send-email-rleigh@debian.org
Lists: pgsql-hackers

psql currently uses the ASCII characters '-', '|' and '+' to render
tables. However, most modern terminal emulators are capable of
displaying characters other than ASCII, including box drawing
characters which can be used to create rather more pleasing and
readable tables than ASCII punctuation can achieve.

The following set of patches teaches psql how to draw nicer tables,
by abstracting the characters used for formatting tables into
a simple structure. The table formatting can then be passed to
functions needing to know how to draw tables, and they can then use
the provided symbols rather than hard-coding ASCII punctuation.
Tables for ASCII and Unicode UTF-8 are provided.
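
To make the idea concrete, here is a minimal sketch of such a structure. It is not the code from the patches: the names table_format, format_ascii, format_utf8, buf_append and format_rule are invented for illustration, and only three symbols are modelled. It builds the rule drawn under the header with border=1, using whichever symbol set it is given.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical structure; the real patch may use other names and fields. */
typedef struct
{
    const char *hrule;   /* horizontal rule segment */
    const char *vrule;   /* vertical column separator */
    const char *cross;   /* junction of horizontal and vertical rules */
} table_format;

static const table_format format_ascii = { "-", "|", "+" };

/* UTF-8 box-drawing characters written as octal escapes */
static const table_format format_utf8 = {
    "\342\224\200",      /* U+2500 BOX DRAWINGS LIGHT HORIZONTAL */
    "\342\224\202",      /* U+2502 BOX DRAWINGS LIGHT VERTICAL */
    "\342\224\274"       /* U+253C BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL */
};

/* Append s to buf (capacity cap, current length *len); 0 on success. */
static int
buf_append(char *buf, size_t cap, size_t *len, const char *s)
{
    size_t n = strlen(s);

    if (*len + n + 1 > cap)
        return -1;
    memcpy(buf + *len, s, n + 1);   /* copies the terminating NUL too */
    *len += n;
    return 0;
}

/*
 * Build a border=1 style header rule: each column contributes
 * (width + 2) rule segments, and columns are joined by a cross.
 */
static int
format_rule(const table_format *fmt, const int *widths, int ncols,
            char *buf, size_t cap)
{
    size_t len = 0;
    int    i, j;

    assert(fmt != NULL && buf != NULL && cap > 0);
    buf[0] = '\0';
    for (i = 0; i < ncols; i++)
    {
        if (i > 0 && buf_append(buf, cap, &len, fmt->cross) != 0)
            return -1;
        for (j = 0; j < widths[i] + 2; j++)
            if (buf_append(buf, cap, &len, fmt->hrule) != 0)
                return -1;
    }
    return 0;
}
```

With column widths 2 and 9, passing &format_ascii reproduces the rule psql draws today, while &format_utf8 produces the same layout with box-drawing characters. The caller never names a punctuation character itself.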

The conversion of print_aligned_text is a very straightforward
substitution. print_aligned_vertical is slightly more complex because
its use of string offsets to overwrite the record number assumes
"1 byte = 1 character", which is no longer valid for multibyte
encodings such as UTF-8. This has been refactored to split the
line drawing out into an inline helper function
(print_aligned_vertical_line) which actually makes the code more like
print_aligned_text with its _print_horizontal_line helper, and
contains no assumptions about indexes into character arrays.
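
A rough sketch of the approach follows. This is not the patch's print_aligned_vertical_line: format_record_line and utf8_char_count are hypothetical names, and the code assumes each rule glyph occupies exactly one terminal column. The line is built left to right while counting display columns, so nothing depends on byte offsets.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Count UTF-8 code points by skipping continuation bytes (10xxxxxx). */
static size_t
utf8_char_count(const char *s)
{
    size_t n = 0;

    for (; *s; s++)
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    return n;
}

/*
 * Build a record separator such as "-[ RECORD 1 ]---": one leading rule
 * segment, the label, then rule segments until `width` display columns
 * are filled.  Returns 0 on success, -1 if buf is too small.
 */
static int
format_record_line(const char *hrule, unsigned long record, size_t width,
                   char *buf, size_t cap)
{
    char   label[32];
    size_t hlen, llen, len, cols;
    int    n;

    assert(hrule != NULL && buf != NULL);
    hlen = strlen(hrule);

    n = snprintf(label, sizeof label, "[ RECORD %lu ]", record);
    if (n < 0 || (size_t) n >= sizeof label)
        return -1;
    llen = (size_t) n;

    if (hlen + llen + 1 > cap)
        return -1;
    memcpy(buf, hrule, hlen);
    memcpy(buf + hlen, label, llen);
    len = hlen + llen;          /* bytes written so far */
    cols = 1 + llen;            /* display columns: the label is ASCII */

    while (cols < width)
    {
        if (len + hlen + 1 > cap)
            return -1;
        memcpy(buf + len, hrule, hlen);
        len += hlen;            /* bytes advance by the glyph's length */
        cols++;                 /* columns advance by one */
    }
    buf[len] = '\0';
    return 0;
}
```

For a 16-column line, the ASCII rule gives a 16-byte string, whereas the UTF-8 rule "\342\224\200" gives 24 bytes for the same 16 characters. That gap is why overwriting part of the line at a fixed byte offset stops working.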

Examples of this in action are shown at the end of this mail.

Portability: UTF-8 strings are encoded as standard C octal escapes,
and so are portable. I added UTF-8 comments to show what the numbers
encode, which can be removed if needed. The code depends on
nl_langinfo(3) from <langinfo.h>, but this is #ifdef'd to allow
building on systems without support. By default, an ASCII table is
used, which will result in identical behaviour to the current psql.
However, if nl_langinfo is available, and it reports that the locale
codeset is UTF-8, it will switch to using Unicode UTF-8 box-drawing
characters, which draw identical tables to the current psql, just with
different characters.
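
The detection logic might look roughly like the sketch below. The guard macro HAVE_LANGINFO_H and the function names codeset_is_utf8 and locale_uses_utf8 are assumptions made for this example, not taken from the patches. It also assumes the caller has already run setlocale(LC_CTYPE, ""), since nl_langinfo reports on the current locale.

```c
#include <assert.h>
#include <string.h>

#ifdef HAVE_LANGINFO_H          /* assumed configure-style guard macro */
#include <langinfo.h>
#endif

/*
 * Exact match on the codeset name.  Some platforms may spell the name
 * differently, so a real implementation might need to be more lenient.
 */
static int
codeset_is_utf8(const char *codeset)
{
    return codeset != NULL && strcmp(codeset, "UTF-8") == 0;
}

/*
 * Returns 1 if the current locale's codeset is UTF-8, else 0.
 * Without nl_langinfo support this always returns 0, so the caller
 * falls back to the ASCII table and output is unchanged.
 */
static int
locale_uses_utf8(void)
{
#ifdef HAVE_LANGINFO_H
    const char *codeset = nl_langinfo(CODESET);

    assert(codeset != NULL);    /* POSIX returns "" rather than NULL */
    return codeset_is_utf8(codeset);
#else
    return 0;
#endif
}
```

A caller would select the UTF-8 symbol table when locale_uses_utf8() returns nonzero and the ASCII table otherwise, which keeps ASCII as the default on every platform.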

Extensibility: The table formatting can potentially be used to support
other character sets containing other box drawing characters, for
example IBM CP 437 or 850. However, I have just stuck with UTF-8 for
now!

To follow:
[PATCH 1/6] psql: Abstract table formatting characters used for different line types.
[PATCH 2/6] psql: Add table formats for ASCII and UTF-8
[PATCH 3/6] psql: Create table format
[PATCH 4/6] psql: Pass table formatting object to text output functions
[PATCH 5/6] psql: print_aligned_text uses table formatting
[PATCH 6/6] psql: print_aligned_vertical uses table formatting

In the examples below, I think there's just one minor issue, which is
a leading '-' with border=0 and expanded=1, which I just noticed while
sending this mail. I'll tidy that up and send another patch.

This is something I really think makes psql more readable and more
usable, which I've been working on over the last couple of nights, and
so here it is for your comments and criticism. I hope you find it
useful!

Regards,
Roger

Examples:

rleigh=# \pset border 0
Border style is 0.
rleigh=# SELECT * FROM package_priorities;
id   name
── ─────────
 1 extra
 2 important
 3 optional
 4 required
 5 standard
(5 rows)

rleigh=# \pset border 1
Border style is 1.
rleigh=# SELECT * FROM package_priorities;
 id │   name
────┼───────────
  1 │ extra
  2 │ important
  3 │ optional
  4 │ required
  5 │ standard
(5 rows)

rleigh=# \pset border 2
Border style is 2.
rleigh=# SELECT * FROM package_priorities;
┌────┬───────────┐
│ id │   name    │
├────┼───────────┤
│  1 │ extra     │
│  2 │ important │
│  3 │ optional  │
│  4 │ required  │
│  5 │ standard  │
└────┴───────────┘
(5 rows)

rleigh=# \pset border 0
Border style is 0.
rleigh=# \pset expanded 1
Expanded display is on.
rleigh=# SELECT * FROM package_priorities;
─* Record 1
id   1
name extra
─* Record 2
id   2
name important
─* Record 3
id   3
name optional
─* Record 4
id   4
name required
─* Record 5
id   5
name standard

[ this might need a tiny tweak to remove the leading - ]

rleigh=# \pset border 1
Border style is 1.
rleigh=# SELECT * FROM package_priorities;
─[ RECORD 1 ]───
id   │ 1
name │ extra
─[ RECORD 2 ]───
id   │ 2
name │ important
─[ RECORD 3 ]───
id   │ 3
name │ optional
─[ RECORD 4 ]───
id   │ 4
name │ required
─[ RECORD 5 ]───
id   │ 5
name │ standard

rleigh=# \pset border 2
Border style is 2.
rleigh=# SELECT * FROM package_priorities;
┌─[ RECORD 1 ]─────┐
│ id   │ 1         │
│ name │ extra     │
├─[ RECORD 2 ]─────┤
│ id   │ 2         │
│ name │ important │
├─[ RECORD 3 ]─────┤
│ id   │ 3         │
│ name │ optional  │
├─[ RECORD 4 ]─────┤
│ id   │ 4         │
│ name │ required  │
├─[ RECORD 5 ]─────┤
│ id   │ 5         │
│ name │ standard  │
└──────┴───────────┘

rleigh=# \ds
                   List of relations
 Schema │           Name            │   Type   │ Owner
────────┼───────────────────────────┼──────────┼────────
 public │ architectures_id_seq      │ sequence │ rleigh
 public │ binaries_id_seq           │ sequence │ rleigh
 public │ components_id_seq         │ sequence │ rleigh
 public │ distributions_id_seq      │ sequence │ rleigh
 public │ package_priorities_id_seq │ sequence │ rleigh
 public │ package_sections_id_seq   │ sequence │ rleigh
 public │ sections_id_seq           │ sequence │ rleigh
 public │ states_id_seq             │ sequence │ rleigh
(8 rows)

--
.''`. Roger Leigh
: :' : Debian GNU/Linux http://people.debian.org/~rleigh/
`. `' Printing on GNU/Linux? http://gutenprint.sourceforge.net/
`- GPG Public Key: 0x25BFB848 Please GPG sign your mail.
