Re: non-ASCII characters in SGML documentation (and elsewhere)

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-docs <pgsql-docs(at)postgresql(dot)org>
Subject: Re: non-ASCII characters in SGML documentation (and elsewhere)
Date: 2011-06-01 19:28:22
Message-ID: 1306956502.2279.6.camel@vanquo.pezone.net
Lists: pgsql-docs

On Fri, 2011-05-20 at 08:16 -0400, Alvaro Herrera wrote:
> > > * Should we allow/use non-ASCII characters in the release
> > > notes?
> > > * What encoding should the HISTORY file have?
> >
> > Ideally "sure, if entity-ified", but I don't know what to do about
> > HISTORY.
>
> Can we recode that to plain ascii? I think iconv has a //TRANSLIT
> flag or something like that.

To make this work on FreeBSD, where we build the releases, we need to
use the following command:

"/usr/bin/perl" -p -e 's/<H(1|2)$/<H\1 align=center/g' HISTORY.html | LC_ALL=en_US.ISO8859-1 lynx -force_html -dump -nolist -stdin | iconv -f latin1 -t us-ascii//TRANSLIT > HISTORY

This also works on Linux/glibc, but FreeBSD is a bit stricter/more
limited. Not sure about other platforms, but I'd guess if they don't
have the required locales, they'd be no worse off than now anyway.

The results are reasonable. What //TRANSLIT actually does depends on
the platform, e.g. on FreeBSD ö -> "o, while on Linux ö -> o.
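
A quick way to see what a given platform's iconv does with such a character is to push a single Latin-1 string through the last stage on its own. This is a minimal sketch, not part of the build; the sample word and the variable name are made up, and the replacement that comes out will differ by platform as described above.

```shell
# \366 is the single byte for ö in ISO 8859-1 (latin1).
# "|| true" only guards against iconv builds that return a non-zero
# status when a character cannot be converted exactly.
ascii=$(printf 'K\366ln\n' |
    iconv -f latin1 -t us-ascii//TRANSLIT || true)

# Show whatever this platform substituted for the ö.
printf '%s\n' "$ascii"
```

Going by the observation above, FreeBSD would be expected to print K"oln and Linux/glibc Koln; other iconv implementations may substitute something else.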
