Re: UTF-8 docs

From: Jürgen Purtz <juergen(at)purtz(dot)de>
To: pgsql-docs(at)postgresql(dot)org
Cc: vitus(at)wagner(dot)pp(dot)ru
Subject: Re: UTF-8 docs
Date: 2016-08-23 13:57:31
Message-ID: a65e7fdf-c7a9-6106-307d-2fab50981c74@purtz.de
Lists: pgsql-docs

The previous mails contained some statements about the source format of
the PostgreSQL documentation and other statements about the formats
derived from it. In the following I am speaking only about the source
format. With that premise, I want to second Victor Wagner, who wrote on
pgsql-hackers:

> Really, what change we need, it is conversion from SGML to XML format.
> It would solve some real problems, such as ability to include diagrams
> in the docs, and also let everyone to explicitly specify encoding in
> XML declaration (and probably cause switch to UTF-8 as side effect,
> because most XML-based tools use UTF-8 as default).

The really fundamental step is the switch from SGML to XML. It consists
not only of a change in the markup format (giving up the SGML features
OMITTAG and SHORTTAG). We must also replace the SGML tools for parsing,
validating, and generating the various output formats such as HTML or
PDF with modern XML tools. And we need additional XSLT steps or
modifications of the CSS files to replace the DSSSL scripts. This work
is in progress.
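
To illustrate the markup side of that change (omittag, shorttag): SGML
can allow end tags to be left out or abbreviated, while an XML parser
rejects both. The following is only a minimal sketch in Python, using
the standard library's xml.etree.ElementTree. The sample fragments and
the helper name is_well_formed are made up for illustration and are not
taken from the PostgreSQL sources; whether a given SGML DTD actually
permits these omissions depends on its declarations.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# SGML-style OMITTAG: the </para> end tag is left out (hypothetical sample).
sgml_omittag = "<sect1><para>First paragraph</sect1>"

# SGML-style SHORTTAG: the empty end tag </> closes the open element.
sgml_shorttag = "<sect1><para>First paragraph</></sect1>"

# The same content with every tag written out, as XML requires.
xml_full = "<sect1><para>First paragraph</para></sect1>"

for name, fragment in [
    ("omittag", sgml_omittag),
    ("shorttag", sgml_shorttag),
    ("full", xml_full),
]:
    print(name, is_well_formed(fragment))
```

Only the fully tagged fragment passes, which is why the source files
themselves have to be normalized before XML tools can be used on them.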

Once we have got rid of all SGML-related parts, we can profit from the
current XML tools and standards, e.g.:

- DocBook itself is moving from 4.x to 5.x on the basis of XML.
(Actually, I do not recommend this additional step because of some
incompatibilities in the migration to 5.x, see:
https://lists.oasis-open.org/archives/docbook/201606/msg00007.html )

- The common attribute "xml:lang" for translations

- Extensions like XInclude, SVG, MathML, ...

- ...
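
As a small illustration of two of the points above (an explicit
encoding in the XML declaration, and the common attribute "xml:lang"),
here is a hedged sketch in Python with the standard library's
xml.etree.ElementTree. The sample document and the helper name lang_of
are assumptions made up for this example; they do not come from the
PostgreSQL documentation or its build chain.

```python
import xml.etree.ElementTree as ET

# Namespace URI that the predefined "xml:" prefix is bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical document: the encoding is stated in the XML declaration,
# and xml:lang marks the language of the book and of one paragraph.
doc = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<book xml:lang="en">'
    '<para xml:lang="de">Jürgen</para>'
    '<para>plain</para>'
    '</book>'
).encode("utf-8")

# The parser reads the declared encoding from the bytes itself.
root = ET.fromstring(doc)


def lang_of(elem, inherited=None):
    """Return the element's xml:lang value, or the inherited one."""
    return elem.get(f"{{{XML_NS}}}lang", inherited)


# Paragraphs without their own xml:lang fall back to the book's language.
langs = [lang_of(para, lang_of(root)) for para in root.findall("para")]
first_text = root.find("para").text
print(langs, first_text)
```

The fallback to the parent's value mirrors how xml:lang is meant to
apply to an element's content unless a descendant overrides it.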

On 23.08.2016 00:51, Tatsuo Ishii wrote:
> From: Alexander Law<exclusion(at)gmail(dot)com>
> Subject: UTF-8 docs
> Date: Mon, 22 Aug 2016 16:36:14 +0300
> Message-ID:<7fbf2e80-9507-0521-d0e9-913ab81a58df(at)gmail(dot)com>
>
>> Hello,
>> I've just seen a discussion about docs encoding in pgsql-hackers.
>>
>> https://www.postgresql.org/message-id/20160822.141645.655870136709055853.t-ishii%40sraoss.co.jp
>> Can we continue the discussion in this mailing list?
>> We (at Postgres Pro) have developed the whole build chain (with
>> support for l10n) so we can just share it.
> I have just subscribed to the pgsql-docs list.
> Here is the last conversation with Peter at pgsql-hackers.
>
>> On 8/22/16 9:32 AM, Tatsuo Ishii wrote:
>>> I don't know what kind of problem you are seeing with encoding
>>> handling, but at least UTF-8 is working for Japanese, French and
>>> Russian.
>> Those translations are using DocBook XML.
> But in the meantime I can create UTF-8 HTML files like this:
>
> make html
> [snip]
> /bin/mkdir -p html
> SP_CHARSET_FIXED=1 SP_ENCODING=UTF-8 openjade -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D . -c /usr/share/sgml/docbook/stylesheet/dsssl/modular/catalog -d stylesheet.dsl -t sgml -i output-html -i include-index postgres.sgml
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English:http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
>
>
