Proposal: syntax highlight in html manual

From: Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com>
To: pgsql-docs(at)postgresql(dot)org
Subject: Proposal: syntax highlight in html manual
Date: 2011-04-13 10:31:45
Message-ID: BANLkTi=C=-zur9gxi75o45Mq38dBsf4Bvw@mail.gmail.com
Lists: pgsql-docs

Hello,

When I wrote the docs for the GMP extension
(http://pgmp.projects.postgresql.org/) I started improving the syntax
highlighting produced by Pygments
(https://bitbucket.org/dvarrazzo/pygments-postgres) for the PostgreSQL
SQL dialect. I've also added specific lexers for PL/pgSQL and
interactive psql sessions. The lexer handles all the PG constructs,
has a list of keywords and data types parsed from the docs, and can
also dispatch the content of a $$ string to a different lexer based on
a LANGUAGE keyword nearby (e.g. highlighting a PL/Python
function using the Python lexer). Here you can see the result on the
"regression document" I am using to develop the lexer:

- http://pgmp.projects.postgresql.org/highlight/psql.html
- http://pgmp.projects.postgresql.org/highlight/postgres.html
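
To illustrate the $$ dispatch described above, here is a minimal sketch
in plain Python. It is not the code of the pygments-postgres lexer: the
helper name, the regular expressions and the language-to-lexer mapping
are all assumptions made for the example.

```python
import re

# Hypothetical helper, not taken from the pygments-postgres lexer.
# Matches a dollar-quoted string: $$ ... $$ or $tag$ ... $tag$.
DOLLAR_BODY = re.compile(r"\$(\w*)\$(.*?)\$\1\$", re.DOTALL)
# Matches a clause such as: LANGUAGE plpythonu  or  LANGUAGE 'plpgsql'
LANGUAGE = re.compile(r"\bLANGUAGE\s+'?(\w+)'?", re.IGNORECASE)

# Assumed mapping from PostgreSQL language names to lexer names.
LEXER_FOR = {
    "plpgsql": "plpgsql",
    "plpythonu": "python",
    "plperl": "perl",
    "sql": "postgresql",
}


def split_function_body(statement):
    """Return (lexer_name, body) for the first dollar-quoted string.

    Returns None if the statement contains no dollar-quoted string.
    """
    body_match = DOLLAR_BODY.search(statement)
    if body_match is None:
        return None
    # Search for LANGUAGE outside the body, so that the word appearing
    # inside the function source is ignored.
    outside = statement[:body_match.start()] + statement[body_match.end():]
    lang_match = LANGUAGE.search(outside)
    language = lang_match.group(1).lower() if lang_match else None
    # Unknown or missing languages fall back to plain text.
    return LEXER_FOR.get(language, "text"), body_match.group(2)
```

The body returned by the helper would then be handed to whichever lexer
the name selects, while the rest of the statement stays with the SQL
lexer.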

Is there any interest in applying syntax highlighting to the HTML
rendering of the manual?

If there is, I think the highlighting should be performed as a
post-processing step on the HTML output and should be entirely
optional: we may do it for the website, but docs generation should
not fail if the tools (Python, Pygments) are missing.
Every snippet in the docs would need to be tagged with the
correct language: I think the right way is to use the "role"
attribute on the DocBook tags generating the snippets (screen,
programlisting, synopsis...). Its value can be propagated to the HTML
(e.g. as a CSS class) using a suitable DocBook configuration (see
<http://www.sagehill.net/docbookxsl/HtmlCustomEx.html#CustomClassValues>),
although a test I did in that direction failed, and I'm completely
clueless about debugging the DocBook tool chain.
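
A rough sketch of such an optional post-processing step, assuming the
role ends up as a second CSS class on the <pre> element (for example
<pre class="programlisting sql">). That HTML layout and the function
names are hypothetical; only the Pygments calls (highlight,
get_lexer_by_name, HtmlFormatter) are real API.

```python
import html
import re

try:
    from pygments import highlight
    from pygments.formatters import HtmlFormatter
    from pygments.lexers import get_lexer_by_name
except ImportError:
    # Pygments is optional: without it the pages are left untouched.
    highlight = None

# Assumed output layout: the role follows the element name in the class.
SNIPPET = re.compile(
    r'<pre class="(?:screen|programlisting|synopsis) (\w+)">(.*?)</pre>',
    re.DOTALL,
)


def pygments_render(language, code):
    """Highlight code with the Pygments lexer registered as `language`."""
    return highlight(code, get_lexer_by_name(language), HtmlFormatter())


def postprocess(page, render=None):
    """Replace tagged snippets in an HTML page with highlighted markup."""
    if render is None:
        if highlight is None:
            return page  # tools missing: do not fail, do nothing
        render = pygments_render

    def replace(match):
        language, escaped = match.groups()
        try:
            return render(language, html.unescape(escaped))
        except Exception:
            # Unknown lexer or any other failure: keep the original snippet.
            return match.group(0)

    return SNIPPET.sub(replace, page)
```

The sketch ignores inline markup inside the snippet (such as the
"replaceable" elements), which a real script would have to preserve.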

I've scraped all the docs snippets (about 3k) into a database and
written an interactive tool to tag them with a language: I'm using the
tool to find snippets to test the lexer with and immediately check the
result. Examples here (these are static pages, not the live tool):

- http://pgmp.projects.postgresql.org/highlight/snippets/plpgsql-control-structures.html
- http://pgmp.projects.postgresql.org/highlight/snippets/plpython-data.html

The result of the tagging may be used to patch the docs, injecting the
role in the sgml source.
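
As an illustration of the patching step, a minimal sketch that injects
the role into the opening tags of one SGML file. It assumes the tagging
result is available as a mapping from the position of a snippet in the
file to its language; that representation and the function name are
hypothetical.

```python
import re

# Opening snippet tags that carry no attributes yet. Tags that already
# have attributes are not matched and therefore not counted.
OPEN_TAG = re.compile(r"<(screen|programlisting|synopsis)>", re.IGNORECASE)


def inject_roles(sgml, roles):
    """Add role="..." to the n-th matched snippet tag for each n in roles.

    `roles` maps the 0-based position of a snippet among the matched tags
    to its language; snippets without an entry are left unchanged.
    """
    position = -1

    def add_role(match):
        nonlocal position
        position += 1
        role = roles.get(position)
        if role is None:
            return match.group(0)
        return '<%s role="%s">' % (match.group(1), role)

    return OPEN_TAG.sub(add_role, sgml)
```

Comparing the output with the original file (e.g. with diff) would then
give the doc patch.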

If you like the idea I can work on the missing parts, e.g. fixing the
ambiguities in the examples (missing psql prompts etc.), tagging all the
snippets, writing the script to post-process the HTML (which should also
restore the result of the DocBook semantic tagging, so that e.g.
"replaceable" is rendered in italics), and maybe writing a specific
lexer for the synopsis. Otherwise I'll just hack on the lexer until I'm
happy and contribute it back to the Pygments project. On your side I
would just expect the role propagation to be fixed in the XSLT and, of
course, the resulting doc patches to be accepted.

Let me know if you like the idea.

Best regards,

-- Daniele
