Large SGML Cleanup

From: Josh Kupershmidt <schmiddy(at)gmail(dot)com>
To: pgsql-docs(at)postgresql(dot)org
Subject: Large SGML Cleanup
Date: 2010-11-03 02:56:26
Message-ID: AANLkTi=1Sm9N3Khiued9UiMfdd_TKLimMiO9mCfHtL39@mail.gmail.com
Lists: pgsql-docs

[Resending without the large attachment; it looks like the previous
attempt isn't going to make it through.]

Hi all,

I've gone through the SGML documentation, trying to push the output
HTML towards HTML 4.01 compliance. By far the most common problem I
found was incorrect nesting of <para> nodes, which results in invalid
HTML.

A common idiom I encountered was SGML like this:

<para>
...
<simplelist>
...
</simplelist>
...
</para>

This SGML would then produce HTML which looked like this:

<p>
...
<table>
...
</table>
...
</p>

This HTML fails validation: in HTML 4.01 a <p> element may contain
only inline content, so a block-level <table> cannot be nested inside
it. The patch linked below fixes all the instances of this I could
find, by closing out <para> nodes before beginning lists and tables.
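
As a rough sketch of the fix (illustrative only, not lifted verbatim
from the patch), the idiom above becomes something like the following,
assuming the trailing text is substantial enough to warrant its own
<para>:

<para>
...
</para>
<simplelist>
...
</simplelist>
<para>
...
</para>

With the list as a sibling of the paragraphs rather than a child, the
generated <table> should no longer end up inside a <p>.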

I used the w3c-markup-validator package and the web service at
validator.w3.org to test HTML validity. A handy Perl package I found
for this was WebService::Validator, which includes the example script
"validate_files_in_dir.pl" for validating a directory full of HTML
files. With this patch, the number of invalid HTML files drops from
many dozens to 16.

Patch at:
http://kupershmidt.org/pg/sgml_fixup.patch.gz

Josh
