Re: sgml cleanup: unescaped '>' characters

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Josh Kupershmidt <schmiddy(at)gmail(dot)com>
Cc: pgsql-docs <pgsql-docs(at)postgresql(dot)org>
Subject: Re: sgml cleanup: unescaped '>' characters
Date: 2011-08-30 17:36:19
Message-ID: 1314725779.11209.1.camel@vanquo.pezone.net
Lists: pgsql-docs
On Mon, 2011-08-29 at 18:22 -0500, Josh Kupershmidt wrote:
> >> The rewritten version picked up a few stylistic inconsistencies in the
> >> SGML, such as:
> >>  * breaking the trailing '>' of an SGML marker across lines. AFAIK
> >> this is legal, but is a bit inconsistent and just confuses simplistic
> >> tools like find_gt_lt
> >
> > The cases you show don't appear to be terribly useful, but I think on
> > occasion this can be necessary to work around some arcane whitespace
> > rules in SGML or XML.  (Just look at the generated HTML; it uses this
> > technique throughout.)
> 
> Hrm, well if the spurious whitespace isn't serving any purpose in
> these cases, why not just fix it to match the rest of SGML style?
> 
> >>  * using single quotes instead of double quotes to surround a node
> >> attribute, as in <orderedlist numeration='loweralpha'>
> >
> > It would be better if the tool could handle that, because sometimes you
> > want to use single quotes if the value contains double quotes.
> 
> It's trivial to adjust the regex I was using to ignore such cases. I'm
> just on about stylistic consistency here. If there's a reason to use
> single quotes, such as when the value contains double quotes, then
> that's fine -- but I don't think any of the cases I pointed out fall
> under that category.

I have committed your fixes relevant to these two points.

> >> as well as seemingly-invalid SGML, such as using '>' unescaped inside
> >> normal SGML entries.
> >
> > Unescaped > is valid, AFAIK.
> 
> Oh, that's interesting. I took a quick look at "The SGML FAQ book",
> page 73 [1], which supports this claim.
> 
> But I notice we've been fixing such issues in the recent past (e.g.
> commit d420ba2a2d4ea4831f89a3fd7ce86b05eff932ff). Don't we want to
> continue doing so? Not to mention the fact that we have
> ./src/tools/find_gt_lt, which while somewhat broken, has the
> ostensible goal of finding such problems in the SGML. Or do we want to
> stop worrying about '>' entirely, and rename find_gt_lt to find_lt,
> instead?

> [1] http://books.google.com/books?id=OyJHFJsnh10C&lpg=PA229&ots=DGkYDdvbhE&pg=PA73#v=onepage&q&f=false

I don't know what the rationale for this tool is.  I have never used it.
Clearly, the reference shows, and the tools we use confirm, that it is
not necessary to use it.
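
For anyone who still wants a purely stylistic check for '>' in text
content, the points raised in this thread (a closing '>' on its own line,
single-quoted attribute values) can be tolerated with a slightly more careful
pattern. The following is a minimal sketch in Python, not the actual
src/tools/find_gt_lt script; the names TAG_RE and find_unescaped_gt are
hypothetical, and it assumes no CDATA marked sections or DOCTYPE internal
subsets appear in the input.

```python
import re

# Matches a whole comment or tag, including tags whose closing '>' sits on
# a later line and attribute values written in either quote style.
TAG_RE = re.compile(
    r"""<!--.*?-->            # SGML comment
      | <[!?/]?[A-Za-z]       # start of a tag, declaration, or PI
        (?: "[^"]*"           # double-quoted attribute value
          | '[^']*'           # single-quoted attribute value
          | [^'">]            # any other character before the close
        )*
        >
    """,
    re.DOTALL | re.VERBOSE,
)


def find_unescaped_gt(sgml: str) -> list[tuple[int, int]]:
    """Return 1-based (line, column) pairs for each '>' outside markup."""

    def blank(match: re.Match) -> str:
        # Overwrite markup with spaces but keep newlines, so that line and
        # column numbers in the result still refer to the original text.
        return re.sub(r"[^\n]", " ", match.group(0))

    text = TAG_RE.sub(blank, sgml)
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        for m in re.finditer(">", line):
            hits.append((lineno, m.start() + 1))
    return hits


if __name__ == "__main__":
    sample = (
        "<para>a > b</para>\n"
        "<orderedlist numeration='loweralpha'\n"
        ">\n"
        "<entry>x -&gt; y</entry>\n"
    )
    for line, col in find_unescaped_gt(sample):
        print(f"line {line}, column {col}: unescaped '>'")
```

Since unescaped '>' is valid, any hit from a check like this is a style
report only, not an error in the SGML.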



