From: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | pgsql-docs(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: This approach to non-ASCII names does not work |
Date: | 2006-09-20 21:47:10 |
Message-ID: | 200609202347.11626.peter_e@gmx.net |
Lists: | pgsql-docs |
Bruce Momjian wrote:
> The unusual thing is that though our docs web pages use a stated
> encoding as ISO-8859-1, the UTF8 number does generate the proper
> symbol in my browser (Mozilla), so I wonder if >255 codes are assumed
> to be UTF8.
These are two different things.
A numeric character reference picks the numbered character from the
document character set. The document character set is declared in the
document type declaration (and is therefore fixed by the standards
committee for all users). The document character sets for commonly
used SGML applications are:
  HTML 3.2        Latin 1 (ISO 646 + ECMA 94)
  HTML 4+         UCS (ISO 10646)
  XML             UCS (ISO 10646)
  DocBook SGML    Latin 1 (ISO 646 + ECMA 94)
If a font is available, an HTML application (browser) should be able to
process (display) any character from the document character set,
whether it arrives as a literal character or as a numeric character
reference.
Conversely, a character not in the document character set, such as a
non-Latin-1 character in DocBook SGML, cannot be processed, strictly
speaking.
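To make the first point concrete, here is a rough Python sketch. It
assumes that Python's html.unescape is an acceptable stand-in for what
a browser does with HTML 4+ (document character set UCS); the helper
name resolve_document is made up for illustration. The same reference
resolves to the same character no matter which encoding the bytes were
declared in:

```python
import html


def resolve_document(raw: bytes, encoding: str) -> str:
    """Hypothetical helper: decode the bytes, then resolve references."""
    text = raw.decode(encoding)   # step 1: apply the character encoding
    return html.unescape(text)    # step 2: references index into UCS


# The reference is made of ASCII characters only, so its bytes are
# identical in ISO-8859-1 and UTF-8.
raw = b"&#352;imek"

from_latin1 = resolve_document(raw, "iso-8859-1")
from_utf8 = resolve_document(raw, "utf-8")

# 352 is the UCS code point U+0160, whatever the declared encoding was.
print(from_latin1 == from_utf8 == "\u0160imek")
```

That is why a reference above 255 can display correctly on a page
served as ISO-8859-1: the number is looked up in the document character
set, not in the encoding.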
The other thing you are talking about is the character *encoding*, which
specifies how the sequence of bytes that makes up the document is to be
interpreted. Note that this happens before the document character set
is taken into consideration and is pretty much independent of it. For
example, knowledge of the character encoding is necessary to find
the "&" that starts a character reference. Not all character encodings
are capable of encoding all characters in the document character set,
which is why you need numeric character references to access characters
that the encoding cannot represent.
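A second sketch for the encoding side, again in Python and again only
an illustration: the function names are invented, and the round trip
assumes the text contains no literal "&" of its own. Python's
"xmlcharrefreplace" error handler writes exactly the kind of numeric
reference discussed above for anything ISO-8859-1 cannot represent:

```python
import html


def to_latin1_html(text: str) -> bytes:
    """Encode as ISO-8859-1; unencodable characters become &#N; references."""
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


def from_latin1_html(data: bytes) -> str:
    """Reverse the process: decode the bytes first, then resolve references."""
    return html.unescape(data.decode("iso-8859-1"))


name = "Ren\u00e9 \u0160imek"       # U+00E9 is in Latin 1, U+0160 is not
encoded = to_latin1_html(name)      # b'Ren\xe9 &#352;imek'
assert from_latin1_html(encoded) == name

# The encoding has to be known before references can even be located:
# in UTF-16 the "&" and "#" are not adjacent bytes.
utf16 = "&#352;".encode("utf-16-le")
assert b"&#" not in utf16
```

The character that fits the encoding goes out as a plain byte, the one
that does not goes out as a reference, and both come back as the same
characters once the reader decodes first and resolves references
second.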
--
Peter Eisentraut
http://developer.postgresql.org/~petere/