From: | Chapman Flack <jcflack(at)acm(dot)org> |
---|---|
To: | Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: XMLDocument (SQL/XML X030) |
Date: | 2025-01-20 19:09:27 |
Message-ID: | 678E9F67.5000709@acm.org |
Lists: | pgsql-hackers |
On 01/20/25 06:02, Jim Jones wrote:
> The DB2 "Document node constructors" might provide some insights into
> its behavior regarding well-formed XML documents [1]:
>
> "No validation is performed on the constructed document node. The XQuery
> document node constructor does not enforce the XML 1.0 rules that govern
> the structure of an XML document. For example, a document node is not
> required to have exactly one child that is an element node."
>
> This suggests that DB2's design reflects a different approach to
> handling XML, focusing less on enforcing XML 1.0 constraints. It appears
> to be more of a design philosophy regarding how XML is integrated into
> the database system as a whole, rather than just a difference in the
> implementation of the XMLDocument function.
Indeed. ISO SQL/XML changed significantly between the 2003 edition
(largely followed by PostgreSQL) and the 2006 and all later editions.
There's a rundown of those changes at [3].
> PostgreSQL does not support the RETURNING SEQUENCE or RETURNING CONTENT
> clauses explicitly. Instead, it implicitly uses RETURNING CONTENT[2] in
> functions that require it. Since RETURNING CONTENT implies that the
> output is a well-formed XML document (e.g., single-rooted),
In fact, you can't infer single-root-element-ness from RETURNING CONTENT,
according to the standard. Single-root-element-ness is checked by the
IS DOCUMENT predicate, and by XMLPARSE and XMLSERIALIZE when they specify
DOCUMENT. But it isn't checked or implied by the XMLDOCUMENT constructor.
That amounts to a bit of unfortunate punning on the word DOCUMENT,
but so help me that's what's in the standard.
It may help to think in terms of the hierarchy of XML types that the
2006 standard introduced (cribbed here from [3]):
                SEQUENCE
                   |
(?sequence of length 1, a document node)
                   |
             CONTENT(ANY)----------------.----------------(?every element
                   |                     |                  conforms to a
        (?every element has       (?no extraneous           schema)
         xdt:untyped and !nilled,     nodes)                   |
         every attribute has           |                       |
         xdt:untypedAtomic)       DOCUMENT(ANY)        CONTENT(XMLSCHEMA)
                   |                                           |
            CONTENT(UNTYPED)                        (?whole thing is valid
                   |                                 according to schema)
          (?no extraneous nodes)                               |
                   |                                  DOCUMENT(XMLSCHEMA)
            DOCUMENT(UNTYPED)
where the condition (?no extraneous nodes) is shorthand for SQL/XML's
more precise "whose `children` property has exactly one XQuery element
node, zero or more XQuery comment nodes, and zero or more XQuery
processing instruction nodes".
So that (?no extraneous nodes) condition is required for any of
the XML(DOCUMENT...) types. When you relax that condition, you have
an XML(CONTENT...) type.
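As a rough illustration of the (?no extraneous nodes) condition, here is a minimal Python sketch. It is not PostgreSQL code and nothing in it comes from the standard's text: the function name has_no_extraneous_nodes, the synthetic wrapper element, and the choice to ignore whitespace-only text between top-level nodes are all assumptions made for the example.

```python
from xml.dom import Node, minidom


def has_no_extraneous_nodes(content: str) -> bool:
    """Approximate the (?no extraneous nodes) condition for XML content.

    Assumptions (not from the standard): the content carries no XML
    declaration or DOCTYPE, and whitespace-only text between top-level
    nodes is ignored.
    """
    # Wrap the content in a hypothetical root element so that multi-rooted
    # or text-only content can still be parsed; the wrapper's children then
    # stand in for the `children` property of a document node.
    wrapped = minidom.parseString("<wrapper>" + content + "</wrapper>")
    elements = 0
    for child in wrapped.documentElement.childNodes:
        if child.nodeType == Node.ELEMENT_NODE:
            elements += 1
        elif child.nodeType in (Node.COMMENT_NODE,
                                Node.PROCESSING_INSTRUCTION_NODE):
            continue  # any number of comments and PIs is allowed
        elif child.nodeType == Node.TEXT_NODE and not child.data.strip():
            continue  # assumption: treat blank text as ignorable
        else:
            return False  # e.g. a non-blank text node
    return elements == 1
```

Under these assumptions, "<a/>" and "<!-- c --><a>x</a>" pass, while "<a/><b/>", bare text, and empty content do not, even though all of them are acceptable children of an XQuery document node.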
The XMLDOCUMENT constructor is so named because it constructs what
corresponds to an XQuery document node—which actually corresponds to
the XML(CONTENT...) SQL/XML types, and does not enforce having a
single root element:
"This data model is more permissive: a Document Node may be empty,
it may have more than one Element Node as a child, and it also
permits Text Nodes as children."[4]
So in terms of the SQL/XML type hierarchy, what you get back from
XMLDOCUMENT ... RETURNING CONTENT will have one of the XML(CONTENT...)
types (whether it's CONTENT(ANY) or CONTENT(UNTYPED) is left to the
implementation).
If you then want to know if it is single-rooted, you can apply the
IS DOCUMENT predicate, or try to cast it to an XML(DOCUMENT...) type.
(And if you use XMLDOCUMENT ... RETURNING SEQUENCE, then you get a
value of type XML(SEQUENCE). The sequence has length 1, a document
node, making it safely castable to XML(CONTENT(ANY)), but whether
you can cast it to an XML(DOCUMENT...) type will depend on what
children that document node has.)
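To make the casting remarks concrete, the hierarchy can be modeled as a small table of edges. This is only a sketch: the names SUPERTYPE and is_subtype are invented for the example, and the table records just the edges drawn in the diagram above, with each edge read as "satisfies one extra condition".

```python
# Each type maps to the type directly above it in the diagram.
# Only the edges drawn there are recorded (an assumption of this sketch).
SUPERTYPE = {
    "CONTENT(ANY)": "SEQUENCE",
    "CONTENT(UNTYPED)": "CONTENT(ANY)",
    "DOCUMENT(UNTYPED)": "CONTENT(UNTYPED)",
    "DOCUMENT(ANY)": "CONTENT(ANY)",
    "CONTENT(XMLSCHEMA)": "CONTENT(ANY)",
    "DOCUMENT(XMLSCHEMA)": "CONTENT(XMLSCHEMA)",
}


def is_subtype(sub: str, sup: str) -> bool:
    """Return True if `sup` is `sub` itself or lies above it in the diagram."""
    current = sub
    while current is not None:
        if current == sup:
            return True
        # Walk one edge upward; SEQUENCE has no entry, which ends the loop.
        current = SUPERTYPE.get(current)
    return False
```

In this model is_subtype("DOCUMENT(UNTYPED)", "CONTENT(ANY)") holds, so going from a DOCUMENT type to a CONTENT type is always safe, whereas is_subtype("CONTENT(ANY)", "DOCUMENT(ANY)") does not, which is why the cast in that direction depends on checking the actual children of the document node.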
Long story short, an XMLDOCUMENT constructor that enforced having
a single root element would be nonconformant.
Regards,
-Chap
> 1 - https://www.ibm.com/docs/en/db2/11.1?topic=constructors-document-node
> 2 - https://www.postgresql.org/docs/17/xml-limits-conformance.html
3 - https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#SQL.2FXML:2003_contrasted_with_SQL.2FXML_since_2006
4 - https://www.w3.org/TR/2010/REC-xpath-datamodel-20101214/#DocumentNode