From: | Jim Jones <jim(dot)jones(at)uni-muenster(dot)de> |
---|---|
To: | Chapman Flack <jcflack(at)acm(dot)org>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: XMLDocument (SQL/XML X030) |
Date: | 2025-01-20 19:56:28 |
Message-ID: | f44eb6cc-e0be-4041-a374-6231b9dcaefb@uni-muenster.de |
Lists: | pgsql-hackers |
Hi Chap,
Thanks for the thorough explanation!
On 20.01.25 20:09, Chapman Flack wrote:
>> PostgreSQL does not support the RETURNING SEQUENCE or RETURNING CONTENT
>> clauses explicitly. Instead, it implicitly uses RETURNING CONTENT[2] in
>> functions that require it. Since RETURNING CONTENT implies that the
>> output is a well-formed XML document (e.g., single-rooted),
> In fact, you can't infer single-root-element-ness from RETURNING CONTENT,
> according to the standard. Single-root-element-ness is checked by the
> IS DOCUMENT predicate, and by XMLPARSE and XMLSERIALIZE when they specify
> DOCUMENT. But it isn't checked or implied by the XMLDOCUMENT constructor.
>
> That amounts to a bit of unfortunate punning on the word DOCUMENT,
> but so help me that's what's in the standard.
Yeah, the term DOCUMENT seems a bit misleading in this context.
>
> It may help to think in terms of the hierarchy of XML types that the
> 2006 standard introduced (cribbed here from [3]):
>
>                    SEQUENCE
>                        |
>    (?sequence of length 1, a document node)
>                        |
>                  CONTENT(ANY)----------------.----------------(?every element
>                        |                     |                 conforms to a
>             (?every element has        (?no extraneous         schema)
>              xdt:untyped and !nilled,   nodes)                       |
>              every attribute has             |                       |
>              xdt:untypedAtomic)        DOCUMENT(ANY)        CONTENT(XMLSCHEMA)
>                        |                                             |
>                CONTENT(UNTYPED)                           (?whole thing is valid
>                        |                                   according to schema)
>             (?no extraneous nodes)                                   |
>                        |                                    DOCUMENT(XMLSCHEMA)
>                DOCUMENT(UNTYPED)
>
> where the condition (?no extraneous nodes) is shorthand for SQL/XML's
> more precise "whose `children` property has exactly one XQuery element
> node, zero or more XQuery comment nodes, and zero or more XQuery
> processing instruction nodes".
>
> So that (?no extraneous nodes) condition is required for any of
> the XML(DOCUMENT...) types. When you relax that condition, you have
> an XML(CONTENT...) type.
>
> The XMLDOCUMENT constructor is so named because it constructs what
> corresponds to an XQuery document node—which actually corresponds to
> the XML(CONTENT...) SQL/XML types, and does not enforce having a
> single root element:
>
> "This data model is more permissive: a Document Node may be empty,
> it may have more than one Element Node as a child, and it also
> permits Text Nodes as children."[4]
Thanks a lot for pointing that out! I guess it's clear now.
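For illustration only, the "(?no extraneous nodes)" condition quoted above
can be sketched as a small standalone C function. This is not PostgreSQL
code: the NodeKind enum and the children_fit_document() name are
hypothetical and exist only to restate the rule (exactly one element node,
any number of comment and processing-instruction nodes, nothing else).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node kinds, loosely following the XQuery data model. */
typedef enum
{
    NODE_ELEMENT,
    NODE_TEXT,
    NODE_COMMENT,
    NODE_PI                     /* processing instruction */
} NodeKind;

/*
 * Returns true when the children of a document node satisfy the
 * "(?no extraneous nodes)" condition: exactly one element node, plus
 * zero or more comment and processing-instruction nodes.
 *
 * A false result does not mean the value is invalid. It only means the
 * value belongs to an XML(CONTENT...) type rather than XML(DOCUMENT...).
 */
bool
children_fit_document(const NodeKind *kids, size_t n)
{
    size_t      elements = 0;

    assert(kids != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        switch (kids[i])
        {
            case NODE_ELEMENT:
                elements++;
                break;
            case NODE_COMMENT:
            case NODE_PI:
                break;          /* allowed in any number */
            default:
                return false;   /* e.g. a top-level text node */
        }
    }

    return elements == 1;
}
```

Under this sketch, the children {comment, element, PI} pass the check, while
an empty list, two elements, or any text node fail it. All of those are
still legal children of a document node, which is why a single-root check
belongs in IS DOCUMENT and not in XMLDOCUMENT.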
>
> So in terms of the SQL/XML type hierarchy, what you get back from
> XMLDOCUMENT ... RETURNING CONTENT will have one of the XML(CONTENT...)
> types (whether it's CONTENT(ANY) or CONTENT(UNTYPED) is left to the
> implementation).
>
> If you then want to know if it is single-rooted, you can apply the
> IS DOCUMENT predicate, or try to cast it to an XML(DOCUMENT...) type.
>
> (And if you use XMLDOCUMENT ... RETURNING SEQUENCE, then you get a
> value of type XML(SEQUENCE). The sequence has length 1, a document
> node, making it safely castable to XML(CONTENT(ANY)), but whether
> you can cast it to an XML(DOCUMENT...) type will depend on what
> children that document node has.)
>
> Long story short, an XMLDOCUMENT constructor that enforced having
> a single root element would be nonconformant.
>
If I understand correctly, the compliant approach would be to always
treat the input expression as CONTENT:
    PG_RETURN_XML_P(xmlparse((text *) data, XMLOPTION_CONTENT, true));
Is that right?
>
>> 1 - https://www.ibm.com/docs/en/db2/11.1?topic=constructors-document-node
>> 2 - https://www.postgresql.org/docs/17/xml-limits-conformance.html
> 3 - https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#SQL.2FXML:2003_contrasted_with_SQL.2FXML_since_2006
> 4 - https://www.w3.org/TR/2010/REC-xpath-datamodel-20101214/#DocumentNode
>
Best, Jim