Re: Issue: Deprecation of the XML2 module 'xml_is_well_formed' function

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Mike Fowler <mike(at)mlfowler(dot)com>
Cc: Mike Rylander <mrylander(at)gmail(dot)com>, Mike Berrow <mberrow(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Issue: Deprecation of the XML2 module 'xml_is_well_formed' function
Date: 2010-07-01 22:41:31
Lists: pgsql-hackers

On Thu, Jul 1, 2010 at 12:25 PM, Mike Fowler <mike(at)mlfowler(dot)com> wrote:
> Quoting Mike Fowler <mike(at)mlfowler(dot)com>:
>> Should the IS DOCUMENT predicate support this? At the moment you get
>> the following:
>> template1=# SELECT
>> '<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>'
>>  IS DOCUMENT;
>> ?column?
>> ----------
>> t
>> (1 row)
>> template1=# SELECT
>> '<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns'
>>  IS DOCUMENT;
>> ERROR:  invalid XML content
>> LINE 1: SELECT '<towns><town>Bidford-on-Avon</town><town>Cwmbran</to...
>>              ^
>> DETAIL:  Entity: line 1: parser error : expected '>'
>> owns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns
>>      ^
>> Entity: line 1: parser error : chunk is not well balanced
>> owns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns
>>      ^
>> I would've hoped the second would've returned 'f' rather than failing.
>> I've had a glance at the XML/SQL standard and I don't see anything in
>> the detail of the predicate (8.2) that would specifically prohibit us
>> from changing this behavior, unless the common rule  'Parsing a string
>> as an XML value' (10.16) must always be in force. I'm no standard
>> expert, but IMHO this would be an acceptable change to improve
>> usability. What do others think?
> Right, I've answered my own question whilst sitting in the open source
> coding session at CHAR(10). Yes, IS DOCUMENT should return false for a
> non-well formed document, and indeed is coded to do such. However, the
> conversion to the xml type which happens before the underlying
> xml_is_document function is even called fails and exceptions out. I'll work
> on a patch to resolve this behavior such that IS DOCUMENT will give you the
> missing 'xml_is_well_formed' function.

I think the point of "IS DOCUMENT" is to distinguish a document:

<foo>some stuff<bar/><baz/></foo>

from a document fragment:


A document is allowed only one top-level tag.

It'd be nice, I think, to have a function that tells you whether
something is legal XML without throwing an error if it isn't, but I
suspect that should be a separate function, rather than trying to jam
it into "IS DOCUMENT".
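
To make the distinction concrete, here is a rough sketch of the two checks
kept as separate functions. It is written in Python with the standard
library's xml.etree.ElementTree rather than against the backend, so it only
illustrates the idea. The function names and the <wrapper> trick are
assumptions made up for this example, not part of xml2 or core PostgreSQL.

```python
import xml.etree.ElementTree as ET


def is_well_formed_document(text: str) -> bool:
    """Return True if text parses as an XML document (one top-level element).

    Parse failures are reported as False instead of being raised.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


def is_well_formed_content(text: str) -> bool:
    """Return True if text is well-formed XML content (document or fragment).

    Assumption for this sketch: wrapping the input in a dummy root element is
    good enough. It rejects input that carries its own XML declaration or
    DOCTYPE, and it can be fooled by input that closes the wrapper itself.
    """
    return is_well_formed_document(f"<wrapper>{text}</wrapper>")


if __name__ == "__main__":
    samples = [
        "<foo>some stuff<bar/><baz/></foo>",        # document
        "<foo>some stuff</foo><bar/><baz/>",        # illustrative fragment
        "<towns><town>Bristol</town></towns",       # missing final '>'
    ]
    for s in samples:
        print(is_well_formed_content(s), is_well_formed_document(s), s)
```

With these definitions the single-root string passes both checks, a string
with several top-level tags passes only the content check, and the truncated
<towns> string fails both without raising.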

Robert Haas
The Enterprise Postgres Company
