Re: Issue: Deprecation of the XML2 module 'xml_is_well_formed' function

From: Mike Fowler <mike(at)mlfowler(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Mike Rylander <mrylander(at)gmail(dot)com>, Mike Berrow <mberrow(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Issue: Deprecation of the XML2 module 'xml_is_well_formed' function
Date: 2010-07-02 13:07:15
Message-ID: 20100702140715.7ro57sh9ck480wcs@www.mlfowler.com
Lists: pgsql-hackers

Quoting Robert Haas <robertmhaas(at)gmail(dot)com>:

>
> I think the point of "IS DOCUMENT" is to distinguish a document:
>
> <foo>some stuff<bar/><baz/></foo>
>
> from a document fragment:
>
> <bar/><baz/>
>
> A document is allowed only one top-level tag.
>
> It'd be nice, I think, to have a function that tells you whether
> something is legal XML without throwing an error if it isn't, but I
> suspect that should be a separate function, rather than trying to jam
> it into "IS DOCUMENT".
>
> http://developer.postgresql.org/pgdocs/postgres/functions-xml.html#AEN15187
>

I've submitted a patch that implements this to the bug report I filed
yesterday. The way I read the standard (I'm only reading a draft, and
I'm no expert), I don't see that it mandates that IS DOCUMENT return
false when IS CONTENT would return true. So if IS CONTENT were
implemented, you could determine that a value is malformed by
testing:

val IS NOT DOCUMENT AND val IS NOT CONTENT

I think having the direct predicate support would be useful for
columns of text where you know that some, though possibly not all,
text values are valid XML.
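
To make the logic of that predicate concrete, here is a rough sketch in
Python rather than SQL. It is only a model of the distinction, not how
PostgreSQL or the patch implements it. The function names are made up
for illustration, and "content" is approximated by wrapping the value
in a dummy root element, which is an assumption of this sketch (it
would, for example, reject a value that starts with an XML
declaration).

```python
import xml.etree.ElementTree as ET


def is_document(val: str) -> bool:
    """Hypothetical stand-in for IS DOCUMENT: exactly one top-level element."""
    try:
        ET.fromstring(val)
        return True
    except ET.ParseError:
        return False


def is_content(val: str) -> bool:
    """Hypothetical stand-in for IS CONTENT.

    Assumption: a fragment is acceptable if it parses once wrapped in a
    dummy root element.
    """
    try:
        ET.fromstring(f"<wrapper>{val}</wrapper>")
        return True
    except ET.ParseError:
        return False


def is_malformed(val: str) -> bool:
    """Models: val IS NOT DOCUMENT AND val IS NOT CONTENT."""
    return not is_document(val) and not is_content(val)


if __name__ == "__main__":
    samples = [
        "<foo>some stuff<bar/><baz/></foo>",  # document from the quoted mail
        "<bar/><baz/>",                       # fragment from the quoted mail
        "<foo><bar></foo>",                   # mismatched tags
    ]
    for s in samples:
        print(s, is_document(s), is_content(s), is_malformed(s))
```

In this model a document also counts as content, which matches the
reading above that IS DOCUMENT need not be false when IS CONTENT is
true, so only a value that fails both tests is reported as malformed.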

--
Mike Fowler
Registered Linux user: 379787
