Re: xmlconcat (was 9.0 release notes done)

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: xmlconcat (was 9.0 release notes done)
Date: 2010-03-24 18:51:09
Message-ID: 4BAA5F1D.8070308@dunslane.net
Lists: pgsql-hackers

Peter Eisentraut wrote:
> On Mon, 2010-03-22 at 19:38 -0400, Andrew Dunstan wrote:
>
>>> But if we are not comfortable about being able to do that safely, I
>>> would be OK with just raising an error if a concatenation is attempted
>>> where one value contains a DTD. The impact in practice should be low.
>>
>> Right. Can you find a way to do that using the libxml API? I haven't
>> managed to, and I'm pretty sure I can construct XML that fails every
>> simple string search test I can think of, either with a false negative
>> or a false positive.
>
> The documentation on that is terse as usual. In any case, you will need
> to XML parse the input values, and so you might as well resort to
> parsing the output value to see if it is well-formed, which should catch
> this mistake and possibly others.
>
>

Actually, I have come to the conclusion that the biggest problem in this
area is that we accept XML documents with a leading DOCTYPE node at all.
Our docs state:

The xml type can store well-formed "documents", as defined by the
XML standard, as well as "content" fragments, which are defined by
the production XMLDecl? content in the XML standard.

A document with a leading DOCTYPE node matches neither of these rules,
and when we strip the XMLDecl from a piece of XML where it's followed by
a DOCTYPE node we turn something that is legal XML into something that
isn't, even by our own (or possibly the standard's) relaxed definition.
Under the XML 1.1 prolog production, a doctypedecl can only follow an
XMLDecl; see
<http://www.w3.org/TR/2006/REC-xml11-20060816/#sec-prolog-dtd>.
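
To make the failure concrete, here is a minimal sketch in Python. It uses the standard library's expat binding rather than libxml, and the helper names (is_document, is_content, naive_concat), the regex-based XMLDecl stripping, and the dummy wrapper element are all illustrative assumptions, not what the backend actually does. It only shows that a value with a leading DOCTYPE stops being parseable as "content" once its XMLDecl is stripped, and that parsing the concatenated output would catch this.

```python
import re
from xml.parsers import expat

# Hypothetical helper pattern: matches a leading XML declaration only.
XMLDECL = re.compile(r"^\s*<\?xml\s[^>]*\?>")

def is_document(text):
    """Return True if text parses as a well-formed XML document."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(text, True)
        return True
    except expat.ExpatError:
        return False

def is_content(text):
    """Return True if text, minus a leading XMLDecl, parses as content.

    Wrapping the remainder in a dummy element is an approximation of
    the 'content' production, used here purely for illustration.
    """
    body = XMLDECL.sub("", text, count=1)
    return is_document("<wrapper>" + body + "</wrapper>")

def naive_concat(*values):
    """Strip each value's XMLDecl and glue the remainders together."""
    return "".join(XMLDECL.sub("", v, count=1) for v in values)

with_dtd = '<?xml version="1.0"?><!DOCTYPE a [<!ELEMENT a EMPTY>]><a/>'
plain = "<b/>"
result = naive_concat(plain, with_dtd)

print(is_document(with_dtd))  # the input is fine as a document
print(is_content(with_dtd))   # but not as content once the XMLDecl is gone
print(is_content(result))     # so the concatenation is rejected on re-parse
```

With these inputs the first check succeeds and the other two fail, because a doctypedecl is not allowed inside element content or after another element.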

So I think we need to go back to the drawing board a bit, rather than
patch one particular reported error case. But these problems are not at
all new in 9.0, and with beta coming up (as I hope it is), now is not
the time for it. I think it will have to wait until 9.1.

cheers

andrew
