Re: Review of patch Bugfix for XPATH() if expression returns a scalar value

From: Florian Pflug <fgp(at)phlo(dot)org>
To: Radosław Smogura <rsmogura(at)softperience(dot)eu>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Review of patch Bugfix for XPATH() if expression returns a scalar value
Date: 2011-06-30 10:04:39
Message-ID: 316D7366-0194-4836-AE6A-D99ACBAE9996@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Jun30, 2011, at 10:33 , Radosław Smogura wrote:
> You may manually enter invalid xml too, You don't need xpath for this.

Please give an example, instead of merely asserting
I get

postgres=# select '<'::xml;
ERROR: invalid XML content at character 8

> Much more
> In PostgreSQL 9.0.1, compiled by Visual C++ build 1500, 64-bit
> I executed
> SELECT XMLELEMENT(name "a", XMLATTRIBUTES('<node/>'::xml as v))
> and it's looks like I got correct output "<a v="&lt;node/&gt;"/>"
> when I looked with text editor into table files I saw same value.

I fail to see your point. XMLATTRIBUTES currently escapes
the value unconditionally, which happens to work as expected in
this case. But try
SELECT XMLELEMENT(name "a", XMLATTRIBUTES('&lt;'::xml as v))
which returns
<a v="&amp;lt;"/>
which I guess isn't what one would expect, otherwise one wouldn't
have added the cast to xml.

I believe the latter example should probably return
<a v="&lt;"/>
instead, just as XMLELEMENT(name "a", '&lt;'::xml) returns
<a>&lt;</a>

But again, this has little to do with the patch at hand, and should
therefore be discussed separately. The fact that XPATH's failure to
escape it's return value happens to fit XMLATTRIBUTE's failure to
not escape it's input value doesn't prove that both functions are
correct. Especially not if other functions like XMLELEMENT *don't*
escape their input values of they're already of type XML.

> But indeed, now PostgreSQL is not type safe against XML type. See
> SELECT XMLELEMENT(name "root", '<', (xpath('//text()', '<root>&lt;</root>'))[1])).

That is *exactly* what my other patch fixes, the one you deem
undesirable. With that other patch applied, I get

xmlelement
-----------------------
<root>&lt;&lt;</root>

wich is correct. So here too I fail to see your point.

> You found some problem with escaping, but solution is bad,
> I think problem lies with XML type, which may be used to hold strings
> and proper xml.

But it's not *supposed* to hold arbitrary strings. If it was, the
input function wouldn't check whether or not the value was valid or
not.

Do you agree that, for any type, "SELECT value_of_type_X::text::X"
should never throw an error? Because that, IMHO quite basic, guarantee
is broken without proper escaping of XML values. Try

SELECT (XPATH('/text()', '<t>&lt;</t>'::xml))[1]::text::xml

on unpatched 9.0. It fails, which IMNSHO is very clearly a bug.

> For example above query can't distinguish if
> (xpath('//text()', '<root>&lt;</root>'))[1] is xml text, or xml element.

Text *is* a kind of element in XML. There are different types of
nodes, one of which are is "text node". The fact that it's not surrounded
by <..>, </..> doesn't make a different, and doesn't change the quoting
rules one bit.

> For me adding support for scalar is really good and needed, but escaping is not.

So far, I remain unconvinced by your arguments. What you argue for
might be sensible if we *didn't* have an XML type and instead simply
had stuff like XPATH() for type TEXT. But we *do* have a type XML,
which *does* validate it's input, so I fail to see how allowing values
of type XML which *don't* pass that input validation can be anything but
a bug.

Compare this to the textual types. We've gone through some pain in
the past to guarantee that textual types never hold values which aren't
valid text in the database's encoding (e.g, we prevent textual values
from containing bytes sequences which aren't valid UTF-8 if the database
encoding is UTF-8). It follows that we should do the same for XML types.

best regards,
Florian Pflug

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Florian Pflug 2011-06-30 10:28:49 Re: Range Types, constructors, and the type system
Previous Message 花田 茂 2011-06-30 10:00:23 Re: Bug in SQL/MED?