Skip site navigation (1) Skip section navigation (2)

Patch Review: Bugfix for XPATH() if text or attribute nodes are selected

From: Radosław Smogura <rsmogura(at)softperience(dot)eu>
To: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Patch Review: Bugfix for XPATH() if text or attribute nodes are selected
Date: 2011-06-29 17:34:23
Message-ID: 201106291934.23089.rsmogura@softperience.eu (view raw or flat)
Thread:
Lists: pgsql-hackers
Review of patch 
https://commitfest.postgresql.org/action/patch_view?id=580

=== Patch description ===
SUMMARY: When text() based XPATH expression is invoked output is not XML 
escaped

DESCRIPTION: Submitter invokes following statement:
SELECT (XPATH('/*/text()', '<root>&lt;</root>'))[1].
He expect (escaped) result "&lt;", but gets "<"

AFFECTS: Possible this may affects situations when user wants to insert output 
from above expression to XML column.

PATCH CONTENT:
A. 1. Patch fixes above problem (I don't treat this like problem, but like 
enhancement).
A. 2. In addition patch contains test cases for above.
A. 3. Patch changes behaviour of method xml_xmlnodetoxmltype invoked by 
xpath_internal, by adding escape_xml() call.
A. 4. I don't see any stability issues with this.
A. 5. Performance may be reduced and memory consumption may 
increase due to internals of method escape_xml

=== Review ===
B. 1. Patch applies cleanly.

B. 2. Source compiles, and patch works as Submitter wants.

B. 3. Personally I missed some notes in documentation that such 
expression will be escaped (those should be clearly exposed, as the "live" 
functionality is changed).

B. 4. Submitter, possible, revealed some missed, minor functionality of 
PostgreSQL XML. As he expects XML escaped output.

B. 5. Currently XPATH produces escaped output for Element nodes, and 
not escaped output for all other types of nodes (including text, 
comment, etc.)

B. 6. Current behaviour _is intended_ (there is "if"  to check node type) 
and _"natural"_. In this particular case user ask for text content of some 
node, and this content is actually "<".

B. 7. Similar behaviour may be observer e. g. in SQL Server(tm)
SELECT x.value('(//text())[1]', 'varchar(256)') FROM #xml_t;
Produced "<"

B. 8. Even if current behaviour may be treated as wrong it was exposed 
and other may depends on not escaped content.

B. 9. I think, correct approach should go to create new function 
(basing on one existing) that will be able to escape above. In this 
situation
call should look like (for example only!):
SELECT xml_escape((XPATH('/*/text()', '<root>&lt;</root>')))[1]
or
SELECT xml_escape((XPATH('/*/text()', '<root>&lt;</root>'))[1])
One method to operate on array one to operate on single XML datum.
Or to think about something like xmltext(). (Compare current status of xml.c)

B. 10. If such function (B.9.) is needed and if it will be included is out of 
scope of this review.

Basing mainly on A.1, B.6., B.8., bearing in mind B.10., in my opinion this is 
subject to reject as "need more work", "or as invalid".

The detailed explanation why such behaviour should not be implemented I will 
send in review of https://commitfest.postgresql.org/action/patch_view?id=565.

Regards,

Radosław Smogura

P. S.
I would like to say sorry, for such late answaer, but I sent this from other 
mail address, which was not attached to mailing list. Blame KDE KMail not me 
:)

Responses

pgsql-hackers by date

Next:From: Robert HaasDate: 2011-06-29 17:42:34
Subject: Re: default privileges wording
Previous:From: Hitoshi HaradaDate: 2011-06-29 17:22:07
Subject: Re: Parameterized aggregate subquery (was: Pull up aggregate subquery)

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group