xpath does not seem to escape HTML correctly

From: John Lamb <john(dot)lamb(dot)meng(at)gmail(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: xpath does not seem to escape HTML correctly
Date: 2014-09-03 23:47:12
Message-ID: CALr6pkhSe20gh5Hci1H=uT_7QE4av0m9h2eQMjqUX6D6AD9H1Q@mail.gmail.com
Lists: pgsql-sql

I am parsing XML into a table using xpath and am having trouble with
HTML-encoded characters: some escaped entities are being converted,
but others are not.

Before being embedded in the XML, the raw data is: !"£$%^&*()<>

The raw XML is as follows: <PLAN>!&quot;&#163;$%^&amp;*()&lt;&gt;</PLAN>

When I run this through xpath, as suggested in the post below, I find that
some entities are converted and some aren't:

select (xpath('/PLAN/text()',
('<PLAN>!&quot;&#163;$%^&amp;*()&lt;&gt;</PLAN>')::xml))[1]::text

The result is: !"£$%^&amp;*()&lt;&gt;
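
To narrow this down, each entity can be run through the same xpath call on
its own. This is only a rough sketch: the column names "entity" and "result"
are made up for illustration, and it assumes a database encoding (such as
UTF8) that can represent the pound sign.

```sql
-- Run each escaped entity through the same xpath() call in isolation,
-- so the input and the returned text can be compared side by side.
-- "entity" and "result" are illustrative names, not part of any API.
SELECT e.entity,
       (xpath('/PLAN/text()',
              ('<PLAN>' || e.entity || '</PLAN>')::xml))[1]::text AS result
FROM (VALUES ('&quot;'), ('&#163;'), ('&amp;'), ('&lt;'), ('&gt;')) AS e(entity);
```

Any row where "result" still shows the escaped form is one that xpath has
left unconverted.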

It seems the &, < and > characters are not being unescaped, while the quote
and GBP symbols are fine. When I run the query from the post below, "magic"
and "toaster" work, but I still get "s&amp;witch".

http://www.postgresql.org/message-id/jm6hla$bld$1@reversiblemaps.ath.cx

I am using PostgreSQL 9.3 on Windows, but I see the same behaviour on 9.1 on Linux.

Does anyone have any idea what the problem is here, or do you think it's
some kind of bug?
