Re: extract text from XML

From: Tobias Bussmann <t(dot)bussmann(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: chris(at)pacejo(dot)net
Subject: Re: extract text from XML
Date: 2016-08-11 14:43:58
Message-ID: 74B01E67-14A6-46F1-8812-E8E3E58AD861@gmx.net
Lists: pgsql-hackers

> I have found a basic use case which is supported by the xml2 module,
> but is unsupported by the new XML API.
> It is not possible to correctly extract text

Indeed. I came across this shortcoming myself some months ago, but reporting it here, as the deprecation notice at https://www.postgresql.org/docs/devel/static/xml2.html#AEN180625 asks for, was still an item on my ToDo list. Done, thanks ;)

I did some archive browsing on that topic. The issue (if you want to call it that) was introduced by a patch ensuring that xpath() always returns xml, applied for 9.2 after some discussion: https://www.postgresql.org/message-id/201106291934.23089.rsmogura%40softperience.eu and has been known since then: https://www.postgresql.org/message-id/1409795403248-5817667.post%40n5.nabble.com The new behaviour was later reported as a bug and discussed again: https://www.postgresql.org/message-id/CAAY5AM1L83y79rtOZAUJioREO6n4%3DXAFKcGu6qO3hCZE1yJytg%40mail.gmail.com

Anyhow - (un)escaping functions to support the text<->xml conversion are often talked about but still seem to be found only in the xml2 module. Given the patch implementing xmltable that was posted here recently, these functions would be another step towards finally making the contrib module obsolete.

> Perhaps a function xpath_value(text, xml) -> text[] would close the gap?

Such a design, resembling the xml2 behaviour, would certainly fit the need, imho.
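
To make the gap concrete outside the database, here is a minimal Python sketch of the semantics such a function might have. It is only an analogy: the Python xpath_value below is a hypothetical stand-in for the proposed SQL function, ElementTree supports only a small XPath subset, and the escape() call merely imitates what an xml-typed result looks like once it is cast to text.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def xpath_value(path, document):
    """Hypothetical analogue of the proposed xpath_value(text, xml) -> text[].

    Returns the unescaped text content of every element matched by path.
    """
    root = ET.fromstring(document)
    # itertext() yields already-unescaped character data of the node
    return ["".join(node.itertext()) for node in root.findall(path)]


doc = "<root><item>a &lt; b &amp; c</item><item>plain</item></root>"

# Text values, as the xml2-style functions would deliver them
values = xpath_value("./item", doc)

# The same values in serialized (still escaped) form, which is what
# remains when an xml result is simply cast to text
escaped = [escape(v) for v in values]

print(values)
print(escaped)
```

The first list holds the real characters (<, &), the second one the entity references; the missing piece discussed in this thread is a supported way to get from the second form to the first.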

regards
Tobias
