Re: patch: function xmltable

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch: function xmltable
Date: 2016-09-27 03:40:30
Message-ID: CAFj8pRDSADePLnc-EiT73i7k-RNdNrnupDiNqGNr=eaHnnpz0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2016-09-27 3:34 GMT+02:00 Craig Ringer <craig(at)2ndquadrant(dot)com>:

> On 24 September 2016 at 14:01, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
>
> >> Did some docs copy-editing and integrated some examples. Explained how
> >> nested elements work, that multiple top level elements is an error,
> >> etc. Explained the time-of-evaluation stuff. Pointed out that you can
> >> refer to prior output columns in PATH and DEFAULT, since that's weird
> >> and unusual compared to normal SQL. Documented handling of multiple
> >> node matches, including the surprising results of somepath/text() on
> >> <somepath>x<!--blah-->y</somepath>. Documented handling of nested
> >> elements. Documented that xmltable works only on XML documents, not
> >> fragments/forests.
> >
> >
> > I don't understand to this sentence: "It is possible for a PATH
> expression
> > to reference output columns that appear before it in the column-list, so
> > paths may be dynamically constructed based on other parts of the XML
> > document:"
>
>
>
> >> The docs and tests don't seem to cover XML entities. What's the
> >> behaviour there? Core XML only defines one entity, but if a schema
> >> defines more how are they processed? The tests need to cover the
> >> predefined entities &quot; &amp; &apos; &lt; and &gt; at least.
> >
> >
> > I don't understand, what you are propose here. ?? Please, can you send
> some
> > examples.
>
> Per below - handling of DTD <!ENTITY> declarations, and the builtin
> entity tests I already added tests for.
>
>
> >> It doesn't seem to cope with internal DTDs at all (libxml2 limitation?):
> >>
> >> SELECT * FROM xmltable('/' PASSING $XML$<?xml version="1.0"
> >> standalone="yes" ?>
> >> <!DOCTYPE foo [
> >> <!ELEMENT foo (#PCDATA)>
> >> <!ENTITY pg "PostgreSQL">
> >> ]>
> >> <foo>Hello &pg;.</foo>
> >> $XML$ COLUMNS foo text);
> >>
> >> + ERROR: invalid XML content
> >> + LINE 1: SELECT * FROM xmltable('/' PASSING $XML$<?xml version="1.0"
> ...
> >> + ^
> >> + DETAIL: line 2: StartTag: invalid element name
> >> + <!DOCTYPE foo [
> >> + ^
> >> + line 3: StartTag: invalid element name
> >> + <!ELEMENT foo (#PCDATA)>
> >> + ^
> >> + line 4: StartTag: invalid element name
> >> + <!ENTITY pg "PostgreSQL">
> >> + ^
> >> + line 6: Entity 'pg' not defined
> >> + <foo>Hello &pg;.</foo>
> >> + ^
> >>
> >
> > It is rejected before XMLTABLE function call
> >
> > postgres=# select $XML$<?xml version="1.0" standalone="yes" ?>
> > postgres$# <!DOCTYPE foo [
> > postgres$# <!ELEMENT foo (#PCDATA)>
> > postgres$# <!ENTITY pg "PostgreSQL">
> > postgres$# ]>
> > postgres$# <foo>Hello &pg;.</foo>
> > postgres$# $XML$::xml;
> > ERROR: invalid XML content
> > LINE 1: select $XML$<?xml version="1.0" standalone="yes" ?>
> > ^
> > DETAIL: line 2: StartTag: invalid element name
> > <!DOCTYPE foo [
> [snip]
>
> > It is disabled by default in libxml2. I found a function
> > xmlSubstituteEntitiesDefault http://www.xmlsoft.org/entities.html
> > http://www.xmlsoft.org/html/libxml-parser.html#
> xmlSubstituteEntitiesDefault
> >
> > The default behave should be common for all PostgreSQL's libxml2 based
> > function - and then it is different topic - maybe part for PostgreSQL
> ToDo?
> > But I don't remember any user requests related to this issue.
>
>
> OK, so it's not xmltable specific. Fine by me.
>
> Somebody who cares can deal with it. There's clearly nobody breaking
> down the walls wanting the feature.
>
> > I removed this tests - it is not related to XMLTABLE function, but to
> > generic XML processing/validation.
>
>
> Good plan.
>
> >> In +ExecEvalTableExpr, shouldn't you be using PG_ENSURE_ERROR_CLEANUP
> >> instead of a PG_TRY() / PG_CATCH() block?
> >
> >
> > If I understand to doc, the PG_ENSURE_ERROR_CLEANUP should be used, when
> you
> > want to catch FATAL errors (and when you want to clean shared memory).
> > XMLTABLE doesn't use shared memory, and doesn't need to catch fatal
> errors.
>
> Ok, makes sense.
>
>
> >> I don't have the libxml knowledge or remaining brain to usefully
> >> evaluate the xpath and xml specifics in xpath.c today. It does strike
> >> me that the new xpath parser should probably live in its own file,
> >> though.
> >
> > moved
>
> Thanks.
>
>
> > new version is attached
>
>
> Great.
>
> I'm marking this ready for committer at this point.
>

Thank you very much

Regards

Pavel

>
> I think the XML parser likely needs a more close reading, so I'll ping
> Peter E to see if he'll have a chance to check that bit out. But by
> and large I think the issues have been ironed out - in terms of
> functionality, structure and clarity I think it's looking solid.
>
> --
> Craig Ringer http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2016-09-27 03:53:06 Re: patch: function xmltable
Previous Message Etsuro Fujita 2016-09-27 03:18:10 Re: Push down more full joins in postgres_fdw