Re: patch: function xmltable

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch: function xmltable
Date: 2017-01-31 10:26:32
Message-ID: CAFj8pRD88BHuDyRf5AhS5OudOvKuwiANwYJj_kBORN9=d9zfsg@mail.gmail.com
Lists: pgsql-hackers

2017-01-24 21:38 GMT+01:00 Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>:

> Pavel Stehule wrote:
>
> > * SELECT (xmltable(..)).* + regress tests
> > * compilation and regress tests without --with-libxml
>
> Thanks. I just realized that this is doing more work than necessary --
> I think it would be simpler to have tableexpr fill a tuplestore with the
> results, instead of just expecting function execution to apply
> ExecEvalExpr over and over to obtain the results. So evaluating a
> tableexpr returns just the tuplestore, which function evaluation can
> return as-is. That code doesn't use the value-per-call interface
> anyway.
>
> I also realized that the expr context callback is not called if there's
> an error, which leaves us without shutting down libxml properly. I
> added PG_TRY around the fetchrow calls, but I'm not sure that's correct
> either, because there could be an error raised in other parts of the
> code, after we've already emitted a few rows (for example out of
> memory). I think the right way is to have PG_TRY around the execution
> of the whole thing rather than just row at a time; and the tuplestore
> mechanism helps us with that.
>
> I think it would be good to have a more complex test case in regress --
> let's say there is a table with some simple XML values, then we use
> XMLFOREST (or maybe one of the table_to_xml functions) to generate a
> large document, and then XMLTABLE uses that document as input document.
>

I have a real XML document that is 16K lines long and about 6 MB. We probably
would not want to add it to the regression tests.

xmltable is really fast on it: the customer's original implementation took
20 min, an implementation based on our nested xpath calls took 10 sec, a
PL/Python XML reader took 5 sec, and xmltable takes 400 ms.

I plan to create tests based on pg_proc and CTEs. If everything works, the
query must return no rows:

with x as (
  select proname, proowner, procost, pronargs,
         array_to_string(proargnames, ',') as proargnames,
         array_to_string(proargtypes, ',') as proargtypes
    from pg_proc
), y as (
  select xmlelement(name proc, xmlforest(proname, proowner, procost,
                    pronargs, proargnames, proargtypes)) as proc
    from x
), z as (
  select xmltable.*
    from y, lateral xmltable('/proc' passing proc
           columns proname name, proowner oid, procost float,
                   pronargs int, proargnames text, proargtypes text)
)
select * from z except select * from x;
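
The round-trip idea behind this query can be modelled outside the database. The following is only a rough sketch in Python, assuming made-up sample rows in place of pg_proc and using the standard xml.etree module as a stand-in for xmlelement/xmlforest and xmltable; the helper names row_to_xml and xml_to_row are hypothetical. It shows why an empty result means the XML generation and the XML parsing agree.

```python
import xml.etree.ElementTree as ET

# Column names from the query above, with a Python type standing in
# for each SQL column type declared in the xmltable column list.
COLUMNS = ("proname", "proowner", "procost", "pronargs",
           "proargnames", "proargtypes")
CASTS = (str, int, float, int, str, str)

# Made-up sample rows (assumption: these are not real pg_proc contents).
rows = [
    ("func_a", 10, 1.0, 1, "", "2275"),
    ("func_b", 10, 1.0, 2, "a,b", "23,23"),
    ("func_c", 10, 100.0, 0, "", ""),
]

def row_to_xml(row):
    """Build one <proc> document per row, one child element per column."""
    proc = ET.Element("proc")
    for name, value in zip(COLUMNS, row):
        ET.SubElement(proc, name).text = str(value)
    return ET.tostring(proc, encoding="unicode")

def xml_to_row(doc):
    """Parse a <proc> document back into a typed tuple."""
    proc = ET.fromstring(doc)
    # An empty element parses with text None, so map that back to "".
    return tuple(cast(proc.find(name).text or "")
                 for name, cast in zip(COLUMNS, CASTS))

x = set(rows)                                    # the source rows
z = {xml_to_row(row_to_xml(r)) for r in rows}    # rows after the round trip

leftover = z - x    # counterpart of "select * from z except select * from x"
missing = x - z     # the reverse direction, which the query does not check
print(leftover, missing)
```

The reverse difference is included because z EXCEPT x alone would also be empty if the parsing side returned no rows at all.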

>
> Please fix.
>
> --
> Álvaro Herrera https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
