Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: pavel(dot)stehule(at)gmail(dot)com
Cc: thomas(dot)munro(at)enterprisedb(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)
Date: 2018-01-23 07:13:00
Message-ID: 20180123.161300.105501004.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, I returned to this.

I thouroughly checked the translator's behavior against the XPath
specifications and checkd out the documentation and regression
test. Almost everything is fine for me and this would be the last
comment from me.

At Fri, 24 Nov 2017 18:32:43 +0100, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote in <CAFj8pRB7Fs_2DrtUTGhTmQb+KReXPH6SG62hGWO3KVL_eZYCaA(at)mail(dot)gmail(dot)com>
> 2017-11-24 18:13 GMT+01:00 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
>
> >
> >
> > 2017-11-24 17:53 GMT+01:00 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> >
> >> Hi
> >>
> >> 2017-11-22 22:49 GMT+01:00 Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>:
> >>
> >>> On Thu, Nov 9, 2017 at 10:11 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> >>> wrote:
> >>> > Attached new version.
> >>>
> >>> Hi Pavel,
> >>>
> >>> FYI my patch testing robot says[1]:
> >>>
> >>> xml ... FAILED
> >>>
> >>> regression.diffs says:
> >>>
> >>> + SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'),
> >>> '/rows/row' PASSING t1.doc COLUMNS data int PATH
> >>> 'child::a[1][attribute::hoge="haha"]') as x;
> >>> + data
> >>> + ------
> >>> + (0 rows)
> >>> +
> >>>
> >>> Maybe you forgot to git-add the expected file?
> >>>
> >>> [1] https://travis-ci.org/postgresql-cfbot/postgresql/builds/305979133
> >>
> >>
> >> unfortunately xml.out has 3 versions and is possible so one version
> >> should be taken elsewhere than my comp.
> >>
> >> please can me send your result xml.out file?
> >>
> >
> > looks like this case is without xml support so I can fix on my comp.
> >
> >
> fixed regress test

(I wouldn't have found that..)

I have three comments on the behavior and one on documentation.

1. Lack of syntax handling.

["'" [^'] "'"] is also a string literal, but getXPathToken is
forgetting that and applying default namespace mistakenly to the
astring content.

2. Additional comment might be good.

It might be better having additional description about default
namespace in the comment starts from "Namespace mappings are
passed as text[]" in xpth_internal().

3. Inconsistent behavior from named namespace.

| - function context, aliases are <emphasis>local</emphasis>).
| + function context, aliases are <emphasis>local</emphasis>). Default namespace has
| + empty name (empty string) and should be only one.

This works as the description, on the other hand the same
namespace prefix can be defined twice or more in the array and
the last one is in effect. I don't see a reason for
differenciating the default namespace case.

4. Comments on the documentation part.

# Even though I'm not sutable for commenting on wording...

| + Inside predicate literals <literal>and</literal>, <literal>or</literal>,
| + <literal>div</literal> and <literal>mod</literal> are used as keywords
| + (XPath operators) every time and default namespace are not applied there.

*I*'d like to have a comma between the predicate and literals,
and have a 'a' before prediate. Or 'Literals .. inside a
predicate' might be better?

'are used as keywords' might be better being 'are identifed as
keywords'?

Default namespace is applied to tag names except the listed
keywords even inside a predicate. So 'are not applied there'
might be better being 'are not applied to them'? Or 'are not
applied in the case'?

| + If you would to use these literals like tag names, then the default namespace
| + should not be used, and these literals should be explicitly
| + labeled.
| + </para>

Default namespace is not applied *only to* such keywords inside a
predicate. Even if an Xpath expression contains such a tag name,
default namespace still works for other tags. Does the following
make sense?

+ Use named namespace to qualify such tag names appear in an
+ XPath predicate.

===
After the aboves are addressed (even or rejected), I think I
don't have no additional comment.

- This current patch applies safely (with small shifts) on the
current master.

- The code looks fine for me.

- This patch translates the given XPath expression by prefixing
unprefixed tag names with a special namespace prefix only in
the case where default namespace is defined, so the existing
behavior is not affected.

- The syntax is existing but just not implemented so I don't
think no arguemnts needed here.

- It undocumentedly inhibits the usage of the namespace prefix
"pgdefnamespace.pgsqlxml.internal" but I believe no one can
notice that.

- The default-ns translator (xpath_parser.c) seems working
perfectly with some harmless exceptions.

(xpath specifications is here: https://www.w3.org/TR/1999/REC-xpath-19991116/)

Related unused features (and not documented?):
context variables ($n notations),
user-defined functions (or function names prefixed by a namespace prefix)

Newly documented behavior:
the default namespace isn't applied to and/or/div/mod.

- Dodumentation looks enough.

- Regression test doesn't cover the XPath syntax but it's not
viable. I am fine with the basic test cases added by the
current patch.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Haribabu Kommi 2018-01-23 07:23:35 Re: pgsql: Move handling of database properties from pg_dumpall into pg_dum
Previous Message Ashutosh Sharma 2018-01-23 06:55:31 Re: Query related to alter table ... attach partition