Re: proposal - Default namespaces for XPath expressions (PostgreSQL 11)

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: pavel(dot)stehule(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: proposal - Default namespaces for XPath expressions (PostgreSQL 11)
Date: 2017-10-03 06:16:49
Message-ID: 20171003.151649.44255529.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

At Mon, 2 Oct 2017 12:43:19 +0200, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote in <CAFj8pRCD8=AzbRJQ2_2rp3+uzade9GHbm0DAF3-t__yVtzo2cA(at)mail(dot)gmail(dot)com>
> > Sorry, I forgot to care about that. (And the definition of
> > namespace array is of course fabricated by me). I'd like to leave
> > this to committers. Anyway it is working but the syntax (or
> > whether it is acceptable) is still arguable.
> >
> > SELECT xpath('/a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
> >              ARRAY[ARRAY['', 'http://example.com']]);
> > | xpath
> > | --------
> > | {test}
> > | (1 row)
> >
> >
> > The internal name is properly rejected, but the current internal
> > name (pgdefnamespace.pgsqlxml.internal) seems a bit too long. We
> > already reserve some short names and reject them as
> > user-defined names. Wouldn't just 'pgsqlxml' work?
> >
>
> libxml2 trims names to 100 bytes, so
> pgdefnamespace.pgsqlxml.internal
> is safe from this perspective.
>
> I would like to decrease the risk of a possible collision, so a longer
> string is better. Maybe "pgsqlxml.internal" is good enough; I have no idea.
> But if this string is ever printed somewhere, then
> "pgdefnamespace.pgsqlxml.internal" has clear semantics, and that is the
> reason why I prefer it. PostgreSQL uses 63-byte names, and this string
> fits within that limit too.

Ok, I'm fine with that.
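
Just to put numbers on it, here is a throwaway check in Python. The two
limits are simply the figures quoted above, not values I verified in the
libxml2 or PostgreSQL sources, and the variable names are mine.

```python
# Length check for the proposed internal namespace name.
INTERNAL_NAME = "pgdefnamespace.pgsqlxml.internal"
LIBXML2_LIMIT = 100  # bytes, figure quoted above (assumption)
PG_NAME_LIMIT = 63   # bytes, figure quoted above (assumption)

name_len = len(INTERNAL_NAME.encode("utf-8"))
fits = name_len <= PG_NAME_LIMIT <= LIBXML2_LIMIT

print(name_len, fits)  # 32 True
```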

> > The default namespace is now correctly applied to bare
> > attribute names.
> >
> > > updated doc,
> > > fixed all variants of expected result test file
> >
> > Sorry for commenting piecemeal, but I found another misbehavior.
> >
> > create table t1 (id int, doc xml);
> > insert into t1
> > values
> > (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
> > select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x),
> > '/x:rows/x:row' passing t1.doc columns data int PATH
> > 'child::x:a[1][attribute::hoge="haha"]') as x;
> > | data
> > | ------
> > | 50
> >
> > but the following fails.
> >
> > select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'),
> > '/rows/row' passing t1.doc columns data int PATH
> > 'child::a[1][attribute::hoge="haha"]') as x;
> > | data
> > | ------
> > |
> > | (1 row)
> >
> > Perhaps child::a is not prefixed by the transformation.
> >
> > XPath might be complex enough that it's worth switching to a
> > yacc/lex based transformer that is formally verifiable and won't
> > need a bunch of cryptic tests that in the end cannot prove
> > completeness. synchronous_standby_names is far simpler than XPath
> > but uses a yacc/lex parser.
> >
>
> I don't think so (not yet). It is a simple state machine now, and once the
> code is stable, it will not need to be modified.

Hmm. Ok, agreed. I didn't mean the current shape ought to be
changed.
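
For reference, the case that fails above (a name test following an
explicit axis such as child::) can be sketched as follows. This is only a
minimal illustration in Python of the kind of prefixing I expect, not the
patch's C code. The function name add_default_prefix and the token
classes are hypothetical, it handles ASCII names only, and it does not
claim to cover the whole XPath grammar.

```python
import re

# Very small XPath tokenizer: strings, numbers, names, everything else.
TOKEN = re.compile(r"""
    (?P<string>"[^"]*"|'[^']*')
  | (?P<number>\d+(?:\.\d*)?|\.\d+)
  | (?P<name>[A-Za-z_][\w.-]*)
  | (?P<other>::|//|\.\.|!=|<=|>=|\s+|.)
""", re.VERBOSE)

OPERATOR_NAMES = {"and", "or", "div", "mod"}
UNPREFIXED_AXES = {"attribute", "namespace"}  # names here take no default ns


def add_default_prefix(xpath, prefix):
    """Prefix bare element name tests in xpath with prefix (sketch only)."""
    tokens = [(m.lastgroup, m.group()) for m in TOKEN.finditer(xpath)]
    # Positions of non-whitespace tokens, so neighbours are easy to find.
    sig = [i for i, (_, text) in enumerate(tokens) if not text.isspace()]

    def text_at(pos):
        return tokens[sig[pos]][1] if 0 <= pos < len(sig) else ""

    def kind_at(pos):
        return tokens[sig[pos]][0] if 0 <= pos < len(sig) else None

    out = [text for _, text in tokens]
    for pos, i in enumerate(sig):
        kind, text = tokens[i]
        if kind != "name":
            continue
        prev, nxt = text_at(pos - 1), text_at(pos + 1)
        if nxt in ("::", "(", ":"):    # axis name, function/node test, prefix
            continue
        if prev in (":", "@", "$"):    # already prefixed, attribute, variable
            continue
        if prev == "::" and text_at(pos - 2) in UNPREFIXED_AXES:
            continue
        if text in OPERATOR_NAMES and (
                kind_at(pos - 1) in ("name", "number", "string")
                or prev in (")", "]", ".", "..")):
            continue                   # "and", "or", ... used as operators
        out[i] = prefix + ":" + text
    return "".join(out)


print(add_default_prefix('child::a[1][attribute::hoge="haha"]', "x"))
# child::x:a[1][attribute::hoge="haha"]
```

With this sketch, '/rows/row' becomes '/x:rows/x:row', which is the
explicitly prefixed form that returns 50 in the first query above.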

> Thank you for the comments, I'll look into it

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
