Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)
Date: 2018-09-21 11:29:30
Views: Raw Message | Whole Thread | Download mbox
Lists: pgsql-hackers


út 18. 9. 2018 v 17:33 odesílatel Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> napsal:

> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> > [ xml-xpath-default-ns-7.patch ]
> At Andrew's prompting, I took a look over this patch. I don't know much
> of anything about XML, so I have no idea as to standards compliance here,
> but I do have some comments:
> * I'm fairly uncomfortable with the idea that we're going to maintain
> our own XPath parser. That seems like a recipe for lots of future
> work ... and the code is far too underdocumented for anyone to actually
> maintain it. (Case in point: _transformXPath's arguments are not
> documented at all, and in fact there's hardly a word about what it's
> even supposed to *do*.)

I understand, and I would be much more happy if xmllib2 supports this
feature. But the development of new feature of this lib was frozen - and
there are not any possibility.

On second hand the parser is very simple - tokenizer detects identifiers,
strings, numbers, and parser try to find unqualified names and prints
default namespace identifier before. It doesn't do more.

I renamed function _transformXPath to transform_xpath_recurse and I
descibed params

> * I think the business with "pgdefnamespace.pgsqlxml.internal" is just
> plain awful. It's a wart, and I don't think it's even saving you any
> code once you account for all the places where you have to throw errors
> for somebody trying to use that as a regular name. This should be done
> with out-of-band signaling if possible. The existing convention above
> this code is that a NULL pointer means a default namespace; can't that
> be continued throughout? If that's not practical, can you pick a string
> that simply can't syntactically be a namespace name? (In the SQL world,
> for example, an empty string is an illegal identifier so that that could
> be used for the purpose. But I don't know if that applies to XML.)
> Or perhaps you can build a random name that is chosen just to make it
> different from any of the listed namespaces? If none of those work,
> I think passing around an explicit "isdefault" boolean alongside the name
> would be better than having a reserved word.

The string used like default namespace name should be valid XML namespace
name, because it is injected to XPath expression and it is passed to
libxml2 as one namespace name. libxml2 requires not null valid value.

I changed it and now the namespace name is generated.

> * _transformXPath recurses, so doesn't it need check_stack_depth()?


> * I'm not especially in love with using function names that start
> with an underscore. I do not think that adds anything, and it's
> unlike the style in most places in PG.


> * This is a completely unhelpful error message:
> + if (!parser->buffer_is_empty)
> + elog(ERROR, "internal error");
> If you can't be bothered to write a message that says something
> useful, either drop the test or turn it into an Assert. I see some
> other internal errors that aren't up to project norms either.

use assert

> * Either get rid of the "elog(DEBUG1)"s, or greatly lower their
> message priority. They might've been appropriate for developing
> this patch, but they are not okay to commit that way.


> * Try to reduce the amount of random whitespace changes in the patch.

The formatting was really strange, fixed

> * .h files should never #include "postgres.h", per project policy.

I moved the code to xml.c like you propose

> * I'm not sure I'd bother with putting the new code into a separate
> file rather than cramming it into xml.c. The main reason why not
> is that you're going to get "empty translation unit" warnings from
> some compilers in builds without USE_LIBXML.
> * Documentation, comments, and error messages could all use some
> copy-editing by a native English speaker (you knew that of course).

I hope so some native speakers looks there.

Thank you for comments

Attached updated patch



> regards, tom lane

Attachment Content-Type Size
default_namespace-20180921.patch.gz application/gzip 9.3 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2018-09-21 11:31:57 Re: proposal: prefix function
Previous Message Etsuro Fujita 2018-09-21 11:03:01 Re: Problem while updating a foreign table pointing to a partitioned table on foreign server