Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)
Date: 2018-09-18 15:33:38
Message-ID: 25137.1537284818@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> [ xml-xpath-default-ns-7.patch ]

At Andrew's prompting, I took a look over this patch. I don't know much
of anything about XML, so I have no idea as to standards compliance here,
but I do have some comments:

* I'm fairly uncomfortable with the idea that we're going to maintain
our own XPath parser. That seems like a recipe for lots of future
work ... and the code is far too underdocumented for anyone to actually
maintain it. (Case in point: _transformXPath's arguments are not
documented at all, and in fact there's hardly a word about what it's
even supposed to *do*.)

* I think the business with "pgdefnamespace.pgsqlxml.internal" is just
plain awful. It's a wart, and I don't think it's even saving you any
code once you account for all the places where you have to throw errors
for somebody trying to use that as a regular name. This should be done
with out-of-band signaling if possible. The existing convention above
this code is that a NULL pointer means a default namespace; can't that
be continued throughout? If that's not practical, can you pick a string
that simply can't syntactically be a namespace name? (In the SQL world,
for example, an empty string is an illegal identifier so that that could
be used for the purpose. But I don't know if that applies to XML.)
Or perhaps you can build a random name that is chosen just to make it
different from any of the listed namespaces? If none of those work,
I think passing around an explicit "isdefault" boolean alongside the name
would be better than having a reserved word.

* _transformXPath recurses, so doesn't it need check_stack_depth()?

* I'm not especially in love with using function names that start
with an underscore. I do not think that adds anything, and it's
unlike the style in most places in PG.

* This is a completely unhelpful error message:
+ if (!parser->buffer_is_empty)
+ elog(ERROR, "internal error");
If you can't be bothered to write a message that says something
useful, either drop the test or turn it into an Assert. I see some
other internal errors that aren't up to project norms either.

* Either get rid of the "elog(DEBUG1)"s, or greatly lower their
message priority. They might've been appropriate for developing
this patch, but they are not okay to commit that way.

* Try to reduce the amount of random whitespace changes in the patch.

* .h files should never #include "postgres.h", per project policy.

* I'm not sure I'd bother with putting the new code into a separate
file rather than cramming it into xml.c. The main reason why not
is that you're going to get "empty translation unit" warnings from
some compilers in builds without USE_LIBXML.

* Documentation, comments, and error messages could all use some
copy-editing by a native English speaker (you knew that of course).

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-09-18 15:38:57 Re: pgsql: Allow concurrent-safe open() and fopen() in frontend code for Wi
Previous Message Laurenz Albe 2018-09-18 15:28:53 Re: pgsql: Allow concurrent-safe open() and fopen() in frontend code for Wi