Re: XMLNAMESPACES (was Re: Clarification of nodeToString() use cases)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: XMLNAMESPACES (was Re: Clarification of nodeToString() use cases)
Date: 2018-09-16 18:05:27
Message-ID: 14106.1537121127@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru> writes:
>> The reason is: parse tree node for XMLNAMESPACES clause has null pointer
>> in the case of DEFAULT namespace (the pointer will be initialized at
>> executor on the first call).

> My immediate reaction is that somebody made a bad decision about how
> to represent XMLNAMESPACES clauses. It's not quite too late to fix
> that; but could you offer a concrete example that triggers this,
> so it's clear what case we're talking about?

I'd thought you were talking about the raw-parsetree representation,
but after poking at it, I see that the issue is with the post-parse-
analysis representation. That makes this not just a minor impediment
to debugging, but an easily reachable server crash, for example

regression=# CREATE VIEW bogus AS
SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
'/rows/row'
PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>' COLUMNS a int PATH 'a');
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Aside from stored rules not working, we'd also have a problem with
passing such parsetrees to parallel workers (though I'm not sure
whether RangeTableFunc is considered parallelizable). And you can
make it crash by turning on debug_print_parse, too.

The reason why we've not heard this reported is doubtless that
DEFAULT isn't really supported: if you get as far as execution,
you get

regression=# SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
'/rows/row'
PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
COLUMNS a int PATH 'a');
ERROR: DEFAULT namespace is not supported

So I think that (a) we ought to fix the parsetree representation,
perhaps as attached, and (b) we ought to backpatch into v10 where this
syntax was introduced. Normally, this would be considered a change
of stored rules, forcing a catversion bump and hence non-backpatchable.
However, since existing releases would crash trying to store a rule
containing this construct, there can't be any stored rules out there
containing it, making the incompatibility moot.

The change the attached patch makes is to represent a DEFAULT namespace
as a NULL list entry, rather than a T_String Value node containing a
null. This approach does work for all backend/nodes/ operations, but
it could be argued that it's still a crash hazard for unsuspecting code.
We could do something else, like use a T_Null Value to represent DEFAULT,
but I'm doubtful that that's really much better. A NULL entry has the
advantage (or at least I'd consider it one) of being a certain crash
rather than a probabilistic crash for any uncorrected code accessing
the list item. Thoughts?

regards, tom lane

Attachment Content-Type Size
fix-default-namespace-representation-1.patch text/x-diff 4.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Douglas Doole 2018-09-16 18:12:51 Re: Collation versioning
Previous Message Dmitri Maziuk 2018-09-16 16:28:39 Re: Code of Conduct plan