Re: PostgreSQL vs SQL/XML Standards

From: Chapman Flack <chap(at)anastigmatix(dot)net>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Markus Winand <markus(dot)winand(at)winand(dot)at>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: PostgreSQL vs SQL/XML Standards
Date: 2019-01-26 04:03:11
Message-ID: 5C4BDBFF.6040905@anastigmatix.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 01/25/19 19:37, Chapman Flack wrote:
> There is still nothing in this patch set to address [1], though that
> also seems worth doing, perhaps in another patch, and probably not
> difficult, perhaps needing only a regex.

Heck, it could be even simpler than that. If some XML input has a DTD,
the attempt to parse it as 'content' with libxml is sure to fail early
in the parse (because that's where the DTD is found). If libxml reports
enough error detail to show it failed because of a DTD—and it appears to:

DETAIL: line 1: StartTag: invalid element name
<!DOCTYPE foo>

—then simply recognize that error and reparse as 'document' on the spot.
The early-failing first parse won't have cost much, and there is probably
next to nothing to gain by trying to be any more clever.

The one complication might that there seem to be versions of libxml that
report error detail differently (hence the variant regression test expected
files), so the code might have to recognize a couple different forms.

-Chap

> [1] https://www.postgresql.org/message-id/5BD1C44B.6040300%40anastigmatix.net

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Chapman Flack 2019-01-26 04:07:49 Re: PostgreSQL vs SQL/XML Standards
Previous Message Pavel Stehule 2019-01-26 03:53:27 Re: PostgreSQL vs SQL/XML Standards