contrib/xml2 vs core xml, more issues

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: contrib/xml2 vs core xml, more issues
Date: 2010-03-01 01:02:51
Message-ID: 18795.1267405371@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Some other things struck me on further code-reading. These are not
problems if core xml functions and contrib/xml2 functions are used
independently, but they would be issues if one is called within the
other --- and both modules have functions that invoke arbitrary SQL
inside their execution, so that's quite possible.

1. contrib/xml2 calls xmlCleanupParser() all over the place. This cuts
any outer usage of the libxml parser off at the knees. I think we
should just remove these calls, and hope that Perl and other possible
users of libxml2 are not dumb enough to call it either. The libxml2
documentation deprecates using this function at all, except possibly
just before exit() if you want to avoid valgrind memory leak warnings.

2. contrib/xml2 calls xmlSetGenericErrorFunc with its own error handler,
possibly replacing the core's as the active handler, or vice versa.
This wouldn't be fatal except that xml2's error handler will remain
active after it returns, and that handler is very wobbly for contexts
other than the one it's designed to be called in --- in particular it
supposes that its static errbuf string is still valid, even though that
string is allocated in a short-lived memory context. It's not too hard
to envision code paths in which this results in a core dump.

What I'm inclined to do to fix #2 is export the core's xml_init() and
xml_ereport() functions so that contrib/xml2 can make use of the core's
much-more-robust libxml error handler. This fixes the problem as far
as those two modules are concerned. If Perl or some other third-party
module also calls xmlSetGenericErrorFunc, what would happen is that
potentially our error handler would eat libxml messages intended for the
other or vice versa. As long as the error handlers are robust and won't
crash when called in unexpected contexts, the worst possible consequence
is not seeing some useful libxml detail about an error, so I think this
is tolerable.

(The more I learn about it, the less impressed I am with the design of
libxml2's API, but I suppose we have no control over that.)

Comments?

regards, tom lane

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-03-01 02:27:53 Could we do pgindent on just utils/adt/xml.c in the 8.3 branch?
Previous Message David Fetter 2010-03-01 00:08:28 Re: contrib/xml2 vs core xml in 8.3