On 04/27/2011 05:30 PM, Noah Misch wrote:
>> I'm not sure what to do about the back branches and cases where data is
>> already in databases. This is fairly ugly. Suggestions welcome.
> We could provide a script in (or linked from) the release notes for testing the
> data in all your xml columns.
Yeah, we'll have to do something like that. What a blasted mess.
> To make things worse, the dump/reload problem seems to depend on your version
> of libxml2, or something. With git master, a CentOS 5 system with
> 2.6.26-188.8.131.52.el5_5.1 accepts the ^A byte, but an Ubuntu 8.04 LTS system with
> 2.6.31.dfsg-2ubuntu rejects it. Even with a patch like this, systems with a
> lenient libxml2 will be liable to store XML data that won't restore on a system
> with a strict libxml2. Perhaps we should emit a build-time warning if the local
> libxml2 is lenient?
No, I think we need to be strict ourselves.
>> + if (*p< '\x20')
> This needs to be an unsigned comparison. On my system, "char" is signed, so
> "SELECT xmlelement(name foo, null, E'\u0550')" fails incorrectly.
Good point. Perhaps we'd be better off using iscntrl(*p).
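To illustrate the signedness problem, here is a minimal sketch and not the actual patch. Both helper names are hypothetical. The first spells out the signed comparison explicitly to reproduce what happens on platforms where plain "char" is signed; the second casts to unsigned char before comparing. Note that iscntrl() also needs its argument to be representable as unsigned char, so the same cast would be required there.

```c
#include <stdbool.h>

/*
 * Hypothetical helper reproducing the bug: with a signed byte, any value
 * >= 0x80 (such as the UTF-8 lead byte 0xD5 of U+0550) is negative and so
 * compares as less than 0x20.
 */
static bool
naive_is_low_control_byte(signed char c)
{
    return c < '\x20';
}

/*
 * Hypothetical helper with the fix: cast to unsigned char first, so only
 * bytes 0x00..0x1F are reported.
 */
static bool
is_low_control_byte(char c)
{
    return (unsigned char) c < 0x20;
}
```

With the cast, the bytes of a multibyte UTF-8 sequence are no longer mistaken for control characters.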
> The XML character set forbids more than just control characters; see
> http://www.w3.org/TR/xml/#charsets. We also ought to reject, for example,
> "SELECT xmlelement(name foo, null, E'\ufffe')".
> Injecting the check here aids "xmlelement" and "xmlforest", but "xmlcomment"
> and "xmlpi" still let the invalid byte through. You can also still inject the
> byte into an attribute value via "xmlelement". I wonder if it wouldn't make
> more sense to just pass any XML that we generate from scratch through libxml2.
> There are a lot of holes to plug, otherwise.
Maybe there are, but I'd want lots of convincing that we should do that
at this stage. Maybe for 9.2. I think we can plug the holes fairly
simply for xmlpi and xmlcomment, and catch the attributes by moving this
check up into map_sql_value_to_xml_value().
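For reference, a check covering the whole XML 1.0 Char production (the #charsets section cited above), rather than only control characters, could look roughly like the sketch below. The function name is hypothetical, and it assumes the caller has already decoded the input into Unicode code points; it is not what map_sql_value_to_xml_value() currently does.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the code point matches the XML 1.0 Char
 * production:
 *   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
 * Assumes cp is an already-decoded Unicode code point, not a raw byte.
 */
static bool
is_valid_xml_char(uint32_t cp)
{
    return cp == 0x9 || cp == 0xA || cp == 0xD ||
        (cp >= 0x20 && cp <= 0xD7FF) ||
        (cp >= 0xE000 && cp <= 0xFFFD) ||
        (cp >= 0x10000 && cp <= 0x10FFFF);
}
```

This rejects ^A (U+0001) and U+FFFE while accepting U+0550. It also accepts tab, newline and carriage return, which a plain iscntrl() test would reject.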
This is a significant data integrity bug, much along the same lines as
the invalidly encoded data holes we plugged a release or two back. I'm
amazed we haven't hit it till now, but we're sure to see more of it -
XML use with Postgres is growing substantially, I believe.