Re: Fix XML handling with DOCTYPE

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ryan Lambert <ryan(at)rustprooflabs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Fix XML handling with DOCTYPE
Date: 2019-03-16 20:42:45
Message-ID: 22865.1552768965@sss.pgh.pa.us
Lists: pgsql-hackers

Ryan Lambert <ryan(at)rustprooflabs(dot)com> writes:
> I'm investigating the issue I reported here:
> https://www.postgresql.org/message-id/flat/153478795159.1302.9617586466368699403%40wrigleys.postgresql.org
> I'd like to work on a patch to address this issue and make it work as
> advertised.

Good idea, because it doesn't seem like anybody else cares ...

> I see xmlParseBalancedChunkMemoryRecover that might provide the
> functionality needed.

TBH, our experience with libxml has not been so positive that I'd think
adding dependencies on new parts of its API would be a good plan.

Experimenting with different inputs, it seems like removing the
"<!DOCTYPE ...>" tag is enough to make it work. So what I'm wondering
about is writing something like parse_xml_decl() to skip over that.

Bear in mind though that I know next to zip about XML. There may be
some good reason why we don't want to strip off the !DOCTYPE part
from what libxml sees.
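
For illustration, a rough sketch of what such a skip-over routine might
look like. This is not PostgreSQL code: the name skip_doctype() and its
contract are hypothetical, and it assumes any XML declaration has already
been consumed (as parse_xml_decl() would do). It only tracks quoted
strings and the "[ ... ]" internal subset, so comments or processing
instructions before or inside the DOCTYPE are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration, not an existing
 * PostgreSQL or libxml function): return a pointer just past a leading
 * "<!DOCTYPE ...>" declaration, or the input pointer unchanged if there
 * is no such declaration or it is unterminated.
 */
const char *
skip_doctype(const char *str)
{
    const char *p = str;
    int         in_subset = 0;  /* inside the "[ ... ]" internal subset? */
    char        quote = '\0';   /* currently open quote character, if any */

    assert(str != NULL);

    /* Allow whitespace before the declaration. */
    while (*p == ' ' || *p == '\t' || *p == '\r' || *p == '\n')
        p++;

    if (strncmp(p, "<!DOCTYPE", 9) != 0)
        return str;             /* no DOCTYPE here, nothing to skip */
    p += 9;

    for (; *p != '\0'; p++)
    {
        if (quote != '\0')
        {
            /* Inside a quoted literal: only the matching quote ends it. */
            if (*p == quote)
                quote = '\0';
        }
        else if (*p == '"' || *p == '\'')
            quote = *p;
        else if (*p == '[')
            in_subset = 1;
        else if (*p == ']')
            in_subset = 0;
        else if (*p == '>' && !in_subset)
            return p + 1;       /* first character after the DOCTYPE */
    }

    return str;                 /* unterminated: leave the input alone */
}
```

The caller would hand the returned pointer to the parser in place of the
original string. Returning the input untouched on anything unexpected
keeps the sketch conservative: malformed input still reaches libxml and
gets reported there.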

regards, tom lane