From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Jim Jones <jim(dot)jones(at)uni-muenster(dot)de> |
Cc: | Peter Smith <smithpb2250(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Borodin <amborodin86(at)gmail(dot)com> |
Subject: | Re: [PATCH] Add pretty-printed XML output option |
Date: | 2023-03-14 17:40:25 |
Message-ID: | 2752578.1678815625@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Jim Jones <jim(dot)jones(at)uni-muenster(dot)de> writes:
> [ v22-0001-Add-pretty-printed-XML-output-option.patch ]
I poked at this for a while and ran into a problem that I'm not sure
how to solve: it misbehaves for input with embedded DOCTYPE.

regression=# SELECT xmlserialize(DOCUMENT '<!DOCTYPE a><a/>' as text indent);
 xmlserialize
--------------
 <!DOCTYPE a>+
 <a></a>     +

(1 row)

regression=# SELECT xmlserialize(CONTENT '<!DOCTYPE a><a/>' as text indent);
 xmlserialize
--------------

(1 row)

The bad result for CONTENT is because xml_parse() decides to
parse_as_document, but xmlserialize_indent has no idea that happened
and tries to use the content_nodes list anyway. I don't especially
care for the laissez-faire "maybe we'll set *content_nodes and maybe
we won't" API you adopted for xml_parse, which seems to be contributing
to the mess. We could pass back more info so that xmlserialize_indent
knows what really happened. However, that won't fix the bogus output
for the DOCUMENT case. Are we perhaps passing incorrect flags to
xmlSaveToBuffer?
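
To make the "pass back more info" idea concrete, here is a minimal
standalone sketch in C. It is not the patch's code and does not use
libxml2: every name in it (ToyXmlOption, ToyParseResult, toy_xml_parse,
toy_choose_serializer) is hypothetical, and the "starts with <!DOCTYPE"
test is only an assumed stand-in for whatever xml_parse really checks.
The point is just that the parser reports how it actually parsed, so the
caller never has to guess whether the content-node list was filled in.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the PostgreSQL definitions. */
typedef enum
{
    TOY_XMLOPTION_DOCUMENT,
    TOY_XMLOPTION_CONTENT
} ToyXmlOption;

/* What the parser reports back to its caller. */
typedef struct ToyParseResult
{
    bool parsed_as_document;    /* how the input was really parsed */
    bool content_nodes_set;     /* true only if a node list was built */
} ToyParseResult;

/*
 * Toy parser.  Assumption: input that starts with a DOCTYPE is treated as
 * a document even when CONTENT was requested, mirroring the behavior
 * described in the message above.
 */
static ToyParseResult
toy_xml_parse(const char *input, ToyXmlOption requested)
{
    ToyParseResult res;
    bool        has_doctype;

    assert(input != NULL);
    has_doctype = (strncmp(input, "<!DOCTYPE", 9) == 0);

    res.parsed_as_document =
        (requested == TOY_XMLOPTION_DOCUMENT) || has_doctype;
    res.content_nodes_set = !res.parsed_as_document;
    return res;
}

/*
 * Caller side: choose the serialization path from what the parser
 * reported, not from what was requested.  Returns a label naming the path.
 */
static const char *
toy_choose_serializer(const char *input, ToyXmlOption requested)
{
    ToyParseResult res = toy_xml_parse(input, requested);

    return res.parsed_as_document ? "document" : "content_nodes";
}
```

With this shape, the CONTENT '<!DOCTYPE a><a/>' case takes the document
path instead of walking an unset node list. As noted above, that alone
says nothing about the output produced for the DOCUMENT case.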
regards, tom lane