BUG #18274: Error 'invalid XML content'

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: d(dot)koval(at)postgrespro(dot)ru
Subject: BUG #18274: Error 'invalid XML content'
Date: 2024-01-06 22:20:36
Message-ID: 18274-98d16bc03520665f@postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 18274
Logged by: Dmitry Koval
Email address: d(dot)koval(at)postgrespro(dot)ru
PostgreSQL version: 16.1
Operating system: Ubuntu 22.04
Description:

Hello!
The 'invalid XML content' error is easy to trigger with non-ASCII
(multibyte UTF-8) characters:

>select length((repeat('ї', 10 * 1000 * 1000))::xml::text::bytea);
ERROR: invalid XML content
DETAIL: line 1: xmlSAX2Characters: huge text node
їїїїїїїїїїїїїїїїїїїїїїїїїїїїїїїїїїїїїїїї

The error is not directly related to UTF-8 or to input length alone, since
this ASCII-only query, with ten times as many characters, is processed
without an error:

>select length((repeat('a', 100 * 1000 * 1000))::xml::text::bytea);
length
-----------
100000000
(1 row)
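
For reference, the byte sizes of the two inputs can be checked outside the
database. This is only a minimal sketch, assuming UTF-8 encoding on the
server; the helper name utf8_size is hypothetical and is not part of
PostgreSQL or libxml2:

```python
def utf8_size(ch: str, count: int) -> int:
    """Return the number of bytes needed for `count` copies of `ch` in UTF-8."""
    # Multiply instead of building the string, to avoid allocating 100 MB.
    return len(ch.encode("utf-8")) * count


# Input of the failing query: 10 million copies of a two-byte character.
failing = utf8_size("ї", 10 * 1000 * 1000)

# Input of the passing query: 100 million copies of a one-byte character.
passing = utf8_size("a", 100 * 1000 * 1000)

print(failing, passing)
```

The failing input is the smaller of the two in bytes, which suggests that
total input size by itself does not explain the error.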

The problem lies in the libxml2 function xmlParseBalancedChunkMemory, which
PostgreSQL uses to parse XML content and which does not support the
XML_PARSE_HUGE flag.
There has been an attempt to get this fixed upstream [1].
Apparently it was unsuccessful, because the libxml2 maintainers declined to
change the xmlParseBalancedChunkMemory function.

I'd like to know what the community's opinion is regarding this error:
1) the error is correct and does not need to be corrected;
2) corrections should be made in the libxml2 library;
3) corrections should be made in PostgreSQL (for example, by no longer using
the xmlParseBalancedChunkMemory function, or by some other change);
4) ...?

[1] https://gitlab.gnome.org/GNOME/libxml2/-/issues/167
----
With best regards,
Dmitry Koval

Postgres Professional: http://postgrespro.com
