Re: Regression with large XML data input

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Erik Wienhold <ewie(at)ewie(dot)name>
Subject: Re: Regression with large XML data input
Date: 2025-07-24 03:28:38
Message-ID: 1569825.1753327718@sss.pgh.pa.us
Lists: pgsql-hackers

Michael Paquier <michael(at)paquier(dot)xyz> writes:
> A customer has reported a regression with the parsing of rather large
> XML data, introduced by the set of backpatches done with f68d6aabb7e2
> & friends.

Bleah.

> Switching back to the previous code, where we rely on
> xmlParseBalancedChunkMemory() fixes the issue.

Yeah, just reverting these commits might be an acceptable answer,
since the main point was to work around a bleeding-edge bug:

>> * Early 2.13.x releases of libxml2 contain a bug that causes
>> xmlParseBalancedChunkMemory to return the wrong status value in some
>> cases. This breaks our regression tests. While that bug is now fixed
>> upstream and will probably never be seen in any production-oriented
>> distro, it is currently a problem on some more-bleeding-edge-friendly
>> platforms.

Presumably that problem is now gone, a year later. The other point
about

>> * xmlParseBalancedChunkMemory is considered to depend on libxml2's
>> semi-deprecated SAX1 APIs, and will go away when and if they do.

is still hypothetical I think. But we might want to keep this bit:

>> While here, avoid allocating an xmlParserCtxt in DOCUMENT parse mode,
>> since that code path is not going to use it.
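
For reference, here is a minimal standalone sketch (generic libxml2
usage, not PostgreSQL's xml_parse(), and the fragment string is made
up) of the xmlParseBalancedChunkMemory() call the pre-f68d6aabb7e2
code path relied on for CONTENT parsing.  It parses a well-balanced
fragment with possibly several top-level nodes and returns 0 on
success; build with xml2-config --cflags --libs.

    #include <stdio.h>
    #include <libxml/parser.h>
    #include <libxml/tree.h>

    int
    main(void)
    {
        const xmlChar *fragment = (const xmlChar *) "<a>1</a><b>2</b>";
        xmlDocPtr   doc;
        xmlNodePtr  nodes = NULL;
        int         rc;

        LIBXML_TEST_VERSION;

        /* The fragment nominally pertains to this otherwise-empty doc. */
        doc = xmlNewDoc((const xmlChar *) "1.0");

        /* sax = NULL, user_data = NULL: default handling, just hand back
         * the parsed node list; returns 0 if the chunk is well balanced. */
        rc = xmlParseBalancedChunkMemory(doc, NULL, NULL, 0, fragment, &nodes);
        if (rc == 0)
        {
            printf("parsed OK, first node: %s\n",
                   nodes ? (const char *) nodes->name : "(none)");
            xmlFreeNodeList(nodes);
        }
        else
            printf("parse failed, error code %d\n", rc);

        xmlFreeDoc(doc);
        xmlCleanupParser();
        return 0;
    }

Note that this call takes no xmlParserCtxt at all, which is part of
why it is tied to the older SAX1-flavored entry points mentioned above.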

regards, tom lane
