Re: XML element with special characters can be created, serialized, but not deserialized

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Sergiu Ignat <sergiu(at)bitsoftware(dot)ro>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: XML element with special characters can be created, serialized, but not deserialized
Date: 2023-05-16 15:21:00
Message-ID: 2650678.1684250460@sss.pgh.pa.us
Lists: pgsql-bugs

Sergiu Ignat <sergiu(at)bitsoftware(dot)ro> writes:
> I am using PostgreSQL 13.8 and I think that I found an issue with XML
> serialization and deserialization.

Hmm. The root cause here seems to be that escape_xml() thinks it
doesn't need to escape ASCII control characters, other than CR (\r).
Which is a bit backwards, because after some googling I conclude that
XML 1.1 requires all C0 and C1 control characters to be represented as
numeric escapes *except* CR, LF, and TAB [1].
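
A minimal sketch of the symptom in the subject line, using Python's standard xml.etree.ElementTree as a stand-in (an assumption for illustration; PostgreSQL itself parses XML with libxml2, and this is not its code). A raw control character passes through serialization untouched, but the resulting text is then rejected by the parser:

```python
import xml.etree.ElementTree as ET

# Build an element whose text contains a raw C0 control character (U+0001).
elem = ET.Element("a")
elem.text = "x\x01y"

# Serialization succeeds: only markup characters are escaped, so the
# control character is written out as-is.
serialized = ET.tostring(elem, encoding="unicode")

# Deserialization fails: a raw U+0001 is not a legal character in XML text.
try:
    ET.fromstring(serialized)
    parsed_ok = True
except ET.ParseError:
    parsed_ok = False
```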

What we probably ought to do is escape all except LF and TAB.
However, I'm a bit hesitant to back-patch such a behavioral change.
Maybe change this in HEAD (v16) only?
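
The proposed rule could be sketched roughly as follows. This is a Python illustration only, not the actual C implementation of escape_xml(). The function name escape_xml_sketch, the set of named entities, and the choice to cover both the C0 and C1 ranges are assumptions made for the example:

```python
def escape_xml_sketch(text: str) -> str:
    """Hypothetical helper: escape markup characters and all control
    characters, leaving only LF and TAB unescaped."""
    named = {"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;"}
    keep_raw = {"\n", "\t"}  # the two control characters left as-is
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in named:
            out.append(named[ch])
        # C0 range (U+0001..U+001F) and C1 range (U+0080..U+009F).
        # U+0000 is omitted because XML does not allow it in any form.
        elif ch not in keep_raw and (0x01 <= cp <= 0x1F or 0x80 <= cp <= 0x9F):
            out.append(f"&#x{cp:02x};")
        else:
            out.append(ch)
    return "".join(out)
```

With this rule, CR becomes a numeric escape like every other control character, while LF and TAB pass through unchanged.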

regards, tom lane

[1] https://www.w3.org/International/questions/qa-controls
