| From: | Jim Jones <jim(dot)jones(at)uni-muenster(dot)de> |
|---|---|
| To: | Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Chapman Flack <chap(at)anastigmatix(dot)net>, vignesh C <vignesh21(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org> |
| Subject: | Re: [PATCH] Add CANONICAL option to xmlserialize |
| Date: | 2026-05-26 09:46:44 |
| Message-ID: | 3b0b381b-a89f-4bb1-a1d3-25b2ba7b8907@uni-muenster.de |
| Lists: | pgsql-hackers |
On 30/03/2026 13:27, Jim Jones wrote:
> On 30/03/2026 11:44, Andrew Dunstan wrote:
>> I note that your function returns xml, whereas Tom's suggestion was for
>> a function returning text. I don't think there was any discussion on the
>> point.
> Indeed, there was no discussion regarding the return type.
>
> My rationale for keeping it as xml was: the output is xml, callers can
> immediately use the xml without casting, and nearly all other xml*
> functions return xml. Is there a direct advantage of having this
> function return text?

After some consideration, I think returning text instead of xml is
the better choice here. The canonical form is a serialization
artifact rather than a document intended for further XML processing.
More practically, since xml has no = operator, the primary use case of
comparing documents requires a cast anyway, so returning text is
closer to real-world usage.
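
To illustrate why the canonical form is mainly useful as text for comparison, here is a minimal sketch outside PostgreSQL. It is not part of the patch: it uses Python's standard-library `xml.etree.ElementTree.canonicalize`, which implements C14N 2.0 rather than the C14N 1.1 produced by libxml2 in the patch, and the sample documents are made up for the example.

```python
from xml.etree.ElementTree import canonicalize

# Two serializations of the same logical document (hypothetical sample data):
# attribute order, quoting style, spacing inside the tag, and the
# empty-element syntax all differ.
doc_a = '<root b="2" a="1"><item/></root>'
doc_b = "<root a='1'   b='2'><item></item></root>"

# canonicalize() returns the canonical form as a str when no output
# file is given. Note: this is C14N 2.0, not C14N 1.1.
canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)

# A plain string comparison of the raw input fails ...
raw_equal = doc_a == doc_b
# ... while the canonical text forms compare equal.
canonical_equal = canon_a == canon_b

print(raw_equal, canonical_equal)
```

The comparison happens on plain strings, which mirrors the point above: once a document is canonicalized, the caller typically wants a text value to compare or hash, not another XML value.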

I also noticed a correctness issue related to the database encoding: the
C14N 1.1 specification mandates UTF-8 output, so xmlC14NDocDumpMemory
always returns UTF-8.[1] I added a pg_any_to_server call to convert the
output to the server encoding before returning it.
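
The effect of skipping that conversion can be sketched as follows. This is an illustration only, not the patch code: `to_server_encoding` is a hypothetical stand-in for what pg_any_to_server does in the backend, and LATIN1 is an assumed server encoding chosen for the example.

```python
# Canonical output as a C14N implementation emits it: UTF-8 bytes.
# "\u00fc" is the letter u with diaeresis, as in "Muenster".
canonical_utf8 = "<name>M\u00fcnster</name>".encode("utf-8")

# Assumed server encoding for this sketch (hypothetical choice).
server_encoding = "latin-1"


def to_server_encoding(data: bytes, encoding: str) -> bytes:
    """Hypothetical helper: re-encode UTF-8 bytes into the server encoding."""
    return data.decode("utf-8").encode(encoding)


# With conversion, the bytes are valid in the server encoding.
converted = to_server_encoding(canonical_utf8, server_encoding)

# Without conversion, the UTF-8 bytes are misread as LATIN1 and the
# two-byte sequence for the non-ASCII character turns into two characters.
misread = canonical_utf8.decode(server_encoding)

print(converted.decode(server_encoding))
print(misread)
```

On a UTF8 database the conversion is a no-op, which is presumably why the problem only shows up with other server encodings.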
Best, Jim

[1] https://github.com/GNOME/libxml2/blob/174201f747da93167354287a7599d0b385552599/c14n.c#L1964

| Attachment | Content-Type | Size |
|---|---|---|
| v25-0001-Add-xmlcanonicalize-function.patch | text/x-patch | 32.5 KB |