Re: WIP - xmlvalidate implementation from TODO list

From: Kirill Reshke <reshkekirill(at)gmail(dot)com>
To: Marcos Magueta <maguetamarcos(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIP - xmlvalidate implementation from TODO list
Date: 2026-01-01 08:25:49
Message-ID: CALdSSPhFzYCp=Aa8AAboz6TQaTmjWciQGfrEJQeOOO+0pD1GGw@mail.gmail.com
Lists: pgsql-hackers

On Thu, 1 Jan 2026, 01:27 Marcos Magueta, <maguetamarcos(at)gmail(dot)com> wrote:

> Hello again!
>
> Is there any interest in this? I understand PostgreSQL has bigger fish to
> fry, but I would like to at least know; in case this was just forgotten.
>
> Regards!
>
> On Fri, 19 Dec 2025 at 00:25, Marcos Magueta <
> maguetamarcos(at)gmail(dot)com> wrote:
>
>> Hello again!
>>
>> I took some time to actually finish this feature. I think the answers
>> for the previous questions are now clearer. I checked the
>> initialization and the protections are indeed in place since commit
>> a4b0c0aaf093a015bebe83a24c183e10a66c8c39, which specifically states:
>>
>> > Prevent access to external files/URLs via XML entity references.
>>
>> > xml_parse() would attempt to fetch external files or URLs as needed to
>> > resolve DTD and entity references in an XML value, thus allowing
>> > unprivileged database users to attempt to fetch data with the privileges
>> > of the database server. While the external data wouldn't get returned
>> > directly to the user, portions of it could be exposed in error messages
>> > if the data didn't parse as valid XML; and in any case the mere ability
>> > to check existence of a file might be useful to an attacker.
>> >
>> > The ideal solution to this would still allow fetching of references that
>> > are listed in the host system's XML catalogs, so that documents can be
>> > validated according to installed DTDs. However, doing that with the
>> > available libxml2 APIs appears complex and error-prone, so we're not going
>> > to risk it in a security patch that necessarily hasn't gotten wide review.
>> > So this patch merely shuts off all access, causing any external fetch to
>> > silently expand to an empty string. A future patch may improve this.
>>
>> With that, the obvious affordance on the xmlvalidate implementation
>> was to not rely on external schema sources on the host
>> catalog. Therefore the implementation relies solely on expressions
>> that necessarily evaluate to a schema in plain text.
>>
>> I added the requested documentation and a bunch of tests for each
>> scenario. I would appreciate another round of reviews whenever someone
>> has the time and patience.
>>
>> At last, to nourish the curiosity: I had issues with make check, as
>> stated above on the e-mail thread. These got resolved when I changed
>> `execl` to `execlp` on `pg_regress.c`. I of course did not commit
>> such, but more people I know have had the very same issue while
>> relying on immutable package managers.
>>
>

Hi!
First of all, please do not top-post 🙏. Use bottom-posting.

About general interest in the feature: I suspect that we as a community
are generally interested in implementing items from the TODO list. This
feature also increases SQL standard compatibility. But I am not a big
SQL/XML user myself, so I can only give a limited review here. I also did
not have much time last month. I will try to find the cycles to take
another look at this.
