| From: | Marcos Magueta <maguetamarcos(at)gmail(dot)com> |
|---|---|
| To: | Kirill Reshke <reshkekirill(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: WIP - xmlvalidate implementation from TODO list |
| Date: | 2026-01-02 18:07:24 |
| Message-ID: | 89DE974B-F318-4D0A-A60B-51EDE84054E2@gmail.com |
| Lists: | pgsql-hackers |
> On 1 Jan 2026, at 05:25, Kirill Reshke <reshkekirill(at)gmail(dot)com> wrote:
>
>
>
> On Thu, 1 Jan 2026, 01:27 Marcos Magueta, <maguetamarcos(at)gmail(dot)com> wrote:
>> Hello again!
>>
>> Is there any interest in this? I understand PostgreSQL has bigger fish to fry, but I would like to at least know; in case this was just forgotten.
>>
>> Regards!
>>
>> On Fri, 19 Dec 2025 at 00:25, Marcos Magueta <maguetamarcos(at)gmail(dot)com> wrote:
>>> Hello again!
>>>
>>> I took some time to actually finish this feature. I think the answers
>>> for the previous questions are now clearer. I checked the
>>> initialization and the protections are indeed in place since commit
>>> a4b0c0aaf093a015bebe83a24c183e10a66c8c39, which specifically states:
>>>
>>> > Prevent access to external files/URLs via XML entity references.
>>>
>>> > xml_parse() would attempt to fetch external files or URLs as needed to
>>> > resolve DTD and entity references in an XML value, thus allowing
>>> > unprivileged database users to attempt to fetch data with the privileges
>>> > of the database server. While the external data wouldn't get returned
>>> > directly to the user, portions of it could be exposed in error messages
>>> > if the data didn't parse as valid XML; and in any case the mere ability
>>> > to check existence of a file might be useful to an attacker.
>>> >
>>> > The ideal solution to this would still allow fetching of references that
>>> > are listed in the host system's XML catalogs, so that documents can be
>>> > validated according to installed DTDs. However, doing that with the
>>> > available libxml2 APIs appears complex and error-prone, so we're not going
>>> > to risk it in a security patch that necessarily hasn't gotten wide review.
>>> > So this patch merely shuts off all access, causing any external fetch to
>>> > silently expand to an empty string. A future patch may improve this.
>>>
>>> With that, the obvious affordance on the xmlvalidate implementation
>>> was to not rely on external schema sources on the host
>>> catalog. Therefore the implementation relies solely on expressions
>>> that necessarily evaluate to a schema in plain text.
>>>
>>> I added the requested documentation and a bunch of tests for each
>>> scenario. I would appreciate another round of reviews whenever someone
>>> has the time and patience.
>>>
>>> At last, to nourish the curiosity: I had issues with make check, as
>>> stated above on the e-mail thread. These got resolved when I changed
>>> `execl` to `execlp` on `pg_regress.c`. I of course did not commit
>>> such, but more people I know have had the very same issue while
>>> relying on immutable package managers.
>
>
> Hi!
> First of all, please do not top post 🙏 . Use down-posting.
>
> About general interest in feature - I suspect that we as a community are generally interested in implementing items from TODO list. This feature also increases SQL standard compatibility. But I am myself not a big SQL/XML user, so I can only give limited review here. I also did not have much time last month. I will try to find my cycles to give another look here.
Thank you very much for getting back to me. Sorry about the bad e-mail etiquette; hopefully it’s corrected now.
About the patch, let me know if you find the time to review!
Thanks once again!