Re: WIP - xmlvalidate implementation from TODO list

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>
Cc: Marcos Magueta <maguetamarcos(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIP - xmlvalidate implementation from TODO list
Date: 2026-03-30 20:28:39
Message-ID: CAFj8pRB3d_fREmgzT1GQwG_wfR1brxyTMVsAV=bbpOzvPkauLg@mail.gmail.com
Lists: pgsql-hackers

Hi

On Sun, 15 Mar 2026 at 13:58, Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>
wrote:

> Hi Marcos
>
> On 15/03/2026 05:25, Marcos Magueta wrote:
> > I was thinking about the idea of managing the catalogs for read and
> > write, and I'm coming around to the idea of predefined roles after all.
> > Relying on conventional namespace-level ACLs for this turns out to be
> > impractical. With the normal ACL, a schema is object agnostic, so
> > there's no clean way to selectively restrict XML schema creation without
> > also affecting other objects in the same namespace. A simple scenario
> > like limiting who can write already gets messy. I did consider RLS on
> > the catalog, but that would be unprecedented for a pg_* table and would
> > break assumptions throughout the system, like pg_dump, dependency
> > tracking, syscache lookups... blah!
> >
> > That said, I'd like to hear from more people on this before committing
> > to an approach, assuming there's still legitimate interest in moving
> > this work forward.
>
>
> I guess we can assume that everything added to the official todo list is
> of interest for the community -- at least I do :).
>
>
> > On the potential CPU burn from validation: I think in practice it's
> > comparable to what you'd get from a complex index, heavy check
> > constraint, or trigger function. However, the nature of the input (and I
> > mean the XML schema definitions as plain text here), likely coming from
> > the application layer, sets a warrant for extra caution I guess.
> > Limiting the depth and size of both the schema and the document being
> > validated would reduce compatibility, but goes a long way in preventing
> > resource exhaustion, so it's a fairly trivial option to implement.
>
>
> I took the liberty to add Pavel to this thread. He has way more
> experience than me in this part of the code, and perhaps he can share
> his opinion on the predefined roles for XML schemas and his impressions
> on the patch as a whole.
>

I checked the DB2 documentation, and if I understand it correctly, an XML
schema there is identified by a "relational identifier" of the form
SQLschema.name.

So treating an XML schema as a catalog object is the right analogy, and
using ACLs looks correct to me.

One difference between the patch and DB2: in DB2, one relational identifier
can identify a group of XML schemas, so the relationship is 1:N, not 1:1.
See the REGISTER XMLSCHEMA command.

Schema registration is different on MSSQL, where it is more like a local
cache and a schema is identified only by its URI. In that case using ACLs
can be messy, and I can imagine having a dedicated role that can register
a new XML schema. But MSSQL doesn't support the SQL/XML XMLVALIDATE
function.

Both concepts are workable, and I have no strong preference for either
(the DB2 concept may be the better fit for Postgres, and it is probably
closer to SQL/XML). But if we use relational identifiers, then this should
be consistent with other uses of relational identifiers: ACLs should be
used.

Regards

Pavel

> Best, Jim
>
