Re: WIP - xmlvalidate implementation from TODO list

From: Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>
To: Marcos Magueta <maguetamarcos(at)gmail(dot)com>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIP - xmlvalidate implementation from TODO list
Date: 2026-01-23 12:19:25
Message-ID: 08052569-9384-41b5-bcb7-33929fcc6c71@uni-muenster.de
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 21/01/2026 21:44, Marcos Magueta wrote:
>> Any particular reason for that? If not, take a look at other options,
> e.g. a_expr
> No particular reason apart from it being simpler since I didn't need to
> invoke an execution at the cmd. Changed it now.
>
>> Why did you choose text over xml for schemadata?
> My original thought was that XML schemas require additional validation
> in contrast to normal XML, but it being additive, we would have
> redundant checks. But in reconsideration, perhaps keeping the field with
> an XML type is more intuitive for anyone introspecting over the catalog.
> Also applied the change on the latest version of the patch.

Data type for schemadata in pg_xmlschema is now xml.

postgres=# \d pg_xmlschema
Table "pg_catalog.pg_xmlschema"
Column | Type | Collation | Nullable | Default
-----------------+-----------+-----------+----------+---------
oid | oid | | not null |
schemaname | name | | not null |
schemanamespace | oid | | not null |
schemaowner | oid | | not null |
schemadata | xml | | not null |
schemaacl | aclitem[] | | |
Indexes:
"pg_xmlschema_oid_index" PRIMARY KEY, btree (oid)
"pg_xmlschema_name_nsp_index" UNIQUE CONSTRAINT, btree (schemaname,
schemanamespace)

I agree it's more intuitive this way. It also facilitates function calls
that require the parameter to be xml, e.g. xmlserialize

postgres=# CREATE XMLSCHEMA x AS
'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"><xs:element
name="duplicate" type="xs:string"/></xs:schema>';
CREATE XMLSCHEMA

postgres=# SELECT xmlserialize(DOCUMENT schemadata AS text INDENT) FROM
pg_xmlschema;
xmlserialize
---------------------------------------------------------
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">+
<xs:element name="duplicate" type="xs:string"/> +
</xs:schema>
(1 row)

> I noticed DefineXmlSchema() calls IsThereXmlSchemaInNamespace() right
> after XmlSchemaCreate() returns a valid OID. Since XmlSchemaCreate()
> already inserted the tuple into the catalog (via CatalogTupleInsert at
> pg_xmlschema.c:166), wouldn't SearchSysCacheExists2() find it and always
> throw "already exists"? We all tested the original code and it worked
> fine, so I'm missing something about syscache visibility or timing; that
> was an early function I did to check for duplicates that ended up in the
> wrong place. I removed the call (and function) as I judged it to be
> redundant (the duplicate check already happens inside
> XmlSchemaCreate()), but is there something subtle about intra-command
> visibility I'm not understanding? If anyone knows, please let me know.

I couldn't find any IsThereXmlSchemaInNamespace call in DefineXmlSchema
in the current version, so I cannot say much here. But I agree that the
a further check is not necessary, since XmlSchemaCreate is already doing it.

> Also, I added tab completion on psql and fixed pg_dump.

Nice. pg_dump now exports CREATE XMLSCHEMA statements.

Tab completion for CREATE, ALTER, and DROP XMLSCHEMA now also works.

A few other comments

== patch version ==

You forgot to include the version to the patch name.

For instance, instead of
0001-Add-CREATE-ALTER-DROP-XMLSCHEMA-DDL-commands.patch the file could
be named v3-0001-Add-CREATE-ALTER-DROP-XMLSCHEMA-DDL-commands.patch

== IS_XMLVALIDATE dependency ==

The patches 0001, 0002, and 0003 depend on IS_XMLVALIDATE, which is only
introduced in 0004, so they cannot be compiled and tested independently.

== permissions ==

In the tests I see you added a few GRANTs to set the visibility of
certain xmlschemas:

GRANT USAGE ON XMLSCHEMA permission_test_schema TO regress_xmlschema_user2

I could not find anything regarding this in the docs. If we are to
support it, shouldn't we add it to grant.sgml?

As I mentioned upthread, I believe that schema registration and usage
should be privilege-controlled, for example via dedicated roles

GRANT pg_read_xmlschemas TO u;
GRANT pg_write_xmlschemas TO u;

What do you think?

But being able to grant or revoke access to a certain xmlschema also has
its appeal :)

Best, Jim

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amul Sul 2026-01-23 12:27:11 Re: pg_waldump: support decoding of WAL inside tarfile
Previous Message Soumya S Murali 2026-01-23 12:17:38 Re: Checkpointer write combining