Re: WIP - xmlvalidate implementation from TODO list

From: Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>
To: Marcos Magueta <maguetamarcos(at)gmail(dot)com>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIP - xmlvalidate implementation from TODO list
Date: 2026-01-14 10:09:59
Message-ID: 2898f090-d9cf-475c-940c-a99da4a308f1@uni-muenster.de
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 14/01/2026 02:23, Marcos Magueta wrote:
> Please follow the updated version attached.

A few comments:

== grammar ==

In gram.y you restricted CREATE XMLSCHEMA to Sconst:

DO $$
DECLARE xsd text := '<foo></foo>';
BEGIN
CREATE XMLSCHEMA person_schema AS xsd;
END $$;
ERROR: syntax error at or near "xsd"
LINE 4: CREATE XMLSCHEMA person_schema AS xsd;

Any particular reason for that? If not, take a look at other options,
e.g. a_expr

== pg_xmlschema ==

Why did you choose text over xml for schemadata?

postgres=# \d pg_xmlschema
Table "pg_catalog.pg_xmlschema"
Column | Type | Collation | Nullable | Default
-----------------+-----------+-----------+----------+---------
oid | oid | | not null |
schemaname | name | | not null |
schemanamespace | oid | | not null |
schemaowner | oid | | not null |
schemadata | text | C | not null |
schemaacl | aclitem[] | | |
Indexes:
"pg_xmlschema_oid_index" PRIMARY KEY, btree (oid)
"pg_xmlschema_name_nsp_index" UNIQUE CONSTRAINT, btree (schemaname,
schemanamespace)

== psql command to display and list xml schemas ==

Not a requirement for this patch (specially not at the current stage),
but you should add it to your TODO list.

\dz
\dz foo
\dz+ foo

* z here is just an example

== tab completion ==

CREATE <TAB> should suggest XMLSCHEMA and CREATE XML<TAB> should
autocomplete CREATE XMLSCHEMA. The same applies for DROP XMLSCHEMA [IF
EXISTS], where it should additionally list the available schemas after
DROP XMLSCHEMA <TAB>.

== white-space warnings ==

The patch does not apply cleanly:

/home/jim/patches/xmlvalidate/0002-xmlschema-catalog-and-xmlvalidate.patch:594:
indent with spaces.
Oid schemanamespace,
/home/jim/patches/xmlvalidate/0002-xmlschema-catalog-and-xmlvalidate.patch:595:
indent with spaces.
Oid schemaowner,
/home/jim/patches/xmlvalidate/0002-xmlschema-catalog-and-xmlvalidate.patch:596:
indent with spaces.
const char *schemadata,
/home/jim/patches/xmlvalidate/0002-xmlschema-catalog-and-xmlvalidate.patch:597:
indent with spaces.
bool if_not_exists,
/home/jim/patches/xmlvalidate/0002-xmlschema-catalog-and-xmlvalidate.patch:598:
indent with spaces.
bool quiet)
warning: squelched 139 whitespace errors
warning: 144 lines add whitespace errors.

== file naming ==

Your patch suggests that it is part of a patch set, from which 0001 is
missing. In case you meant a version 2 of the previous patch, a better
format would be

v2-0001-xmlschema-catalog-and-xmlvalidate.patch

which can be generated with

$ git format-patch -1 -v2

== xml_1.out not updated ==

After every change in xml.sql you must create an equivalent file for a
postgres compiled without --with-libxml, and put the changes in
xml_1.out.[1]

== corrupt pg_dump ==

I understand we agreed to work on XMLVALIDATE only after CREATE
XMLSCHEMA is settled, but since the code is partially already there, you
might wanna take a look at pg_dump. It is not serialising the CREATE
XMLSCHEMA statements:

$ /usr/local/postgres-dev/bin/psql postgres
psql (19devel)
Type "help" for help.

postgres=# CREATE XMLSCHEMA person_schema AS '<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>';
CREATE XMLSCHEMA
postgres=# CREATE VIEW v AS
SELECT XMLVALIDATE(DOCUMENT '<bar></bar>'::xml
ACCORDING TO XMLSCHEMA person_schema);
CREATE VIEW
postgres=# \q

$ /usr/local/postgres-dev/bin/pg_dump postgres
--
-- PostgreSQL database dump
--

\restrict WLaIQWmNJVW2yc4Jv5W81qh2TZGHnupJlpl4Urm4Pp6Ku3VPyH5dO3ReFc4LMmd

-- Dumped from database version 19devel
-- Dumped by pg_dump version 19devel

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET transaction_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;

--
-- Name: test_xmlschema_ns; Type: SCHEMA; Schema: -; Owner: jim
--

CREATE SCHEMA test_xmlschema_ns;

ALTER SCHEMA test_xmlschema_ns OWNER TO jim;

--
-- Name: v; Type: VIEW; Schema: public; Owner: jim
--

CREATE VIEW public.v AS
SELECT XMLVALIDATE(DOCUMENT '<bar></bar>'::xml ACCORDING TO XMLSCHEMA
person_schema) AS "xmlvalidate";

ALTER VIEW public.v OWNER TO jim;

--
-- PostgreSQL database dump complete
--

\unrestrict WLaIQWmNJVW2yc4Jv5W81qh2TZGHnupJlpl4Urm4Pp6Ku3VPyH5dO3ReFc4LMmd

Take a look at pg_dump.c. You might need a new function, e.g.
dumpXmlSchemas(Archive *fout, const SchemaInfo *schemaInfo)

== patch structure ==

To make the review a bit easier, I suggest to split this patch into a
patch set with **at least 4** smaller patches - the more seasoned
hackers here might correct me if I am wrong. For instance:

0001 - CREATE XMLSCHEMA (code + tests + documentation)
0002 - pg_dump changes to output CREATE XMLSCHEMA
0003 - psql tab completion + new command to display and list xml schemas
0004 - XMLVALIDATE (code + tests + documentation)

Thanks for working on this!

Best, Jim

[1] https://cirrus-ci.com/task/4872456290172928?logs=check_world#L126

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Oleg Tselebrovskiy 2026-01-14 10:11:11 Re: 001_password.pl fails with --without-readline
Previous Message zengman 2026-01-14 10:09:09 Re:Update some comments for fasthash