Improve docs syntax checking and enable it in the meson build

From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Improve docs syntax checking and enable it in the meson build
Date: 2025-10-07 13:12:56
Message-ID: CAN55FZ1qzoDcaKqsR3DwE=X6FL+wpm+=KLvH6ahrRXNhjU53DQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

The Meson build did not include tab and non-breaking space checks for
the docs. The attached patch adds these checks and includes a few
related improvements.

This topic was previously discussed towards the end of another thread
[1], but it was decided that a separate thread would be better, so I
am continuing the discussion here.

These checks were previously done in the Makefile:

```
# tabs are harmless, but it is best to avoid them in SGML files
# (the quoted grep pattern below is a single literal tab character)
check-tabs:
	@( ! grep '	' $(wildcard $(srcdir)/*.sgml $(srcdir)/func/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl) ) || \
	(echo "Tabs appear in SGML/XML files" 1>&2; exit 1)

# Non-breaking spaces are harmless, but it is best to avoid them in SGML files.
# Use perl command because non-GNU grep or sed could not have hex escape sequence.
check-nbsp:
	@ ( $(PERL) -ne '/\xC2\xA0/ and print("$$ARGV:$$_"),$$n++; END {exit($$n>0)}' \
	  $(wildcard $(srcdir)/*.sgml $(srcdir)/func/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl $(srcdir)/images/*.xsl) ) || \
	(echo "Non-breaking spaces appear in SGML/XML files" 1>&2; exit 1)
```

I moved these checks to a new Perl script called sgml_syntax_check.pl.
This script can also perform xmllint validation (when possible).
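
To illustrate what the tab and nbsp checks amount to, here is a minimal
sketch in Python. It is not the code from the patch (the actual script
is Perl), and the names find_bad_lines and check_files are hypothetical.
It reads files as bytes and looks for the same byte sequence the
Makefile's Perl one-liner matched (\xC2\xA0, the UTF-8 encoding of a
non-breaking space).

```python
TAB = b"\t"
NBSP = b"\xc2\xa0"  # UTF-8 bytes of U+00A0, as in the Makefile's Perl regex


def find_bad_lines(path):
    """Return (line_number, kind) pairs for lines containing a tab or nbsp."""
    problems = []
    # Read as bytes so the check does not depend on the locale's encoding.
    with open(path, "rb") as fh:
        for lineno, line in enumerate(fh, start=1):
            if TAB in line:
                problems.append((lineno, "tab"))
            if NBSP in line:
                problems.append((lineno, "nbsp"))
    return problems


def check_files(paths):
    """Report every offending line; return True only if all files are clean."""
    ok = True
    for path in paths:
        for lineno, kind in find_bad_lines(path):
            print(f"{path}:{lineno}: {kind} found")
            ok = False
    return ok
```

A caller would pass the list of .sgml and .xsl files and exit non-zero
when check_files returns False, which mirrors the "exit 1" in the
Makefile rules above.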

Here is a summary of the changes:

1 - A new sgml_syntax_check.pl script was added to handle tab, nbsp,
and xmllint validation checks.
1.1 - It is registered as the sgml_syntax_check test in the Meson build.
1.2 - These checks are run when executing 'make check' or 'meson test
sgml_syntax_check' commands.
1.3 - During the creation of postgres-full.xml, the script performs
tab and nbsp checks. The xmllint check is skipped there, since
validation is already handled by the --valid option. So, we do not run
the same check twice.

2 - The sgml_syntax_check test runs by default in the Meson build.
2.1 - Tab and nbsp checks always run.
2.2 - The xmllint validation and the test are skipped if DocBook
cannot be found. I was not able to achieve the same behavior in the
autoconf build, so the test is not run by default there. The Make
build continues to work as before: you can run the checks manually via
'make check' in doc/src/sgml.
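
The ordering described in point 2 could be sketched roughly as follows.
This is an illustration in Python, not the patch's Perl code. The
function names, the default file name, and the use of exit code 77 are
assumptions: Meson reports a test that exits with 77 as skipped, but
this mail does not say which mechanism the patch uses.

```python
import shutil
import subprocess

SKIP_EXIT_CODE = 77  # assumption: Meson treats exit status 77 as "skipped"


def run_checks(whitespace_ok, docbook_available, validate):
    """Return an exit status: 1 on failure, 77 if validation is skipped, else 0."""
    # The tab/nbsp result is always honored, with or without DocBook.
    if not whitespace_ok:
        return 1
    # Without DocBook there is nothing to validate against, so skip.
    if not docbook_available:
        return SKIP_EXIT_CODE
    return 0 if validate() else 1


def xmllint_validate(source="postgres.sgml"):
    """Validate with xmllint; the default file name is an assumption."""
    xmllint = shutil.which("xmllint")
    if xmllint is None:
        return False
    result = subprocess.run([xmllint, "--noout", "--valid", source])
    return result.returncode == 0
```

Passing the validator in as a callable keeps the decision logic
separate from the xmllint invocation, so each can be exercised on its
own.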

[1] https://www.postgresql.org/message-id/flat/CACJufxFgAh1--EMwOjMuANe%3DVTmjkNaZjH%2BAzSe04-8ZCGiESA%40mail.gmail.com

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachment Content-Type Size
v5-0001-Improve-docs-syntax-checking.patch text/x-patch 7.7 KB
