From: | jian he <jian(dot)universality(at)gmail(dot)com> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: split func.sgml to separated individual sgml files |
Date: | 2025-06-24 03:34:56 |
Message-ID: | CACJufxH8=BL98wAqcx-xf-fiCzw7-NRfsQT7RdAPrBTb5=-kZw@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Mar 20, 2025 at 10:16 AM David G. Johnston
<david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
> In short, ready to commit (see last paragraph below however), but the committer will need to run the python script at the time of commit on the then-current tree.
>
hi.
more explanation, since the python script seems quite large...
each <sect1 id="functions-XXX"> in doc/src/sgml/func.sgml
corresponds to one individual section in [1].
the id of each <sect1 id="functions-XXX"> within func.sgml is unique.
if you rename one so that there are two <sect1 id="functions-logical">,
the build will error out with something like:
../../Desktop/pg_src/src6/postgres/doc/src/sgml/postgres.sgml:199:
element sect1: validity error : ID functions-logical already defined
see [2] also.
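
To illustrate the uniqueness requirement, here is a minimal sketch of a duplicate-id check. The helper name `duplicate_sect1_ids` and the sample text are hypothetical and not taken from the attached script; the sketch only assumes that each section opens with a literal <sect1 id="functions-XXX"> tag.

```python
import re
from collections import Counter

# Hypothetical helper, not part of the attached script.
# Matches the literal opening tag and captures the id value.
SECT1_ID = re.compile(r'<sect1 id="(functions-[^"]*)"')


def duplicate_sect1_ids(sgml_text):
    """Return the sect1 ids that occur more than once, sorted."""
    counts = Counter(SECT1_ID.findall(sgml_text))
    return sorted(sect_id for sect_id, n in counts.items() if n > 1)


# Made-up sample in which one id has been duplicated by a bad rename.
sample = """\
<sect1 id="functions-logical">
</sect1>
<sect1 id="functions-comparison">
</sect1>
<sect1 id="functions-logical">
</sect1>
"""

print(duplicate_sect1_ids(sample))
```

An empty result means every id is unique, which is the property the splitting approach relies on.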
Based on this, we can use the literal string <sect1 id="functions-XXX"> to
perform pattern matching and identify the line numbers that mark the start and
end of each <sect1> section.
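
A rough sketch of that pattern matching follows. It is not the code from the attached script: the function name `find_sect1_ranges` is hypothetical, and it assumes each opening and closing <sect1> tag sits on its own line and that <sect1> blocks are not nested.

```python
import re

# Assumption: opening and closing tags each start their own line
# (possibly indented).
SECT1_OPEN = re.compile(r'^\s*<sect1 id="(functions-[^"]+)"')
SECT1_CLOSE = re.compile(r'^\s*</sect1>')


def find_sect1_ranges(lines):
    """Return [(id, start, end)] using 1-based, inclusive line numbers."""
    ranges = []
    current_id, start = None, None
    for lineno, line in enumerate(lines, start=1):
        m = SECT1_OPEN.match(line)
        if m:
            # Remember where this section begins.
            current_id, start = m.group(1), lineno
        elif current_id is not None and SECT1_CLOSE.match(line):
            # The closing tag ends the section that is currently open.
            ranges.append((current_id, start, lineno))
            current_id = None
    return ranges


# Made-up miniature of func.sgml, for illustration only.
demo_lines = [
    '<chapter id="functions">',
    ' <sect1 id="functions-logical">',
    '  <para>...</para>',
    ' </sect1>',
    ' <sect1 id="functions-comparison">',
    ' </sect1>',
    '</chapter>',
]

for sect_id, first, last in find_sect1_ranges(demo_lines):
    print(sect_id, first, last)
```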
The polished v2 python script uses the following steps to split func.sgml
into several pieces:
0. For each 9.X section listed in [1], create an empty SGML file to hold the
corresponding content.
1. Use the pattern <sect1 id="functions-XXX"> to locate the starting and ending
line number of each section in func.sgml
2. Copy each content block (<sect1>) from func.sgml
<sect1 id="functions-XXX">
...main content
</sect1>
into the newly created SGML files.
3. Remove the copied content from func.sgml.
4. In func.sgml, insert general entity references [3] to include the newly
created SGML files.
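
Steps 1 to 4 can be sketched in memory as below. This is a simplified illustration under stated assumptions, not the attached script: `split_sections` is a hypothetical name, the entity names of the form `&func-XXX;` are made up (the real names are whatever the filelist.sgml patch declares), and writing the per-section files (step 0 and the file part of step 2) is left out.

```python
import re

# Assumption: tags start their own line and <sect1> blocks are not nested.
SECT1_OPEN = re.compile(r'^\s*<sect1 id="functions-([^"]+)"')
SECT1_CLOSE = re.compile(r'^\s*</sect1>')


def split_sections(lines):
    """Return (remaining_lines, {name: section_lines}).

    Each <sect1> block is moved out of the main file and replaced by
    a single entity reference line (hypothetical naming scheme).
    """
    remaining, sections = [], {}
    current = None
    for line in lines:
        m = SECT1_OPEN.match(line)
        if current is None and m:
            # Step 1: a section starts here.
            current = m.group(1)
            sections[current] = [line]
        elif current is not None:
            # Steps 2 and 3: copy the line out, keep it out of the main file.
            sections[current].append(line)
            if SECT1_CLOSE.match(line):
                # Step 4: leave an entity reference in its place.
                remaining.append(f"&func-{current};")
                current = None
        else:
            remaining.append(line)
    return remaining, sections


# Made-up miniature of func.sgml, for illustration only.
mini_func_sgml = [
    '<chapter id="functions">',
    ' <sect1 id="functions-logical">',
    '  <para>...</para>',
    ' </sect1>',
    ' <sect1 id="functions-comparison">',
    ' </sect1>',
    '</chapter>',
]

main_part, pieces = split_sections(mini_func_sgml)
print("\n".join(main_part))
```

Every line of the input ends up either in the remaining main file or in exactly one section, so no content is lost by the split.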
because Chapter 9. Functions and Operators has the same number of
sections (31) in PG17 and PG18,
v1-0001-split_func_sgml.py would still work just fine,
but I made some minor changes, hence the attached v2.
----------------------------------------------------
I used the sed --in-place option [3] to modify and truncate the original large
func.sgml file directly.
I also used sed's -n option together with the p command to extract lines from
func.sgml between line X and line Y, as shown in reference [4].
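
For readers unfamiliar with these sed idioms, here is a small sketch of how such command lines could be built from Python. The helper names are hypothetical, and the exact invocations in the attached script may differ; in particular, using the d command for the in-place truncation is an assumption. The helpers only build argument lists and do not run sed.

```python
import shlex


def sed_extract_argv(path, first, last):
    """argv for: sed -n 'FIRST,LASTp' PATH  (print only that line range)."""
    return ["sed", "-n", f"{first},{last}p", path]


def sed_delete_argv(path, first, last):
    """argv for: sed --in-place 'FIRST,LASTd' PATH  (GNU sed; edits PATH)."""
    return ["sed", "--in-place", f"{first},{last}d", path]


# Line numbers here are placeholders, not real offsets in func.sgml.
extract_cmd = sed_extract_argv("func.sgml", 2, 4)
delete_cmd = sed_delete_argv("func.sgml", 2, 4)

print(shlex.join(extract_cmd))
print(shlex.join(delete_cmd))

# To actually run one of them (requires sed on PATH):
#   import subprocess
#   out = subprocess.run(extract_cmd, check=True,
#                        capture_output=True, text=True).stdout
```

When several ranges are deleted from the same file, deleting the later ranges first keeps the earlier line numbers valid.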
for the attached files:
first run ``python3 v2-0001-split_func_sgml.py``
then run ``git apply v2-0001-update-filelist.sgml-allfiles.sgml.no-cfbot``
(`git am` won't work; you need to use `git apply`).
[1] https://www.postgresql.org/docs/current/functions.html
[2] https://en.wikipedia.org/wiki/Document_type_definition
[3] https://www.gnu.org/software/sed/manual/html_node/Command_002dLine-Options.html#index-_002di
[4] https://www.gnu.org/software/sed/manual/html_node/Common-Commands.html#index-n-_0028next_002dline_0029
Attachment | Content-Type | Size |
---|---|---|
v2-0001-update-filelist.sgml-allfiles.sgml.no-cfbot | application/octet-stream | 3.4 KB |
v2-0001-split_func_sgml.py | text/x-python | 23.4 KB |