WIP: Analyze whether our docs need more granular refentries.

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: WIP: Analyze whether our docs need more granular refentries.
Date: 2022-10-13 21:06:31
Message-ID: CADkLM=ecedUyx9uFgQA=Bg4-kE3i7KFA6UUEhFmrvxCPrsim0w@mail.gmail.com
Lists: pgsql-hackers

In reviewing another patch, I noticed that the documentation had an xref to
a fairly large page of documentation (create_table.sgml), and I wondered if
that link was chosen because the original author genuinely felt the entire
page was relevant, or merely because a more granular link did not exist at
the time, and this link had been carried forward since then while the
referenced page grew in complexity.

In the interest of narrowing the problem down to a manageable size, I wrote
a script (attached) to find all xrefs and rank them by criteria[1] that I
believe hint at the possibility that the xrefs should be more granular
than they are.
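
For illustration only, the counting step could be sketched in Python roughly
as follows. This is not the attached shell script; the regular expression,
the function name count_xrefs, and the sample text are assumptions made up
for the example.

```python
import re
from collections import Counter

# Assumption: xrefs appear in the SGML as <xref linkend="some-id"/>.
XREF_RE = re.compile(r'<xref\s+linkend="([^"]+)"')


def count_xrefs(sgml_text: str) -> Counter:
    """Count how many times each linkend is the target of an xref."""
    return Counter(XREF_RE.findall(sgml_text))


# Hypothetical sample text, not taken from the real documentation.
SAMPLE = """
<para>See <xref linkend="sql-createtable"/> for details.</para>
<para>Partitioning is described in <xref linkend="sql-createtable"/>.</para>
<para>Use <xref linkend="app-psql"/> to run the command.</para>
"""

if __name__ == "__main__":
    for link_name, link_count in count_xrefs(SAMPLE).most_common():
        print(link_name, link_count)
```

A real run would read every .sgml file under doc/src/sgml and sum the
per-file counters.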

I intend to use the script output below as a guide for manually reviewing
the references and seeing if there are opportunities to guide the reader to
the relevant section of those pages.

In case anyone is curious, here is the top of the script output:

file_name                     link_name                    link_count line_count num_refentries
----------------------------- ---------------------------- ---------- ---------- --------------
ref/psql-ref.sgml             app-psql                             20       5215              1
ecpg.sgml                     ecpg-sql-allocate-descriptor          4      10101             17
ref/create_table.sgml         sql-createtable                      23       2437              1
ref/select.sgml               sql-select                           23       2207              1
ref/create_function.sgml      sql-createfunction                   30        935              1
ref/alter_table.sgml          sql-altertable                       12       1776              1
ref/pg_dump.sgml              app-pgdump                           11       1545              1
ref/pg_basebackup.sgml        app-pgbasebackup                     11       1008              1
ref/create_type.sgml          sql-createtype                       10       1029              1
ref/create_index.sgml         sql-createindex                       9        999              1
ref/postgres-ref.sgml         app-postgres                         10        845              1
ref/copy.sgml                 sql-copy                              7       1081              1
ref/create_role.sgml          sql-createrole                       13        511              1
ref/grant.sgml                sql-grant                            13        507              1
ref/create_foreign_table.sgml sql-createforeigntable               14        455              1
ref/insert.sgml               sql-insert                            8        792              1
ref/pg_ctl-ref.sgml           app-pg-ctl                            8        713              1
ref/create_trigger.sgml       sql-createtrigger                     7        777              1
ref/set.sgml                  sql-set                              15        332              1
ref/create_aggregate.sgml     sql-createaggregate                   6        805              1
ref/initdb.sgml               app-initdb                            8        588              1
ref/create_policy.sgml        sql-createpolicy                      7        655              1
dblink.sgml                   contrib-dblink-connect                1       2136             19
ref/create_subscription.sgml  sql-createsubscription                9        472              1

Some of these will clearly be false positives. For instance, dblink.sgml
and ecpg.sgml have a lot of refentries, but they seem to lack a global
"top" refentry which I assumed would be there.

On the other hand, I have to wonder whether the references to psql are each
aimed at a specific feature of the tool; if so, perhaps we can create
refentries for those features.

[1] The criteria are: the link target must be the first refentry in its
file, and the file must be at least 200 lines long. Results are then ranked
by lines * references, with the score doubled when the top refentry is
referenced and other refentries exist in the same file.
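
As a rough illustration of the criteria in [1], here is a minimal Python
sketch of the scoring. It is not the attached shell script: the names
XrefTarget, score, and rank are made up for the example, and it assumes each
row already describes the first refentry of its file. The sample rows are
copied from the output above.

```python
from typing import List, NamedTuple


class XrefTarget(NamedTuple):
    file_name: str
    link_name: str
    link_count: int      # number of xrefs pointing at link_name
    line_count: int      # length of the file in lines
    num_refentries: int  # refentries defined in the file


MIN_LINES = 200  # files shorter than this are not considered


def score(t: XrefTarget) -> int:
    """lines * references, doubled when the file has other refentries."""
    weight = 2 if t.num_refentries > 1 else 1
    return t.line_count * t.link_count * weight


def rank(targets: List[XrefTarget]) -> List[XrefTarget]:
    """Drop short files, then sort by descending score."""
    eligible = [t for t in targets if t.line_count >= MIN_LINES]
    return sorted(eligible, key=score, reverse=True)


# A few rows taken from the script output, deliberately out of order.
ROWS = [
    XrefTarget("ref/create_table.sgml", "sql-createtable", 23, 2437, 1),
    XrefTarget("dblink.sgml", "contrib-dblink-connect", 1, 2136, 19),
    XrefTarget("ref/psql-ref.sgml", "app-psql", 20, 5215, 1),
    XrefTarget("ref/create_subscription.sgml", "sql-createsubscription", 9, 472, 1),
    XrefTarget("ecpg.sgml", "ecpg-sql-allocate-descriptor", 4, 10101, 17),
]

if __name__ == "__main__":
    for t in rank(ROWS):
        print(t.file_name, t.link_name, score(t))
```

Under this reading, ecpg.sgml ranks second with only 4 references because
its long file and 2x weight outweigh create_table.sgml's 23 references.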

Attachment Content-Type Size
xref-analysis.sh application/x-shellscript 5.2 KB
