Re: Converting README documentation to Markdown

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Peter Eisentraut <peter(at)eisentraut(dot)org>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Converting README documentation to Markdown
Date: 2024-05-15 12:26:45
Message-ID: C49D656B-B980-48BA-9B80-34466748E7DE@yesql.se
Lists: pgsql-hackers

> On 13 May 2024, at 09:20, Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:

> I started looking through this and immediately found a bunch of tiny problems. (This is probably in part because the READMEs under src/backend/access/ are some of the more complicated ones, but then they are also the ones that might benefit most from better rendering.)

Thanks for looking!

> One general problem is that original Markdown and GitHub-flavored Markdown (GFM) are incompatible in some interesting aspects.

That's true, but virtually every implementation of Markdown in practical use
today is incompatible with Original Markdown.

Reading my email I realize I failed to mention the markdown platforms I was
targeting (and thus flavours), and citing Gruber made it even more confusing.
For online reading I verified with GitHub and VS Code since they have a huge
market presence. For offline work I targeted rendering with pandoc since we
already have a dependency on it in the tree. I don't think targeting the
original Markdown implementation is useful, or even realistic.

Another aspect of platform/flavour was to make the markdown version easy to
maintain for hackers writing content. Requiring the minimum amount of markup
seems like the developer-friendly way here to keep productivity as well as
document quality high.

Most importantly though, I targeted reading the files as plain text without any
rendering. We keep these files in text format close to the code for a reason,
and maintaining readability as text was a north star.

> For example, the line
>
> A split initially marks the left page with the F_FOLLOW_RIGHT flag.
>
> is rendered by GFM as you'd expect. But original Markdown converts it to
>
> A split initially marks the left page with the F<em>FOLLOW</em>RIGHT
> flag.
>
> This kind of problem is pervasive, as you'd expect.

Correct, but I can't imagine that we'd like to wrap every instance of a name
with underscores in backticks like `F_FOLLOW_RIGHT`. There are very few
Markdown implementations which don't support underscores like this (testing
just now on the top online editors and sites providing markdown editing I
failed to find a single one).
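
To illustrate the difference, here is a minimal sketch in Python. It assumes
two heavily simplified emphasis rules written as regular expressions; neither
is the actual parser of any Markdown implementation, and the function names
are made up for this example.

```python
import re

LINE = "A split initially marks the left page with the F_FOLLOW_RIGHT flag."


def emphasis_original(text):
    # Simplified stand-in for original Markdown: any pair of underscores
    # opens and closes emphasis, even in the middle of a word.
    return re.sub(r"_(.+?)_", r"<em>\1</em>", text)


def emphasis_intraword_safe(text):
    # Simplified stand-in for GFM-like flavours: an underscore that has a
    # word character on its outer side does not open or close emphasis.
    return re.sub(r"(?<!\w)_(?!\s)(.+?)(?<!\s)_(?!\w)", r"<em>\1</em>", text)


print(emphasis_original(LINE))
print(emphasis_intraword_safe(LINE))
```

The first call mangles the flag name into F<em>FOLLOW</em>RIGHT, while the
second leaves the line untouched and still converts a standalone _word_.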

> Also, the READMEs often do not indent lists in a non-ambiguous way. For example, if you look into src/backend/optimizer/README, section "Join Tree Construction", there are two list items, but it's not immediately clear which paragraphs belong to the list and which ones follow the list. This also interacts with the previous point. The resulting formatting in GFM is quite misleading.

I agree that the rendered version exacerbates this problem. Writing a bullet
point list where each item spans multiple paragraphs indented the same way as
the paragraphs following the list is not helpful to the reader. In these cases
both the markdown and the text version will be improved by indentation.

> There are also various places where whitespace is used for ad-hoc formatting. Consider for example in src/backend/access/gin/README
>
> the "category" of the null entry. These are the possible categories:
>
> 1 = ordinary null key value extracted from an indexable item
> 2 = placeholder for zero-key indexable item
> 3 = placeholder for null indexable item
>
> Placeholder null entries are inserted into the index because otherwise
>
> But this does not preserve the list-like formatting, it just flows it together.

That's the kind of sublist which needs to be found as part of this work, with
its items then prefixed with a list marker. In this case, prefixing each row in
the sublist with '-' yields the correct result.
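
As a rough sketch of that prefixing step, assuming the ad-hoc rows always have
the form "<number> = <text>" (the helper name and the pattern are made up for
this example and are not part of the patch):

```python
import re

# Rows such as "1 = ordinary null key value ..." are treated as ad-hoc items.
ROW = re.compile(r"^(\s*)(\d+ = .*)$")


def prefix_sublist(lines):
    # Put "- " in front of each matching row, keeping its indentation;
    # every other line is returned unchanged.
    return [ROW.sub(r"\1- \2", line) for line in lines]


sample = [
    'the "category" of the null entry. These are the possible categories:',
    "",
    "1 = ordinary null key value extracted from an indexable item",
    "2 = placeholder for zero-key indexable item",
    "3 = placeholder for null indexable item",
]

for line in prefix_sublist(sample):
    print(line)
```

The text version stays readable, and the three rows become list items rather
than being flowed into one paragraph.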

> src/test/README.md wasn't touched by your patch, but it also needs adjustments for list formatting.

I didn't re-indent that one in order to keep the changes to the absolute
minimum, since I considered the rendered version passable even if not
particularly good. Re-indenting files like this will for sure make the end
result better, as long as the changes preserve the readability of the text
version.

> In summary, I think before we could accept this, we'd need to go through this with a fine-toothed comb line by line and page by page to make sure the formatting is still sound.

Absolutely. I've been over every file to ensure they aren't blatantly wrong,
but I didn't want to spend the time if this was immediately shot down as
something the community doesn't want to maintain.

> And we'd need to figure out which Markdown flavor to target.

Absolutely, and as I mentioned above, we need to pick based on both the final
result (text and rendered) and the developer experience of maintaining
this.

--
Daniel Gustafsson
