Re: [PATCH] Clarify the comments about varlena header encoding

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Clarify the comments about varlena header encoding
Date: 2022-09-12 04:47:07
Message-ID: CAFBsxsGb6NwkYFLdGLRfMAf7CwROSx=yj23j0-9WFG_wsR_RnA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Sep 11, 2022 at 5:06 PM Aleksander Alekseev
<aleksander(at)timescale(dot)com> wrote:
>
> Hi John,
>
> Many thanks for the feedback!
>
> > Or put the now-standard 0b in front.
>
> Good idea.

Now that I look at the results, though, it's distracting and not good
for readability. I'm not actually sure we need to do anything here,
but I am somewhat in favor of putting [un]aligned in parentheses, as
already discussed. Even there, in the first email you said:

> Also one can read this as "aligned, uncompressed
> data" (which again is wrong).

I'm not sure it rises to the level of "wrong", because a blob of bytes
immediately after an aligned uint32 is in fact aligned. The important
thing is: a zero byte is always either a padding byte or part of a
4-byte header, so it's the alignment of the header we really care
about.

> > I think the problem is ambiguity about what a "toast pointer" is. This comment:
> >
> > * struct varatt_external is a traditional "TOAST pointer", that is, the
>
> Right. The comment for varatt_external says that it IS a TOAST
> pointer.

Well, the word "traditional" is not very informative, but it is there.
And afterwards there is also varatt_indirect, varatt_expanded, and
varattrib_1b_e, which all mention "TOAST pointer".

> However the comments for varlena headers bit layout
> implicitly include it into a TOAST pointer, which contradicts the
> previous comments. I suggest we fix this ambiguity by explicitly
> enumerating the type tag in the comments for varlena headers.

- * 10000000 1-byte length word, unaligned, TOAST pointer
+ * 0b10000000 1-byte length word (unaligned), type tag, TOAST pointer

This is distracting from the point of this whole comment, which, I
will say again is: How to look at the first byte to determine what
kind of varlena we're looking at. There is no reason to mention the
type tag here, at all.

- * In TOAST pointers the va_tag field (see varattrib_1b_e) is used to discern
- * the specific type and length of the pointer datum.
+ * For the TOAST pointers the type tag (see varattrib_1b_e.va_tag field) is
+ * used to discern the specific type and length of the pointer datum.

I don't think this clarifies anything, it's just a rephrasing.

More broadly, I think the description of varlenas in this header is at
a kind of "local maximum" -- minor adjustments are more likely to make
it worse. To significantly improve clarity might require a larger
rewriting, but I'm not personally interested in taking part in that.

--
John Naylor
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2022-09-12 05:03:47 Re: [PATCH] initdb: do not exit after warn_on_mount_point
Previous Message houzj.fnst@fujitsu.com 2022-09-12 04:26:48 RE: why can't a table be part of the same publication as its schema