Re: [PATCH] Clarify the comments about varlena header encoding

From: Aleksander Alekseev <aleksander(at)timescale(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Subject: Re: [PATCH] Clarify the comments about varlena header encoding
Date: 2022-09-06 10:18:52
Message-ID: CAJ7c6TO+t3tmvTp=U-FiMsUU0YBLZu+CV+WQ8sZjcVjcZSfQrQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi John,

Thanks for the feedback.

> Maybe "00xxxxxx 4-byte length word (aligned)," is more clear about
> what is aligned. Also, adding all those xxx's obscures the point that
> we only need to examine one byte to figure out what to do next.

IMO "00xxxxxx 4-byte length word" is still confusing. One can misread
this as a 00-xx-xx-xx hex value, where the first byte (not two bits)
is 00h.

> The patch has:
>
> + * In the third case the va_tag field (see varattrib_1b_e) is used to discern
> + * the specific type and length of the pointer datum. On disk the "xxx" bits
> + * currently always store sizeof(varatt_external) + 2.
>
> ...so not sure where 17 came from.

Right, AFTER applying the patch it's clear that it's actually 18
bytes. Currently the comment says "1 byte followed by a TOAST pointer
(16 bytes)" which is wrong.

> - * 10000000 1-byte length word, unaligned, TOAST pointer
> + * 10000000 xxxxxxxx, TOAST pointer (struct varatt_external)
>
> This implies that the header is two bytes, which is not accurate. That
> next byte is a type tag:
> [...]
> ...and does not always represent the on-disk length:

Well, the comments don't say what is the header and what is the type
tag. They merely describe the bit layouts. The patch doesn't seem to
make things worse in this respect. Do you think we should address this
too? I suspect that describing the difference between the header and
the type tag here will create even more confusion.

> And I don't think the new comments referring to "third case", "first
> two cases", etc make it easier to follow.

Maybe you are right. I'm open to suggestions.

--
Best regards,
Aleksander Alekseev

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2022-09-06 10:23:12 Re: Schema variables - new implementation for Postgres 15
Previous Message Amit Kapila 2022-09-06 09:55:10 Re: Handle infinite recursion in logical replication setup