| From: | John Naylor <johncnaylorls(at)gmail(dot)com> |
|---|---|
| To: | Aleksander Alekseev <aleksander(at)tigerdata(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: [PATCH] Refactor bytea_sortsupport(), take two |
| Date: | 2025-11-26 07:18:52 |
| Message-ID: | CANWCAZY0ndYj881r_4OW-COj=HvtawXc8hJ5OquL2ma1vYn2mg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Mon, Nov 17, 2025 at 5:57 PM Aleksander Alekseev
<aleksander(at)tigerdata(dot)com> wrote:
> > > > - * More generally, it's okay that bytea callers can have NUL bytes in
> > > > - * strings because abbreviated cmp need not make a distinction between
> > > > - * terminating NUL bytes, and NUL bytes representing actual NULs in the
> > > > - * authoritative representation. Hopefully a comparison at or past one
> >
> > > > Why is this gone -- AFAICT it's still true for bytea, no matter what
> > > > file it's located in?
> > >
> > > Actually for bytea_abbrev_convert() this comment makes no sense IMO.
> > > Presumably the reader is aware that '\x0000...' is a valid bytea, and
> > > there is no such thing as "terminating NUL bytes" for bytea.
> >
> > It makes perfect sense to me. The abbreviated datum has terminating
> > NUL bytes if the source length is shorter than 8 bytes. The comment is
> > explaining why comparing bytea abbreviated datums still works. It
> > seems like everything starting with "abbreviated cmp need not ..." is
> > still appropriate for bytea.c.
>
> OK, I returned the comment to bytea.c, but replaced "string" with
> "bytea". IMO mentioning "strings" is confusing in this context.
>
> I still don't like the fact that the comment is reasoning about
> "terminating NUL bytes" in the context of bytea, but I kept it as is
> for now. It seems confusing to me. Do you think we should rewrite
> this?
+ /*
+ * It's okay that callers can have NUL bytes in bytea because abbreviated
+ * cmp need not make a distinction between terminating NUL bytes, and NUL
+ * bytes representing actual NULs in the authoritative representation.
I mentioned everything starting with "abbreviated cmp need not ..."
was okay, so I agree that "It's okay that callers can have NUL bytes
in bytea" is redundant. Was that the confusing part for you? How about
this:
"Short byteas will have terminating NUL bytes in the abbreviated
datum. Abbreviated comparison need not make a distinction between
thse NUL bytes, and NUL bytes representing actual NULs in the
authoritative representation." [...]
After that, the rest seems to flow better at a quick glance.
> > Likewise, this part of varlena.c is still factually accurate as an aside:
> >
> > - * (Actually, even if there were NUL bytes in the blob it would be
> > - * okay. See remarks on bytea case above.)
> >
> > ...although with the above, we'll have to point to the relevant place
> > in bytea.c.
>
> Not sure if I follow. Previously we agreed that we disallow NUL bytes
> within strings, except for terminating ones. What we are doing here is
> refactoring / simplifying the code. Why keep irrelevant and confusing
> comments?
Okay, I'll concede that's it's no longer useful and maybe a distraction..
--
John Naylor
Amazon Web Services
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Chao Li | 2025-11-26 07:23:39 | Re: pg_waldump: support decoding of WAL inside tarfile |
| Previous Message | shveta malik | 2025-11-26 07:13:26 | Re: How can end users know the cause of LR slot sync delays? |