| From: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
|---|---|
| To: | Jelte Fennema-Nio <postgres(at)jeltef(dot)nl> |
| Cc: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Safer hash table initialization macro |
| Date: | 2026-01-27 07:26:48 |
| Message-ID: | 2903820E-A053-4FF8-BEB7-DCECFBB6A21A@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
> On Jan 26, 2026, at 18:50, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl> wrote:
>
> On Mon Jan 26, 2026 at 11:26 AM CET, Bertrand Drouvot wrote:
>> I can still see it. If I apply from 0001 to 0004 and compile, I see it. It looks
>> like it's fixed in 0005:
>
> Ugh, you're right. I had ammended that fix to the wrong commit. Fixed
> now.
> <v9-0001-Add-hash_make-macros.patch><v9-0002-Add-foreach_hash-macro.patch><v9-0003-Use-hash_make-macros-throughout-the-codebase.patch><v9-0004-Use-foreach_hash-macro-throughout-the-codebase.patch><v9-0005-Inline-functions-that-have-now-become-trivial.patch>
Hi Jelte,
A solid patch. Looks like 0004 is conflict to master branch (3fccbd94cba), thus needs a rebase. Anyway, I hard reset the an early commit and applied the patch set locally.
Here are a few comments for 0001 and 0002:
1 - 0001
```
void
+hash_opts_init(HASHCTL *ctl, int *flags,
+ Size keysize, Size entrysize, bool string_key,
+ const HASHOPTS *opts)
+{
+ MemSet(ctl, 0, sizeof(*ctl));
+ ctl->keysize = keysize;
+ ctl->entrysize = entrysize;
+
+ *flags = HASH_ELEM;
+
+ if (opts != NULL && opts->hash != NULL)
+ {
+ /* force_blobs only affects default hash selection, not custom hash */
+ Assert(!opts->force_blobs);
+ ctl->hash = opts->hash;
+ *flags |= HASH_FUNCTION;
+ }
+ else if (opts != NULL && opts->force_blobs)
+ {
+ *flags |= HASH_BLOBS;
+ }
+ else
+ {
+ *flags |= string_key ? HASH_STRINGS : HASH_BLOBS;
+ }
+
+ if (opts != NULL && opts->match != NULL)
+ {
+ ctl->match = opts->match;
+ *flags |= HASH_COMPARE;
+ }
```
This function has a lot of duplicate checks on opts!=NULL, I think it can be simplified as:
```
*flags = HASH_ELEM;
if (opts == NULL)
{
*flags |= string_key ? HASH_STRINGS : HASH_BLOBS;
return;
}
if (opts->hash != NULL)
{
/* force_blobs only affects default hash selection */
Assert(!opts->force_blobs);
ctl->hash = opts->hash;
*flags |= HASH_FUNCTION;
}
else if (opts->force_blobs)
{
*flags |= HASH_BLOBS;
}
….
```
2 - 0002
```
+HASH_SEQ_STATUS
+hash_seq_new(HTAB *hashp)
+{
+ HASH_SEQ_STATUS status;
+
+ hash_seq_init(&status, hashp);
+ return status;
+}
```
Why this function returns a structure by value? Which looks quite uncommon. Usually, when a function is named with “new”, it returns a pointer to a new object.
3 - 0002
foreach_hash feels fragile. It requires to call foreach_hash_term before break, which is easy to forget. And the documentation doesn’t mention how to continue, how to return from a loop, and how to goto from inside a loop.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Chao Li | 2026-01-27 07:28:40 | Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions |
| Previous Message | Andrey Borodin | 2026-01-27 07:26:01 | Re: Fix gistkillitems & add regression test to microvacuum |