Re: Safer hash table initialization macro

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Safer hash table initialization macro
Date: 2026-01-27 07:26:48
Message-ID: 2903820E-A053-4FF8-BEB7-DCECFBB6A21A@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Jan 26, 2026, at 18:50, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl> wrote:
>
> On Mon Jan 26, 2026 at 11:26 AM CET, Bertrand Drouvot wrote:
>> I can still see it. If I apply from 0001 to 0004 and compile, I see it. It looks
>> like it's fixed in 0005:
>
> Ugh, you're right. I had ammended that fix to the wrong commit. Fixed
> now.
> <v9-0001-Add-hash_make-macros.patch><v9-0002-Add-foreach_hash-macro.patch><v9-0003-Use-hash_make-macros-throughout-the-codebase.patch><v9-0004-Use-foreach_hash-macro-throughout-the-codebase.patch><v9-0005-Inline-functions-that-have-now-become-trivial.patch>

Hi Jelte,

A solid patch. Looks like 0004 is conflict to master branch (3fccbd94cba), thus needs a rebase. Anyway, I hard reset the an early commit and applied the patch set locally.

Here are a few comments for 0001 and 0002:

1 - 0001
```
void
+hash_opts_init(HASHCTL *ctl, int *flags,
+ Size keysize, Size entrysize, bool string_key,
+ const HASHOPTS *opts)
+{
+ MemSet(ctl, 0, sizeof(*ctl));
+ ctl->keysize = keysize;
+ ctl->entrysize = entrysize;
+
+ *flags = HASH_ELEM;
+
+ if (opts != NULL && opts->hash != NULL)
+ {
+ /* force_blobs only affects default hash selection, not custom hash */
+ Assert(!opts->force_blobs);
+ ctl->hash = opts->hash;
+ *flags |= HASH_FUNCTION;
+ }
+ else if (opts != NULL && opts->force_blobs)
+ {
+ *flags |= HASH_BLOBS;
+ }
+ else
+ {
+ *flags |= string_key ? HASH_STRINGS : HASH_BLOBS;
+ }
+
+ if (opts != NULL && opts->match != NULL)
+ {
+ ctl->match = opts->match;
+ *flags |= HASH_COMPARE;
+ }
```

This function has a lot of duplicate checks on opts!=NULL, I think it can be simplified as:
```
*flags = HASH_ELEM;

if (opts == NULL)
{
*flags |= string_key ? HASH_STRINGS : HASH_BLOBS;
return;
}

if (opts->hash != NULL)
{
/* force_blobs only affects default hash selection */
Assert(!opts->force_blobs);
ctl->hash = opts->hash;
*flags |= HASH_FUNCTION;
}
else if (opts->force_blobs)
{
*flags |= HASH_BLOBS;
}
….
```

2 - 0002
```
+HASH_SEQ_STATUS
+hash_seq_new(HTAB *hashp)
+{
+ HASH_SEQ_STATUS status;
+
+ hash_seq_init(&status, hashp);
+ return status;
+}
```

Why this function returns a structure by value? Which looks quite uncommon. Usually, when a function is named with “new”, it returns a pointer to a new object.

3 - 0002

foreach_hash feels fragile. It requires to call foreach_hash_term before break, which is easy to forget. And the documentation doesn’t mention how to continue, how to return from a loop, and how to goto from inside a loop.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Chao Li 2026-01-27 07:28:40 Re: Proposal: Cascade REPLICA IDENTITY changes to leaf partitions
Previous Message Andrey Borodin 2026-01-27 07:26:01 Re: Fix gistkillitems & add regression test to microvacuum