
Safer hash table initialization macro

From:	Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Safer hash table initialization macro
Date:	2025-12-01 13:45:00
Message-ID:	aS2b3LoUypW1/Gdz@ip-10-97-1-34.eu-west-3.compute.internal
Lists:	pgsql-hackers

Hi hackers,

Currently to create a hash table we do things like:

A) create a struct, say:

typedef struct SeenRelsEntry
{
    Oid         rel_id;
    int         list_index;
} SeenRelsEntry;

where the first member is the hash key, and then later:

ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(SeenRelsEntry);
ctl.hcxt = CurrentMemoryContext;

seen_rels = hash_create("find_all_inheritors temporary table",
                        32, /* start small and extend */
                        &ctl,
                        HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);

I can see 2 possible issues:

1) We manually specify the type for keysize, which could be incorrect from the
start, or become incorrect if the key member's type changes.

2) It may be possible to remove the key member without the compiler noticing it.

Take this example and remove:

diff --git a/src/backend/catalog/pg_inherits.c b/src/backend/catalog/pg_inherits.c
index 929bb53b620..eb11976afef 100644
--- a/src/backend/catalog/pg_inherits.c
+++ b/src/backend/catalog/pg_inherits.c
@@ -36,7 +36,6 @@
  */
 typedef struct SeenRelsEntry
 {
-    Oid         rel_id;         /* relation oid */
     int         list_index;     /* its position in output list(s) */
 } SeenRelsEntry;

That would compile without any issues because this rel_id member is not
referenced in the code (for this particular example). That's rare but possible.

But then, on my machine, during make check:

TRAP: failed Assert("!found"), File: "nodeModifyTable.c", Line: 5157, PID: 140430

The reason is that the struct member is only ever accessed through byte-level
operations (within the hash-related macros), never by name. So it's easy to
think that this member is unused, because it is not referenced anywhere in the
code.
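To make the byte-level point concrete, here is a minimal standalone sketch. It is not dynahash code: sketch_lookup, SketchEntry and SketchEntryNoKey are hypothetical names, uint32_t stands in for Oid, and the linear scan only mimics how a generic hash table looks at the first keysize bytes of an entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical entry layout: the key is the first member. */
typedef struct SketchEntry
{
    uint32_t    rel_id;         /* hash key, stands in for Oid */
    int         list_index;
} SketchEntry;

/* Same struct after the key member was removed; it still compiles. */
typedef struct SketchEntryNoKey
{
    int         list_index;
} SketchEntryNoKey;

/*
 * Toy linear lookup that treats every entry as raw bytes: it compares the
 * first keysize bytes of each entry with the key and never names a member.
 */
static void *
sketch_lookup(void *entries, size_t nentries, size_t entrysize,
              size_t keysize, const void *key)
{
    char       *p = entries;

    assert(keysize <= entrysize);
    for (size_t i = 0; i < nentries; i++, p += entrysize)
    {
        if (memcmp(p, key, keysize) == 0)
            return p;
    }
    return NULL;
}

/* Returns list_index for rel_id, or -1. Note: rel_id is never referenced. */
static int
sketch_find_index(SketchEntry *entries, size_t nentries, uint32_t rel_id)
{
    SketchEntry *e = sketch_lookup(entries, nentries, sizeof(SketchEntry),
                                   sizeof(uint32_t), &rel_id);

    return e ? e->list_index : -1;
}

/*
 * Same call against the struct without a key. The first keysize bytes are
 * now list_index, so the "key" comparison silently uses the wrong data
 * (this assumes int is 4 bytes, as on common platforms).
 */
static int
sketch_find_index_nokey(SketchEntryNoKey *entries, size_t nentries,
                        uint32_t rel_id)
{
    SketchEntryNoKey *e = sketch_lookup(entries, nentries,
                                        sizeof(SketchEntryNoKey),
                                        sizeof(uint32_t), &rel_id);

    return e ? e->list_index : -1;
}
```

With the key member present, looking up a relation id returns its list_index. With the member removed, the same call still compiles, but a lookup for "relation 1" matches whichever entry happens to have list_index 1, which is the kind of silent misbehavior the failed assertion above points at.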

I'm thinking about what kind of safety we could put in place to better deal with
1) and 2).

What about adding a macro that:

- requests the key member name
- ensures that it is at offset 0
- computes the key size based on the member

Something like:

"
#define HASH_ELEM_INIT(ctl, entrytype, keymember) \
    do { \
        StaticAssertStmt(offsetof(entrytype, keymember) == 0, \
                         #keymember " must be first member in " #entrytype); \
        (ctl).keysize = sizeof(((entrytype *)0)->keymember); \
        (ctl).entrysize = sizeof(entrytype); \
    } while (0)
"

That way:

- The key member is explicitly referenced in the code (preventing "unused"
false positives)
- The key size is automatically computed from the actual member type (preventing
type mismatches)
- We enforce that the key is at offset 0
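For anyone who wants to try the idea outside the tree, the following is a rough standalone sketch of the same macro shape. The names SketchOid, SketchHashCtl, SKETCH_HASH_ELEM_INIT and sketch_seen_rels_ctl are made up for illustration, and C11 static_assert is used in place of StaticAssertStmt, so this is an approximation and not the proposed patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL's Oid and HASHCTL; both are assumptions here. */
typedef uint32_t SketchOid;

typedef struct SketchHashCtl
{
    size_t      keysize;
    size_t      entrysize;
} SketchHashCtl;

typedef struct SeenRelsEntry
{
    SketchOid   rel_id;         /* hash key, must stay first */
    int         list_index;
} SeenRelsEntry;

/*
 * Same shape as the macro proposed above: check at compile time that the
 * key member sits at offset 0, then derive both sizes from the types.
 */
#define SKETCH_HASH_ELEM_INIT(ctl, entrytype, keymember) \
    do { \
        static_assert(offsetof(entrytype, keymember) == 0, \
                      #keymember " must be first member in " #entrytype); \
        (ctl).keysize = sizeof(((entrytype *) 0)->keymember); \
        (ctl).entrysize = sizeof(entrytype); \
    } while (0)

/* Builds the control struct for SeenRelsEntry, naming the key explicitly. */
static SketchHashCtl
sketch_seen_rels_ctl(void)
{
    SketchHashCtl ctl = {0, 0};

    SKETCH_HASH_ELEM_INIT(ctl, SeenRelsEntry, rel_id);
    return ctl;
}
```

In this sketch, deleting rel_id from the struct turns into a compile error at the macro call (no such member), and moving list_index in front of it trips the static assertion, so both mistakes are caught before anything runs.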

An additional benefit: it avoids repeating the "keysize =" followed by "entrysize ="
in a lot of places in the code (currently about 100 times).

If that sounds like a good idea, I could work on a patch doing so.

Thoughts?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Responses

Re: Safer hash table initialization macro at 2025-12-01 14:44:41 from Jelte Fennema-Nio
