From: | Ranier Vilela <ranier(dot)vf(at)gmail(dot)com> |
---|---|
To: | Sami Imseih <samimseih(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PgStat_HashKey padding issue when passed by reference |
Date: | 2025-09-04 17:07:15 |
Message-ID: | CAEudQAqGrwdS6TZuPidxu09-sGVBbfocTOSdjp4sfp-=-q79_Q@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Em qui., 4 de set. de 2025 às 13:15, Sami Imseih <samimseih(at)gmail(dot)com>
escreveu:
> Hi,
>
> I was experimenting with adding a new OID field to PgStat_HashKey:
>
> ```
> typedef struct PgStat_HashKey
> {
> PgStat_Kind kind; /* statistics entry kind */
> Oid dboid; /* database ID. InvalidOid for shared objects. */
> Oid newfield;
> uint64 objid; /* object ID (table, function, etc.), or
> * identifier. */
> } PgStat_HashKey;
> ```
>
> It's important to observe that hole padding is added with this new
> field:
>
> ```
> (gdb) ptype /o PgStat_HashKey
> type = struct PgStat_HashKey {
> /* 0 | 4 */ uint32 kind;
> /* 4 | 4 */ Oid dboid;
> /* 8 | 4 */ Oid newfield;
> /* XXX 4-byte hole */
> /* 16 | 8 */ uint64 objid;
>
> /* total size (bytes): 24 */
> }
> ```
>
> With "newfield" added, I started seeing the error "entry ref vanished
> before deletion" when creating the cluster (or selecting from
> pg_stat_database, etc.).
>
> This error occurs when the local reference to a pgstat entry is deleted
> and the entry can't be found in the simplehash pgStatEntryRefHash,
> based on the key.
>
> ```
> if (!pgstat_entry_ref_hash_delete(pgStatEntryRefHash, key))
> elog(ERROR, "entry ref vanished before deletion");
> ```
>
> This is surprising because this entry with the key is created earlier
> via pgstat_get_entry_ref_cached -> pgstat_entry_ref_hash_insert.
>
> When I brought this up to Bertrand (CC'd), he was not able to reproduce the
> ERROR by adding a new field, and we quickly realized that it is related
> to the compiler optimization (-O) flags we were using. He was building
> with -O0 (no optimization) and I was building with -O2. So something
> with optimization was causing the issue.
>
> Starting a problematic cluster with Valgrind:
>
> ```
> valgrind --leak-check=full --track-origins=yes \
> $PGHOME/bin/postgres -D $PGDATA --single -P
> ```
>
> shows messages like "Use of uninitialised value of size 8":
>
> ```
> ==26625== Use of uninitialised value of size 8
> ==26625== at 0x61D04F: pgstat_get_entry_ref_cached (pgstat_shmem.c:405)
> ==26625== by 0x61D04F: pgstat_get_entry_ref (pgstat_shmem.c:488)
> ==26625== by 0x6177B9: pgstat_fetch_entry (pgstat.c:977)
> ==26625== by 0x5433B2: rebuild_database_list (autovacuum.c:1002)
> ```
>
> Further debugging led us to the fact that the key is passed-by-value to
>
> ```
> pgstat_get_entry_ref_cached(PgStat_HashKey key, PgStat_EntryRef
> **entry_ref_p)
> ```
>
> and in that routine the entry is created:
>
> ```
> cache_entry = pgstat_entry_ref_hash_insert(pgStatEntryRefHash, key,
> &found);
> ``
>
> The hash for the key is created with fasthash32(key, size, 0) and
> compared against another key using memcmp(a, b, sizeof(PgStat_HashKey))
> in the "pgstat_hash_hash_key" and "pgstat_cmp_hash_key" routines,
> respectively.
>
> With -O2, we observed that (on our testing environment):
>
> pgstat_get_entry_ref_cached() is inlined
> the inlined code does not copy the hole padding content when the key
> is passed-by-value
>
> So that while the hole initially contains zeroes:
>
> ```
> /* clear padding */
> memset(&key, 0, sizeof(struct PgStat_HashKey));
> ```
>
> We observed that the zeroes in the hole are not copied to
> pgstat_get_entry_ref_cached()
> when passed-by-value with -O2.
>
> GDB when setting a breakpoint at pgstat_entry_ref_hash_insert (which
> is called inside
> pgstat_get_entry_ref_cached) shows that the padding has some unexpected
> values, see the 12-15th byte "0x00 0x23 0x21 0x21" which based on the
> memset done earlier should be 0.
>
> ```
> (gdb) x/24xb d
> 0x7fff68cc16f0: 0x02 0x00 0x00 0x00 0x05 0x00 0x00 0x00
> 0x7fff68cc16f8: 0x00 0x00 0x00 0x00 0x00 0x23 0x21 0x21
> 0x7fff68cc1700: 0x67 0x0a 0x00 0x00 0x00 0x00 0x00 0x00
> (gdb)
> ```
>
> Now, if we pass the key by reference to pgstat_entry_ref_hash_insert as is
> patched, we can see the padding is in fact cleared.
>
> ```
> (gdb) x/24xb d
> 0x7ffe1bf6c640: 0x01 0x00 0x00 0x00 0x05 0x00 0x00 0x00
> 0x7ffe1bf6c648: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
> 0x7ffe1bf6c650: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
> ```
>
> This looks like a compiler bug, we tested multiple ways to workaround:
>
> 1/ pg_noinline pgstat_get_entry_ref_cached
> 2/ making volatile the PgStat_HashKey key in the
> pgstat_get_entry_ref_cached parameter
> 3/ Passing the key by reference instead of by value
> 4/ set SH_SCOPE to "external" rather than "static inline" in pgstat_shmem.c
>
> it seems, at least for this case, the best option is to pass the key
> by reference, like
> in the attached patch.
>
+1
Not to mention that it is faster to pass the parameter by reference.
best regards,
Ranier Vilela
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Burd | 2025-09-04 17:24:00 | Re: [PATCH] Let's get rid of the freelist and the buffer_strategy_lock |
Previous Message | Nikolay Shaplov | 2025-09-04 17:07:00 | Re: [PATCH] ternary reloption type |