| From: | Andres Freund <andres(at)anarazel(dot)de> |
|---|---|
| To: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | Likely undefined behavior with some flexible arrays |
| Date: | 2026-01-21 21:49:39 |
| Message-ID: | axxbw7ku4w24hwe6uvkm4vdufo4zr4k5344m6naahkpsoafag5@llovwgptzbgq |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
I got a - I thought - spurious warning in a development patch. A simplified
reproducer of the warning is [1], which triggers:
<source>: In function 'trigger_warning':
<source>:19:9: warning: array subscript 'struct foo[0]' is partly outside array bounds of 'unsigned char[13]' [-Warray-bounds=]
19 | foop->len = len;
| ^~
<source>:18:12: note: object of size 13 allocated by 'allocme'
18 | foop = allocme(offsetof(struct foo, data) + sizeof(char) * len);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Compiler returned: 0
This is a pattern that we have in quite a few places.
Note that the warning vanishes if the "flag" element is not present. The
reason for that difference is that when the flag element is present, sizeof(
struct foo) is 16 on a 64bit platform, but we allocate only 13 bytes, as
offsetof(struct foo, data) is 12.
I asked the gcc folks whether they think the warning is spurious or whether
the code is broken. They think it's the code. Their argument is that it always
has to be legal to assign a struct foo * to a struct foo (i.e. do struct foo a
= *foop), and that that may access the trailing padding bytes.
Unfortunately, after staring at the C99 draft, I think the gcc folks are
right, and this undefined behavior. Although it's really hard to tell with the
C99 formulation.
I also looked at C23, and the examples (e.g. § 6.7.3.2, paragraph 28) do lend
some additional credence to the position of the gcc folks.
Which isn't great, given how widely we use offsetof(structtype, flex_member)
as the base allocation size.
The historic reason we do that is support for systems that didn't support real
flexible arrays, where we used one element arrays instead. In that case
sizeof(structtype) would obviously return too big a value.
I don't know how likely doing the wrong thing here is to lead to wrong code
generation, but relying on that not to happen doesn't seem great.
I've attached a list of structs that have a flexible array and padding,
generated with
pahole -a --padding_ge=1 --with_flexible_array src/backend/postgres contrib/*/*.so
which of course is not exhaustive. And will differ between platforms.
One example of where we do this is gtrgm_alloc().
It's not entirely obvious to me what to do here. I think the minimum ought to
to not introduce more uses of offsetof() for flexible-array allocation sizing
in the future. But just leaving the existing uses as-is seems a bit
scary. Fixign them all seems like a lot of fiddly work with some compatibility
hazards :(
Greetings,
Andres Freund
PS: I'm fairly sure that we also are entering UB territory when allocating
space for just the smallest member of a union, as we do a *lot* in numeric.c,
due to all the allocations using NUMERIC_HDRSZ_SHORT. But there's a lot more
standardese to parse to figure out the requirements.
[1]
See also https://godbolt.org/z/98jchG4Ex
/* compiled with gcc -O2 -Warray-bounds */
#include <stddef.h>
struct foo
{
size_t len;
/* warning is only emitted if the "flag" struct member is present */
int flag;
char data[];
};
extern void *allocme(size_t sz) __attribute__((alloc_size(1)));
struct foo* trigger_warning()
{
size_t len = 1;
struct foo *foop;
foop = allocme(offsetof(struct foo, data) + sizeof(char) * len);
foop->len = len;
return foop;
}
| Attachment | Content-Type | Size |
|---|---|---|
| structs-with-padding.txt | text/plain | 15.9 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tom Lane | 2026-01-21 22:07:04 | Re: Likely undefined behavior with some flexible arrays |
| Previous Message | Robert Haas | 2026-01-21 21:16:26 | Re: pgsql: tests: Add a test C++ extension module |