ubsan fails on 32bit builds

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Subject: ubsan fails on 32bit builds
Date: 2022-11-17 01:42:30
Message-ID: 20221117014230.op5kmgypdv2dtqsf@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I am working on polishing my patch to make CI use sanitizers. Unfortunately
using -fsanitize=alignment,undefined causes tests to fail on 32bit builds.

https://cirrus-ci.com/task/5092504471601152
https://api.cirrus-ci.com/v1/artifact/task/5092504471601152/testrun/build-32/testrun/recovery/022_crash_temp_files/log/022_crash_temp_files_node_crash.log

../src/backend/storage/lmgr/proc.c:1173:2: runtime error: member access within misaligned address 0xf4019e54 for type 'struct PGPROC', which requires 8 byte alignment
0xf4019e54: note: pointer points here
e0 0d 09 f4 54 9e 01 f4 54 9e 01 f4 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^
==65203==Using libbacktrace symbolizer.
#0 0x57076f46 in ProcSleep ../src/backend/storage/lmgr/proc.c:1173
#1 0x57054cf7 in WaitOnLock ../src/backend/storage/lmgr/lock.c:1859
#2 0x57058e4f in LockAcquireExtended ../src/backend/storage/lmgr/lock.c:1101
#3 0x57058f82 in LockAcquire ../src/backend/storage/lmgr/lock.c:752
#4 0x57051bb8 in XactLockTableWait ../src/backend/storage/lmgr/lmgr.c:702
#5 0x569c31b3 in _bt_doinsert ../src/backend/access/nbtree/nbtinsert.c:225
#6 0x569cff09 in btinsert ../src/backend/access/nbtree/nbtree.c:200
#7 0x569ac19d in index_insert ../src/backend/access/index/indexam.c:193
#8 0x56c72af6 in ExecInsertIndexTuples ../src/backend/executor/execIndexing.c:416
#9 0x56d014c7 in ExecInsert ../src/backend/executor/nodeModifyTable.c:1065
...

I can reproduce this locally.

At first I thought the problem was caused by:
46d6e5f5679 Display the time when the process started waiting for the lock, in pg_locks, take 2

as pg_atomic_uint64 is 8 byte aligned on x86 - otherwise one gets into
terrible terrible performance territory because atomics can be split across
cachelines - but 46d6e5f5679 didn't teach ProcGlobalShmemSize() /
InitProcGlobal() that allocations need to be aligned to a larger
size. However, we've made ShmemAllocRaw() use cacheline alignment, which
should suffice. And indeed - ProcGlobal->allProcs is aligned correctly, and
sizeof(PGPROC) % 8 == 0. It doesn't seem great to rely on that, but ...

Printing out *proc in proc.c:1173 seems indicates clearly that it's not a
valid proc for some reason.

(gdb) p myHeldLocks
$26 = 0
(gdb) p lock->waitProcs
$27 = {links = {prev = 0xf33c4b5c, next = 0xf33c4b5c}, size = 0}
(gdb) p &(waitQueue->links)
$29 = (SHM_QUEUE *) 0xf33c4b5c
(gdb) p proc
$30 = (PGPROC *) 0xf33c4b5c

Afaict the problem is that
proc = (PGPROC *) &(waitQueue->links);

is a gross gross hack - this isn't actually a PGPROC, it's pointing to an
SHM_QUEUE, but *not* one embedded in PGPROC. It kinda works because ->links
is at offset 0 in PGPROC, which means that
SHMQueueInsertBefore(&(proc->links), &(MyProc->links));
will turn &proc->links back into waitQueue->links. Which we then can enqueue
again.

I don't see the point of this hack, even leaving ubsan's valid complaints
aside. Why bother having this, sometimes, fake PGPROC pointer when we could
just use a SHM_QUEUE* to determine the insertion point?

Greetings,

Andres Freund

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-11-17 02:33:05 Re: Allow file inclusion in pg_hba and pg_ident files
Previous Message Tomas Vondra 2022-11-17 01:41:14 Re: logical decoding and replication of sequences, take 2