Re: [PoC] Improve dead tuple storage for lazy vacuum

From: Andres Freund <andres(at)anarazel(dot)de>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: John Naylor <johncnaylorls(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2024-04-07 19:07:31
Message-ID: 20240407190731.izm3mdazednrsiqk@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2024-04-01 11:53:28 +0900, Masahiko Sawada wrote:
> On Fri, Mar 29, 2024 at 4:21 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
> > I've marked it Ready for Committer.
>
> Thank you! I've attached the patch that I'm going to push tomorrow.

Locally I ran a 32bit build with ubsan enabled (by accident actually), which
complains:

performing post-bootstrap initialization ...
----------------------------------- stderr -----------------------------------
../../../../../home/andres/src/postgresql/src/backend/access/common/tidstore.c:341:24: runtime error: member access within misaligned address 0xffb6258e for type 'struct BlocktableEntry', which requires 4 byte alignment
0xffb6258e: note: pointer points here
00 00 02 00 01 40 dc e9 83 0b 80 48 70 ee 00 00 00 00 00 00 00 01 17 00 00 00 f8 d4 a6 ee e8 25
^
#0 0x814097e in TidStoreSetBlockOffsets ../../../../../home/andres/src/postgresql/src/backend/access/common/tidstore.c:341
#1 0x826560a in dead_items_add ../../../../../home/andres/src/postgresql/src/backend/access/heap/vacuumlazy.c:2889
#2 0x825f8da in lazy_scan_prune ../../../../../home/andres/src/postgresql/src/backend/access/heap/vacuumlazy.c:1502
#3 0x825da71 in lazy_scan_heap ../../../../../home/andres/src/postgresql/src/backend/access/heap/vacuumlazy.c:977
#4 0x825ad8f in heap_vacuum_rel ../../../../../home/andres/src/postgresql/src/backend/access/heap/vacuumlazy.c:499
#5 0x8697e97 in table_relation_vacuum ../../../../../home/andres/src/postgresql/src/include/access/tableam.h:1725
#6 0x869fca6 in vacuum_rel ../../../../../home/andres/src/postgresql/src/backend/commands/vacuum.c:2206
#7 0x869a0fd in vacuum ../../../../../home/andres/src/postgresql/src/backend/commands/vacuum.c:622
#8 0x869986b in ExecVacuum ../../../../../home/andres/src/postgresql/src/backend/commands/vacuum.c:449
#9 0x8e5f832 in standard_ProcessUtility ../../../../../home/andres/src/postgresql/src/backend/tcop/utility.c:859
#10 0x8e5e5f6 in ProcessUtility ../../../../../home/andres/src/postgresql/src/backend/tcop/utility.c:523
#11 0x8e5b71a in PortalRunUtility ../../../../../home/andres/src/postgresql/src/backend/tcop/pquery.c:1158
#12 0x8e5be80 in PortalRunMulti ../../../../../home/andres/src/postgresql/src/backend/tcop/pquery.c:1315
#13 0x8e59f9b in PortalRun ../../../../../home/andres/src/postgresql/src/backend/tcop/pquery.c:791
#14 0x8e4d5f3 in exec_simple_query ../../../../../home/andres/src/postgresql/src/backend/tcop/postgres.c:1274
#15 0x8e55159 in PostgresMain ../../../../../home/andres/src/postgresql/src/backend/tcop/postgres.c:4680
#16 0x8e54445 in PostgresSingleUserMain ../../../../../home/andres/src/postgresql/src/backend/tcop/postgres.c:4136
#17 0x88bb55e in main ../../../../../home/andres/src/postgresql/src/backend/main/main.c:194
#18 0xf76f47c4 (/lib/i386-linux-gnu/libc.so.6+0x237c4) (BuildId: fe79efe6681a919714a4e119da2baac3a4953fbf)
#19 0xf76f4887 in __libc_start_main (/lib/i386-linux-gnu/libc.so.6+0x23887) (BuildId: fe79efe6681a919714a4e119da2baac3a4953fbf)
#20 0x80d40f7 in _start (/srv/dev/build/postgres/m-dev-assert-32/tmp_install/srv/dev/install/postgres/m-dev-assert-32/bin/postgres+0x80d40f7)

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../../../home/andres/src/postgresql/src/backend/access/common/tidstore.c:341:24 in
Aborted (core dumped)
child process exited with exit code 134
initdb: data directory "/srv/dev/build/postgres/m-dev-assert-32/tmp_install/initdb-template" not removed at user's request

At first I was confused why CI didn't find this. Turns out that, for me, this
is only triggered without compiler optimizations, and I had used -O0 while CI
uses some optimizations.

Backtrace:
#9 0x0814097f in TidStoreSetBlockOffsets (ts=0xb8dfde4, blkno=15, offsets=0xffb6275c, num_offsets=11)
at ../../../../../home/andres/src/postgresql/src/backend/access/common/tidstore.c:341
#10 0x0826560b in dead_items_add (vacrel=0xb8df6d4, blkno=15, offsets=0xffb6275c, num_offsets=11)
at ../../../../../home/andres/src/postgresql/src/backend/access/heap/vacuumlazy.c:2889
#11 0x0825f8db in lazy_scan_prune (vacrel=0xb8df6d4, buf=24, blkno=15, page=0xeeb6c000 "", vmbuffer=729, all_visible_according_to_vm=false,
has_lpdead_items=0xffb62a1f) at ../../../../../home/andres/src/postgresql/src/backend/access/heap/vacuumlazy.c:1502
#12 0x0825da72 in lazy_scan_heap (vacrel=0xb8df6d4) at ../../../../../home/andres/src/postgresql/src/backend/access/heap/vacuumlazy.c:977
#13 0x0825ad90 in heap_vacuum_rel (rel=0xb872810, params=0xffb62e90, bstrategy=0xb99d5e0)
at ../../../../../home/andres/src/postgresql/src/backend/access/heap/vacuumlazy.c:499
#14 0x08697e98 in table_relation_vacuum (rel=0xb872810, params=0xffb62e90, bstrategy=0xb99d5e0)
at ../../../../../home/andres/src/postgresql/src/include/access/tableam.h:1725
#15 0x0869fca7 in vacuum_rel (relid=1249, relation=0x0, params=0xffb62e90, bstrategy=0xb99d5e0)
at ../../../../../home/andres/src/postgresql/src/backend/commands/vacuum.c:2206
#16 0x0869a0fe in vacuum (relations=0xb99de08, params=0xffb62e90, bstrategy=0xb99d5e0, vac_context=0xb99d550, isTopLevel=true)

(gdb) p/x page
$1 = 0xffb6258e

I think compiler optimizations are only tangentially involved here, they
trigger the stack frame layout to change, e.g. because some variable will just
exist in a register.

Looking at the code, the failure isn't suprising anymore:
char data[MaxBlocktableEntrySize];
BlocktableEntry *page = (BlocktableEntry *) data;

'char' doesn't enforce any alignment, but you're storing a BlocktableEntry in
a char[]. You can't just do that. Look at how we do that for
e.g. PGAlignedblock.

With the attached minimal fix, the tests pass again.

Greetings,

Andres Freund

Attachment Content-Type Size
tidstore-quickfix-alignment.diff text/x-diff 658 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey M. Borodin 2024-04-07 19:13:12 Re: MultiXact\SLRU buffers configuration
Previous Message Alexander Lakhin 2024-04-07 19:00:00 Re: Add SPLIT PARTITION/MERGE PARTITIONS commands