pg_stash_advice: dump file with overlong stash name crashes worker in a restart loop

From: Ayush Tiwari <ayushtiwari(dot)slg01(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: pg_stash_advice: dump file with overlong stash name crashes worker in a restart loop
Date: 2026-05-17 06:24:52
Message-ID: CAJTYsWWYhcEx1YqC=B331-Df9EpD8MxzwswWL0okz9LLCUUpBA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

While testing pg_stash_advice, I came across a
crash that's reachable through the persistence dump file:

If $PGDATA/pg_stash_advice.tsv contains a stash name of 64 bytes or more
(i.e. >= NAMEDATALEN), the worker hits an assertion in dshash_strhash()
on startup. The postmaster treats the SIGABRT as a crash, performs full
crash recovery, restarts the worker, and the worker hits the same
assertion again. The cluster never reaches a stable state until the
file is removed by hand.

SQL-level callers cannot reach this because pgsa_check_stash_name()
rejects long names. The persistence parser, however, accepts the name
straight from the file and feeds it into dshash_find_or_insert(), which
in turn calls dshash_strhash() configured with key_size = NAMEDATALEN.
That function asserts strlen(v) < size. Strictly speaking the file
parser was missing a length check that the SQL path already performs.

Minimal reproduction
====================

$ pg_ctl -D $PGDATA stop -m fast
$ python3 -c "print('stash\t' + 'a'*64)" > $PGDATA/pg_stash_advice.tsv
$ pg_ctl -D $PGDATA -l logfile start

Server log (excerpt, every ~1-2 seconds):

DEBUG: pg_stash_advice worker started
DEBUG: restoring advice stash "aaaa...aaaa"
TRAP: failed Assert("strlen((const char *) v) < size"),
File: "dshash.c", Line: 634
postgres: pg_stash_advice worker (ExceptionalCondition+0xbb)
postgres: pg_stash_advice worker (dshash_strhash+0x48)
postgres: pg_stash_advice worker (dshash_find_or_insert_extended+0x2e)
pg_stash_advice.so (pgsa_create_stash)
pg_stash_advice.so (pgsa_restore_stashes)
pg_stash_advice.so (pgsa_read_from_disk)
pg_stash_advice.so (pg_stash_advice_worker_main+0x1aa)
...
LOG: background worker "pg_stash_advice worker" was terminated by
signal 6: Aborted
LOG: terminating any other active server processes
LOG: all server processes terminated; reinitializing
[worker comes back up, asserts again, ad infinitum]

In my testing this produced ~18 crashes in ~12 seconds before I removed
the file. Threshold is exact: 63 bytes is fine, 64 bytes crashes.

In a production (non-assertion) build the name would instead be silently
truncated to NAMEDATALEN-1 bytes by the dshash key-copy path, allowing
distinct long names that share a 63-byte prefix to collide in the stash
table.

The attached patch validates the parsed stash-name length in the same
place where the parser already raises ERRCODE_DATA_CORRUPTED on other
forms of syntax error, mirroring the existing length check in
pgsa_check_stash_name(). Not sure if TAP test too is needed for this, but
I've included one in the patch.

Regards,
Ayush

Attachment Content-Type Size
0001-pg_stash_advice-reject-overlong-stash-names-when-loa.patch application/octet-stream 4.1 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Hüseyin Demir 2026-05-17 09:10:36 Re: log_checkpoints: count WAL segment creations from all processes
Previous Message Andrei Lepikhov 2026-05-17 06:03:15 Re: Sequence Access Methods, round two