pgsql: Fix incremental JSON parser numeric token reassembly across chun

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Fix incremental JSON parser numeric token reassembly across chun
Date: 2026-04-10 14:21:48
Message-ID: E1wBCk0-000JOb-0o@gemulon.postgresql.org
Lists: pgsql-committers

Fix incremental JSON parser numeric token reassembly across chunks.

When the incremental JSON parser splits a numeric token across chunk
boundaries, it accumulates continuation characters into the partial
token buffer. The accumulator's switch statement unconditionally
accepted '+', '-', '.', 'e', and 'E' as valid numeric continuations
regardless of position, which violates the JSON number grammar
(-? int [frac] [exp]). For example, the input "4-" fed in single-byte
chunks would have its '-' accumulated into the numeric token,
producing an invalid token that later triggered an assertion failure
during re-lexing.

Fix by tracking parser state (seen_dot, seen_exp, previous character)
across the existing partial token and the incoming bytes, so that each
character class is accepted only in its grammatically valid position.
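
The position tracking described above can be illustrated with a minimal
sketch. This is not the code from the commit: the names NumScanState,
num_accept, and num_continuation_len are hypothetical, and the sketch
assumes that rules such as "no leading zeros" and "a digit must follow the
exponent sign" are left to the later re-lexing step. It only shows how
seen_dot, seen_exp, and the previous character can restrict each character
class to its valid position.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state carried across the partial token and new bytes. */
typedef struct NumScanState
{
	bool		seen_dot;	/* a '.' has already been consumed */
	bool		seen_exp;	/* an 'e' or 'E' has already been consumed */
	char		prev;		/* last consumed character, '\0' if none */
} NumScanState;

/* Return true and update the state if c may extend the numeric token. */
static bool
num_accept(NumScanState *st, char c)
{
	bool		prev_digit;
	bool		ok;

	assert(st != NULL);
	prev_digit = isdigit((unsigned char) st->prev) != 0;

	if (isdigit((unsigned char) c))
		ok = true;
	else if (c == '-' && st->prev == '\0')
		ok = true;				/* leading minus sign */
	else if (c == '.')
		ok = prev_digit && !st->seen_dot && !st->seen_exp;
	else if (c == 'e' || c == 'E')
		ok = prev_digit && !st->seen_exp;
	else if (c == '+' || c == '-')
		ok = (st->prev == 'e' || st->prev == 'E');	/* exponent sign only */
	else
		ok = false;

	if (!ok)
		return false;

	if (c == '.')
		st->seen_dot = true;
	if (c == 'e' || c == 'E')
		st->seen_exp = true;
	st->prev = c;
	return true;
}

/*
 * Replay the already-buffered partial token to rebuild the state, then
 * return how many leading bytes of the new chunk may be appended to it.
 * Returns 0 if the partial token itself does not fit the grammar.
 */
static size_t
num_continuation_len(const char *partial, size_t partial_len,
					 const char *chunk, size_t chunk_len)
{
	NumScanState st = {false, false, '\0'};
	size_t		i;

	for (i = 0; i < partial_len; i++)
	{
		if (!num_accept(&st, partial[i]))
			return 0;
	}

	for (i = 0; i < chunk_len; i++)
	{
		if (!num_accept(&st, chunk[i]))
			break;
	}
	return i;
}
```

With the partial token "4" and an incoming chunk "-", this sketch accepts
zero bytes, so the '-' is not folded into the number. With the partial
token "1e" and the chunk "-3]", it accepts the two bytes "-3" and stops at
the ']'.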

Backpatch-through: 17

Branch
------
REL_17_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/2e373785ec07102badee139236ac78c4da4f7c16

Modified Files
--------------
src/common/jsonapi.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 53 insertions(+), 2 deletions(-)
