| From: | ChenhuiMo <chenhuimo(dot)mch(at)qq(dot)com> |
|---|---|
| To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | [PATCH] Make NumericVar storage semantics explicit |
| Date: | 2026-05-10 14:12:15 |
| Message-ID: | tencent_516F5299BE499E4B299CB7BF4C37D6A22B09@qq.com |
| Lists: | pgsql-hackers |
Hi all,
This patch aims to make the memory and ownership semantics of NumericVar
explicit in PostgreSQL, and to reorganize the related helper functions
accordingly, so that the rules for digit-buffer acquisition, release,
borrowing, and writing become clearer, more consistent, and easier
to maintain.
This patch does not change the mathematical algorithms used by numeric
itself. It focuses on the memory model, state transitions, and ownership
semantics of NumericVar and the related helper paths, and adjusts those
paths as needed so that they match the new model.
This patch is also not a performance patch. It does not introduce
aggressive performance optimizations, nor does it attempt to rewrite
all higher-level numeric routines in this round. Instead, its focus is to:
- make previously scattered implicit contracts explicit;
- make the storage / borrow / own semantics of NumericVar easier to
understand and maintain;
- actually connect this new model to the core comparison, hashing,
and basic arithmetic paths;
- verify through regression tests that SQL-visible behavior remains unchanged.
Although this patch is not aimed at improving performance, a clearer
and more explicit memory model can provide a more solid foundation for
future internal numeric optimizations. The related tests also indicate
that this refactoring does not introduce any visible performance regression.
Background and motivation
-------------------------
NumericVar is the in-memory working representation of PostgreSQL's numeric
type. It is not the packed Numeric datum stored in the database itself, but
the working representation used by internal paths such as comparison,
arithmetic, and aggregation.
In the current implementation, there are already several different memory
semantics around NumericVar, for example:
- a variable may be in an "empty" state, with no active digit storage;
- a variable may borrow digits from an external Numeric datum as a read-only view;
- a variable may own a writable digit buffer for internal arithmetic operations;
- in some paths, digit storage comes from a fixed-size inline buffer;
- in some paths, digit storage comes from heap allocation.
These conventions already exist in the code, but they remain largely
implicit rather than being expressed as a single, explicit internal
contract. This makes ownership, mutability, and buffer-transition rules
harder to reason about, especially when switching between packed Numeric
datums and the working representation.
Accordingly, the goal of this patch is not to introduce new numeric
algorithms, but to first make these already-existing but insufficiently
documented memory semantics explicit and consistent.
The new memory model
--------------------
This patch introduces a more explicit NumericVar storage-state model to
describe the meaning of the current digit storage of a NumericVar.
In addition to the existing numeric metadata, NumericVar now has a few
state-related fields, including:
- capacity: the physical capacity of the currently owned digit buffer;
- state: the current storage state of the NumericVar;
- borrow_kind: in the borrowed state, the kind of storage the digits are
  borrowed from;
- inline_buf: whether the variable is associated with a fixed-size inline
  buffer.
This patch also introduces BufferedNumericVar, which wraps a NumericVar
together with a fixed-size inline buffer. The buffer serves either as:
- inline writable digit storage; or
- when needed, workspace to hold a detoasted numeric datum.
It is important to emphasize that BufferedNumericVar is not a new SQL-visible
representation. It is only an internal helper structure used by numeric to
support the new memory model.
The new storage states are:
NUMVAR_EMPTY
NUMVAR_BORROWED
NUMVAR_OWNED_INLINE
NUMVAR_OWNED_HEAP
For borrowed state, this patch further distinguishes:
NUMVAR_BORROW_EXTERNAL
NUMVAR_BORROW_INLINE
This makes it possible to distinguish more clearly between:
- borrowing an external Numeric datum or a static constant representation;
- borrowing a detoasted numeric representation currently residing in a
  BufferedNumericVar inline buffer.
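As a rough illustration only (this is not the code from the patch), the storage states and borrow kinds listed above could be laid out as follows. The first six NumericVar fields match the existing struct in numeric.c; the types of the four new fields, the inline buffer size, the helper names, and the invariants checked are all assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef int16_t NumericDigit;       /* base-10000 digit, as in numeric.c */

typedef enum NumericVarState
{
    NUMVAR_EMPTY,                   /* no active digit storage */
    NUMVAR_BORROWED,                /* read-only view of someone else's digits */
    NUMVAR_OWNED_INLINE,            /* writable, digits live in the inline buffer */
    NUMVAR_OWNED_HEAP               /* writable, digits live in a heap allocation */
} NumericVarState;

typedef enum NumericVarBorrowKind
{
    NUMVAR_BORROW_EXTERNAL,         /* external Numeric datum or static constant */
    NUMVAR_BORROW_INLINE            /* detoasted datum held in the inline buffer */
} NumericVarBorrowKind;

#define SKETCH_INLINE_DIGITS 16     /* hypothetical size, not taken from the patch */

typedef struct NumericVar
{
    int           ndigits;          /* existing fields */
    int           weight;
    int           sign;
    int           dscale;
    NumericDigit *buf;
    NumericDigit *digits;
    int           capacity;         /* assumed type: digits available in owned buffer */
    NumericVarState state;
    NumericVarBorrowKind borrow_kind;   /* meaningful only when state is BORROWED */
    NumericDigit *inline_buf;       /* assumed: NULL when there is no inline buffer */
} NumericVar;

typedef struct BufferedNumericVar
{
    NumericVar   var;
    NumericDigit inline_digits[SKETCH_INLINE_DIGITS];
} BufferedNumericVar;

/* Hypothetical helper: start in the EMPTY state, attached to the inline buffer. */
static inline void
buffered_var_init(BufferedNumericVar *b)
{
    memset(&b->var, 0, sizeof(b->var));
    b->var.state = NUMVAR_EMPTY;
    b->var.inline_buf = b->inline_digits;
}

/* Only owned storage may be written to; borrowed digits are read-only. */
static inline bool
numvar_is_writable(const NumericVar *var)
{
    return var->state == NUMVAR_OWNED_INLINE || var->state == NUMVAR_OWNED_HEAP;
}

/* Hypothetical accessor that enforces the write rule in assert-enabled builds. */
static inline NumericDigit *
numvar_writable_digits(NumericVar *var)
{
    assert(numvar_is_writable(var));
    return var->digits;
}

/* Assumed per-state invariants; the patch's real invariants may differ. */
static inline bool
numvar_invariants_hold(const NumericVar *var)
{
    switch (var->state)
    {
        case NUMVAR_EMPTY:
            return var->digits == NULL && var->ndigits == 0;
        case NUMVAR_BORROWED:
            return var->capacity == 0;
        case NUMVAR_OWNED_INLINE:
            return var->inline_buf != NULL && var->buf == var->inline_buf &&
                   var->capacity >= var->ndigits;
        case NUMVAR_OWNED_HEAP:
            return var->buf != NULL && var->buf != var->inline_buf &&
                   var->capacity >= var->ndigits;
    }
    return false;
}
```

Keeping the state in one enum means a single switch can check every invariant, which is the kind of explicit contract the description above is aiming for.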
Core helpers and interface semantics
------------------------------------
Under the new model, the related helpers are reorganized into clearer layers:
- raw storage helpers:
only responsible for low-level raw digit-storage allocation;
- storage-state helpers:
responsible for transitions among borrowed / owned / inline / heap states;
- variable-level helpers:
responsible for NumericVar initialization, copying,
buffer acquisition, and release;
- result preparation / finalization helpers:
responsible for alias-safe result preparation and commit
during arithmetic operations.
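To make the storage-state layer more concrete, here is a minimal standalone sketch of the transitions among the empty, borrowed, owned-inline, and owned-heap states. It is not taken from the patch: SketchVar is a trimmed stand-in for NumericVar, all sketch_* function names are hypothetical, and malloc/free stand in for the backend's memory-context allocation so that the example compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef int16_t NumericDigit;

typedef enum { NUMVAR_EMPTY, NUMVAR_BORROWED,
               NUMVAR_OWNED_INLINE, NUMVAR_OWNED_HEAP } NumericVarState;
typedef enum { NUMVAR_BORROW_EXTERNAL, NUMVAR_BORROW_INLINE } NumericVarBorrowKind;

#define SKETCH_INLINE_DIGITS 8      /* hypothetical inline buffer size */

/* Trimmed stand-in for NumericVar: only the fields this sketch needs. */
typedef struct SketchVar
{
    int           ndigits;
    NumericDigit *digits;
    int           capacity;
    NumericVarState state;
    NumericVarBorrowKind borrow_kind;
    NumericDigit *inline_buf;       /* NULL if the variable has no inline buffer */
} SketchVar;

static inline void
sketch_init(SketchVar *var, NumericDigit *inline_buf)
{
    memset(var, 0, sizeof(*var));
    var->state = NUMVAR_EMPTY;
    var->inline_buf = inline_buf;
}

/* Release: only heap-owned storage is freed; every state ends up EMPTY. */
static inline void
sketch_release(SketchVar *var)
{
    if (var->state == NUMVAR_OWNED_HEAP)
        free(var->digits);
    var->digits = NULL;
    var->ndigits = 0;
    var->capacity = 0;
    var->state = NUMVAR_EMPTY;
}

/* Borrow: become a read-only view of digits owned by someone else. */
static inline void
sketch_borrow_external(SketchVar *var, const NumericDigit *src, int ndigits)
{
    sketch_release(var);
    var->digits = (NumericDigit *) src;     /* read-only by contract */
    var->ndigits = ndigits;
    var->state = NUMVAR_BORROWED;
    var->borrow_kind = NUMVAR_BORROW_EXTERNAL;
}

/* Acquire: ensure owned, writable storage for at least 'need' digits. */
static inline void
sketch_make_writable(SketchVar *var, int need)
{
    bool owned = (var->state == NUMVAR_OWNED_INLINE ||
                  var->state == NUMVAR_OWNED_HEAP);
    /* The inline buffer cannot be reused while the current digits live in it. */
    bool inline_busy = (var->state == NUMVAR_OWNED_INLINE ||
                        (var->state == NUMVAR_BORROWED &&
                         var->borrow_kind == NUMVAR_BORROW_INLINE));
    NumericDigit *dst;
    NumericVarState newstate;
    int newcap;

    if (need < var->ndigits)
        need = var->ndigits;
    if (owned && var->capacity >= need)
        return;                             /* already writable and big enough */

    if (var->inline_buf != NULL && !inline_busy && need <= SKETCH_INLINE_DIGITS)
    {
        dst = var->inline_buf;
        newstate = NUMVAR_OWNED_INLINE;
        newcap = SKETCH_INLINE_DIGITS;
    }
    else
    {
        dst = malloc((need > 0 ? need : 1) * sizeof(NumericDigit));
        if (dst == NULL)
            abort();
        newstate = NUMVAR_OWNED_HEAP;
        newcap = need;
    }

    /* Carry the current value over, then drop any old heap buffer. */
    if (var->ndigits > 0)
        memcpy(dst, var->digits, var->ndigits * sizeof(NumericDigit));
    if (var->state == NUMVAR_OWNED_HEAP)
        free(var->digits);

    var->digits = dst;
    var->capacity = newcap;
    var->state = newstate;
}
```

In this sketch a borrowed variable never writes through its view: the first write request copies the digits into owned storage, so the source datum stays untouched.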
The purpose of this split is to make the already-existing but previously
scattered rules clearer, not to introduce new SQL-visible behavior.
Core paths already using the new model
--------------------------------------
This patch does not merely introduce unused infrastructure.
The new model is already used in core numeric execution paths.
The SQL-visible core entry points already using the new model include:
cmp_numerics
numeric_hash
numeric_add
numeric_sub
numeric_mul
numeric_div
This means that:
- comparison paths already use the explicit state / borrowed / owned semantics;
- hashing paths already interact with the new initialization / borrow semantics;
- the basic arithmetic paths already interact with the new state model
  through the new result-preparation / result-finalization helpers.
The lower-level variable arithmetic bodies such as
add_var/sub_var/mul_var/div_var are not completely
rewritten into a fully new style in this patch, but their
result preparation, alias-safe behavior, and result finalization
are already integrated with the explicit state model through the
new helper framework.
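The alias-safe prepare / finalize pattern mentioned above can be sketched roughly as follows. This is a simplified standalone illustration rather than the patch's helpers: SketchVar, sketch_prepare_result, sketch_commit_result, and sketch_add_abs are hypothetical names, the operands are assumed to be non-negative with equal length and weight, and leading zeros are not stripped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef int16_t NumericDigit;
#define NBASE 10000                 /* digit base used by numeric.c */

/* Trimmed stand-in for NumericVar. */
typedef struct SketchVar
{
    int           ndigits;
    NumericDigit *digits;
    bool          owned;            /* true if digits is a heap buffer we must free */
} SketchVar;

/* Prepare: compute into fresh scratch storage, never into the result itself. */
static inline NumericDigit *
sketch_prepare_result(int ndigits)
{
    NumericDigit *scratch = malloc((ndigits > 0 ? ndigits : 1) * sizeof(NumericDigit));

    if (scratch == NULL)
        abort();
    return scratch;
}

/* Finalize: only now release the result's old storage and adopt the scratch. */
static inline void
sketch_commit_result(SketchVar *result, NumericDigit *scratch, int ndigits)
{
    if (result->owned)
        free(result->digits);
    result->digits = scratch;
    result->ndigits = ndigits;
    result->owned = true;
}

/*
 * result = a + b for aligned, equal-length, non-negative operands.
 * Safe even when result points at the same variable as a and/or b, because
 * the operands are read in full before the commit step touches result.
 */
static inline void
sketch_add_abs(const SketchVar *a, const SketchVar *b, SketchVar *result)
{
    int           n = a->ndigits;
    NumericDigit *out;
    int           carry = 0;

    assert(a->ndigits == b->ndigits);
    out = sketch_prepare_result(n + 1);

    for (int i = n - 1; i >= 0; i--)
    {
        int sum = a->digits[i] + b->digits[i] + carry;

        carry = (sum >= NBASE) ? 1 : 0;
        out[i + 1] = (NumericDigit) (sum - carry * NBASE);
    }
    out[0] = (NumericDigit) carry;  /* extra leading digit holds the final carry */

    sketch_commit_result(result, out, n + 1);
}
```

Calling sketch_add_abs(&a, &a, &a) doubles a in place without reading freed or half-written digits, which is the property the prepare / finalize split is meant to guarantee.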
More complex higher-level numeric routines such as sqrt, ln, log,
and power still largely keep their original algorithmic organization
in this patch, and are only adjusted as needed to remain compatible
with the new helper semantics. This is an intentional choice: this
patch focuses first on making the underlying NumericVar contract
explicit and on integrating the new model into core paths, rather
than trying to rewrite all higher-level numeric routines at once.
Testing
-------
To verify that this refactoring does not change SQL-visible behavior,
this patch adds and reorganizes a fairly broad set of regression tests.
The tests cover multiple precision / scale combinations, including:
NUMERIC(9, 2)
NUMERIC(18, 2)
NUMERIC(38, 2)
NUMERIC(80, 16)
NUMERIC(175, 35)
NUMERIC(400, 80)
NUMERIC(1000, 200)
NUMERIC(1000, 800)
NUMERIC(1000, 995)
NUMERIC(1000, 1000)
The test coverage includes:
basic arithmetic operations;
composite expressions;
comparison-sensitive paths;
window functions;
join-driven expressions;
aggregate expressions;
post-update verification;
error cases;
index / index-only related paths.
For excessively long outputs, digest-based verification is used so
that the expected files are not dominated by very large numeric literals.
In addition to the new and adjusted numeric-related tests, the
patch also passes the full regression test suite, confirming that:
- the new state model and helper refactoring do not break existing behavior;
- higher-level numeric paths still work correctly after being adapted
  to the new helper semantics.
I would especially appreciate feedback on:
- whether this storage-state model makes the existing NumericVar
contract clearer;
- whether this helper split is sensible;
- whether the borrowed versus owned terminology is clear enough;
- whether the amount of detail in the updated comments and invariants
is appropriate.
Thanks for reading.
This series is split into two patches:
- make NumericVar storage states explicit and connect the new model to
  core numeric paths;
- add regression coverage.
I would especially appreciate feedback on the state model,
helper split, and terminology.
Regards,
Chenhui Mo
| Attachment | Content-Type | Size |
|---|---|---|
| 0001-numeric-make-NumericVar-storage-states-explicit-and-.patch | application/octet-stream | 56.2 KB |
| 0002-numeric-add-regression-coverage-for-NumericVar-state.patch | application/octet-stream | 132.3 KB |