|From:||Andres Freund <andres(at)anarazel(dot)de>|
|To:||pgsql-hackers(at)postgresql(dot)org, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, Andrew Dunstan <andrew(at)dunslane(dot)net>, David Rowley <dgrowleyml(at)gmail(dot)com>|
|Subject:||PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
Mark Callaghan reported a regression in 15, in the post linked here (with
comments in the twitter thread, hence linked here)
A differential flame graph shows increased time spent doing memory
allocations, below ExecInitExpr():
Quite the useful comparison.
That lead to me to look at the size of ExprEvalStep - and indeed, it has
*exploded* in 15. 10-13: 64 bytes, 14: 320 bytes.
The comment above the union for data fields says:
* Inline data for the operation. Inline data is faster to access, but
* also bloats the size of all instructions. The union should be kept to
* no more than 40 bytes on 64-bit systems (so that the entire struct is
* no more than 64 bytes, a single cacheline on common systems).
However, jsonexpr/EEOP_JSONEXPR is 296 bytes, and
hashedscalararrayop/EEOP_HASHED_SCALARARRAYOP is 64 bytes, even though the
limit is 40 bytes.
The EEOP_JSONEXPR stuff was added during 15 development in:
Author: Andrew Dunstan <andrew(at)dunslane(dot)net>
Date: 2022-03-03 13:11:14 -0500
SQL/JSON query functions
the EEOP_HASHED_SCALARARRAYOP stuff was added during 14 development in:
Author: David Rowley <drowley(at)postgresql(dot)org>
Date: 2021-04-08 23:51:22 +1200
Speedup ScalarArrayOpExpr evaluation
Unfortunately ExprEvalStep is public, so I don't think we can fix the 14
regression. If somebody installed updated server packages while the server is
running, we could end up loading extensions referencing ExprEvalStep (at least
plpgsql and LLVMJIT).
It's not great to have an ABI break at this point of the 15 cycle, but I don't
think we have a choice. Exploding the size of ExprEvalStep by ~4x is bad -
both for memory overhead (expressions often have dozens to hundreds of steps)
and expression evaluation performance.
The Hashed SAO case can perhaps be squeezed sufficiently to fit inline, but
clearly that's not going to happen for the json case. So we should just move
that out of line.
Maybe it's worth sticking a StaticAssert() for the struct size somewhere. I'm
a bit wary about that being too noisy, there are some machines with odd
alignment requirements. Perhaps worth restricting the assertion to x86-64 +
armv8 or such?
It very well might be that this isn't the full explanation of the regression
Mark observed. E.g. the difference in DecodeDateTime() looks more likely to be
caused by 591e088dd5b - but we need to fix the above issue, whether it's the
cause of the regression Mark observed or not.
|Next Message||Tom Lane||2022-06-16 23:37:14||Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size|
|Previous Message||Bruce Momjian||2022-06-16 23:24:43||Re: better page-level checksums|