| From: | Álvaro Herrera <alvherre(at)kurilemu(dot)de> |
|---|---|
| To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
| Cc: | Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, John Naylor <johncnaylorls(at)gmail(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: More speedups for tuple deformation |
| Date: | 2026-02-25 09:48:58 |
| Message-ID: | 202602250812.ajdqiuxz2dox@alvherre.pgsql |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hello,
With all these patches applied, I get this from pahole for
CompactAttribute,
struct CompactAttribute {
int16 attcacheoff; /* 0 2 */
int16 attlen; /* 2 2 */
_Bool attbyval; /* 4 1 */
uint8 attalignby; /* 5 1 */
_Bool attispackable:1; /* 6: 0 1 */
_Bool atthasmissing:1; /* 6: 1 1 */
_Bool attisdropped:1; /* 6: 2 1 */
_Bool attgenerated:1; /* 6: 3 1 */
/* XXX 4 bits hole, try to pack */
char attnullability; /* 7 1 */
/* size: 8, cachelines: 1, members: 9 */
/* sum members: 7 */
/* sum bitfield members: 4 bits, bit holes: 1, sum bit holes: 4 bits */
/* last cacheline: 8 bytes */
};
The 'attnullability' thing stands out. It probably doesn't make much of
a difference -- considering that it's down to 8 bytes already and I
doubt anything will change if it's instead 6 -- but given these
definitions
/* Valid values for CompactAttribute->attnullability */
#define ATTNULLABLE_UNRESTRICTED 'f' /* No constraint exists */
#define ATTNULLABLE_UNKNOWN 'u' /* constraint exists, validity unknown */
#define ATTNULLABLE_VALID 'v' /* valid constraint exists */
#define ATTNULLABLE_INVALID 'i' /* constraint exists, marked invalid */
it looks like you could reduce that to 3 bits with flags that say "does
a constraint exist", "is the constraint validity known", and "is it
valid or invalid", respectively. (Or you could use just 2 bits and
encode the four states there, but that's inconsequential).
But as I said, I'd bet this doesn't make any performance difference.
Looking at populate_isnull_array(), it says caller must ensure *isnull
is of a certain size. However, looking at its caller, it's not clear to
me how have we ensure that this happened. In MakeTupleTableSlot() we do
MAXALIGN() of natts, which I think should make a reference to
populate_isnull_array() so that we know *why* we MAXALIGN() there; and
ExecSetSlotDescriptor() the MAXALIGN() looks very mysterious compared to
the non-maxaligned size used to allocate tts_values in the line just
above. I just think, there should be a comment in those places. (And
anywhere tts_isnull is allocated; didn't find other places, but didn't
look too hard either).
Regarding deform_bench, I wonder if we shouldn't start a proper
benchmarking suite (of which this would the first participant) by
putting this in src/benchmark/tuple_deform instead (or something like
that) instead of relegating it to src/test/modules. It doesn't make
much of a technical difference, but it is a bit of a statement that more
tools are welcome.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"People get annoyed when you try to debug them." (Larry Wall)
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Lukas Fittl | 2026-02-25 10:00:35 | Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? |
| Previous Message | Chao Li | 2026-02-25 09:15:16 | Re: Add errdetail() with PID and UID about source of termination signal |