Re: More speedups for tuple deformation

From: Andres Freund <andres(at)anarazel(dot)de>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: John Naylor <johncnaylorls(at)gmail(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: More speedups for tuple deformation
Date: 2026-02-24 14:39:16
Message-ID: rbxc2qqhsvzxpukgd36caoa4ydgn5r22fxktxanrkn6nobg7j6@27b4vogohgu2
Lists: pgsql-hackers

Hi,

On 2026-02-24 15:23:17 +1300, David Rowley wrote:
> The changes in 0004 and 0005 are new. 0004 makes calling
> slot_getmissingattrs() the responsibility of the
> TupleTableSlotOps.getsomeattrs() function. Doing this allows
> getsomeattrs() to be called with the sibling call optimisation in
> slot_getsomeattrs_int() and since slot_getsomeattrs_int() is such a
> trivial function now, I ended up just modifying slot_getsomeattrs() to
> call getsomeattrs() in a way that allows the compiler to apply the
> sibling call optimisation. This seems to help reduce some overheads
> and makes the 0 extra column tests look better.

ISTM we should just merge 0004. In my testing it's a very clear win, without,
afaict, any downsides.
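
To illustrate the sibling call point, here is a minimal sketch, not the actual
patch. DemoSlot, DemoSlotOps and the demo_* functions are hypothetical
stand-ins for TupleTableSlot, TupleTableSlotOps and the real callbacks. The
assumption is that once the callback fills in missing attributes itself, the
wrapper has nothing left to do after the indirect call, so the call sits in
tail position and the compiler is free (but not required) to emit a jump
instead of a call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for TupleTableSlot / TupleTableSlotOps. */
typedef struct DemoSlot DemoSlot;

typedef struct DemoSlotOps
{
	/* In this sketch the callback also fills in missing attributes. */
	void		(*getsomeattrs) (DemoSlot *slot, int natts);
} DemoSlotOps;

struct DemoSlot
{
	const DemoSlotOps *ops;
	int			nvalid;			/* attributes already deformed */
	int			tuple_natts;	/* attributes physically in the tuple */
	int			nmissing_filled;	/* attributes taken from "missing" values */
};

/* Callback: pretend to deform, then handle missing attributes itself. */
static void
demo_getsomeattrs(DemoSlot *slot, int natts)
{
	int			start = slot->nvalid > slot->tuple_natts ?
		slot->nvalid : slot->tuple_natts;

	assert(natts > slot->nvalid);

	/* Anything past the physical tuple comes from the missing values. */
	if (natts > start)
		slot->nmissing_filled += natts - start;

	slot->nvalid = natts;
}

static const DemoSlotOps demo_ops = {demo_getsomeattrs};

/*
 * Wrapper: the indirect call is the last thing the function does, so an
 * optimising compiler may turn it into a sibling (tail) call.  If the
 * wrapper still had to fill missing attributes after the callback
 * returned, that would not be possible.
 */
void
demo_slot_getsomeattrs(DemoSlot *slot, int natts)
{
	if (slot->nvalid < natts)
		slot->ops->getsomeattrs(slot, natts);
}
```

Whether a jump is actually emitted depends on the compiler and optimisation
level; checking the generated assembly is the only way to be sure.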

> 0005 reduces the size of CompactAttribute. It shrinks the struct down
> to 8 bytes from 16 by using some bitflags for some lesser-used
> booleans and by shrinking attcacheoff down to int16. The idea is that
> we just don't cache any offsets larger than 2^15. It's likely if we
> get a tuple that big that there's a variable-length attribute anyway,
> which caching the offset of isn't possible.
>
> I'm not getting great results from benchmarking the 0005 patch. I
> verified that gcc does access the array without calculating the
> element address from scratch each time and calculates it once, then
> increments the pointer by sizeof(CompactAttribute). See the attached
> .csv for the results on the 3 machines I tested on.

FWIW, where I had seen that be rather beneficial is the TupleDescCompactAttr()
at the start of the various loops, where the compiler has little choice but to
compute the address of tupdesc->compact_attrs[firstNeededCol]. That matters
only when deforming a small number of columns, of course.
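
As a rough illustration of the size reduction being discussed, here is a
sketch under stated assumptions. WideAttr and NarrowAttr are hypothetical
layouts, only loosely modelled on CompactAttribute; the field order, the flag
names and the -1 "not cached" convention are assumptions, not what 0005
actually does. On common ABIs the first struct pads out to 16 bytes and the
second fits in 8, which lets &attrs[i] be formed with a scale factor of 8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte layout: 13 bytes of fields, padded to 4-byte alignment. */
typedef struct WideAttr
{
	int32_t		attcacheoff;
	int16_t		attlen;
	bool		attbyval;
	bool		attispackable;
	bool		atthasmissing;
	bool		attisdropped;
	bool		attgenerated;
	char		attnullability;
	uint8_t		attalignby;
} WideAttr;

/* Hypothetical flag bits replacing the lesser-used booleans. */
#define NARROW_ATTR_ISPACKABLE	0x01
#define NARROW_ATTR_HASMISSING	0x02
#define NARROW_ATTR_ISDROPPED	0x04
#define NARROW_ATTR_GENERATED	0x08

/* Hypothetical 8-byte layout: int16 offset plus one flags byte. */
typedef struct NarrowAttr
{
	int16_t		attcacheoff;	/* -1 if not cached or too large to store */
	int16_t		attlen;
	bool		attbyval;
	uint8_t		attalignby;
	uint8_t		flags;			/* NARROW_ATTR_* bits */
	char		attnullability;
} NarrowAttr;

/* These hold on typical ABIs where bool is one byte. */
_Static_assert(sizeof(WideAttr) == 16, "expected 16-byte wide layout");
_Static_assert(sizeof(NarrowAttr) == 8, "expected 8-byte narrow layout");

/* Only cache offsets that fit in an int16; otherwise report "not cached". */
static inline int16_t
narrow_cacheoff(size_t off)
{
	return off <= (size_t) INT16_MAX ? (int16_t) off : (int16_t) -1;
}

static inline bool
narrow_attr_isdropped(const NarrowAttr *att)
{
	return (att->flags & NARROW_ATTR_ISDROPPED) != 0;
}

/* Address of element i is base + i * 8 with the narrow layout. */
static inline const NarrowAttr *
narrow_attr_at(const NarrowAttr *attrs, int i)
{
	return &attrs[i];
}
```

Inside a loop the compiler can simply bump a pointer by the struct size, so
any difference would mostly show up in the first lookup, such as the one at
firstNeededCol.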

> I've also resequenced the patches to make the deform_bench test module
> part of the 0001 patch. This makes it easier to test the performance
> of master.

What are your thoughts about merging the deform_bench tooling? I wonder if we
should have src/test/modules/benchmark_tools or such, so we can add a few more
micro-benchmarky tools over time?

Greetings,

Andres Freund
