Re: More speedups for tuple deformation

From: Andres Freund <andres(at)anarazel(dot)de>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: John Naylor <johncnaylorls(at)gmail(dot)com>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: More speedups for tuple deformation
Date: 2026-02-25 20:29:01
Message-ID: uhqul2ryci4tyg5ylddjrmf4kybzwb7m5z7rmurhhjp37vrn5f@zgxil7egr62n
Lists: pgsql-hackers

Hi,

On 2026-02-25 13:05:14 -0500, Andres Freund wrote:
> At least gcc is doing some truly weird shit in the
> firstNonGuaranteed/firstNonCachedOffsetAttr loop "header" (i.e. just before
> the first entrance to the loop), which leads to the register pressure being
> high, which leads to spilling on the stack, making the few-tuples case slower:
>
> [ lots of stuff trimmed ]
>
> I.e. the compiler creates an offset version of tts_values[tts_nvalid],
> tts_isnull[tts_nvalid], which then creates register allocation pressure,
> because later the original tts_values/tts_isnull etc are accessed again and
> thus the underlying registers are preserved. And this is all for zero gain,
> from what I can tell, because the accesses are still done with indexed
> addressing (like mov %rdi,(%r12,%rcx,8)), which would work just as
> well if rcx were indexed based on attnum, not zero indexed within the loop.
>
> I see about a 10% improvement if I dissuade the compiler from doing that by
> adding
>     __asm__ volatile ("" : "+r"(attnum) : :);
> in the loop body.
>
>
> I'm getting to the point where I'd like to just hand write the assembler for
> this stupid function. Gah.

Huh. It, at least partially, seems to be related to using an integer for
attnum et al. Due to us using -fwrapv, the compiler can't actually assume that
an attnum++ won't overflow. An overflow would make the loop trip counts a lot
more complicated. Even with that I don't understand how it ends up
generating such crappy code, but since using size_t fixes it...
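
A minimal sketch of the two counter types, not the actual deforming code. The
function names, the src array and the assert are made up for illustration.
Both functions do the same work; only the type of attnum/natts differs. With
-fwrapv a signed int counter has defined wraparound at 32 bits, while a size_t
counter already has the width used for address arithmetic on common 64-bit
targets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical loop with an int counter, as in the slow variant. */
static void
fill_int_counter(uintptr_t *values, bool *isnull, const uintptr_t *src,
                 int attnum, int natts)
{
    assert(attnum >= 0 && attnum <= natts);

    for (; attnum < natts; attnum++)
    {
        values[attnum] = src[attnum];
        isnull[attnum] = false;
    }
}

/* Same loop, but the counter and the bound are size_t. */
static void
fill_size_t_counter(uintptr_t *values, bool *isnull, const uintptr_t *src,
                    size_t attnum, size_t natts)
{
    assert(attnum <= natts);

    for (; attnum < natts; attnum++)
    {
        values[attnum] = src[attnum];
        isnull[attnum] = false;
    }
}
```

Compiling both with something like "gcc -O2 -fwrapv -S" and comparing the
code ahead of the loop is one way to look for the difference. A loop this
small may well not reproduce the problem.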

Greetings,

Andres Freund
