Re: Remove Instruction Synchronization Barrier in spin_delay() for ARM64 architecture

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Salvatore Dipietro <dipietro(dot)salvatore(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Salvatore Dipietro <dipiets(at)amazon(dot)com>, blakgeof(at)amazon(dot)com
Subject: Re: Remove Instruction Synchronization Barrier in spin_delay() for ARM64 architecture
Date: 2025-08-15 20:25:20
Message-ID: aJ-XsB542yajZzij@nathan
Lists: pgsql-hackers

On Fri, Aug 15, 2025 at 04:13:30PM -0400, Andres Freund wrote:
> IMO, the only way to actually make pg_stat_statements scale is to move to a
> model much more like our regular stats. I.e. accumulate counters in backend
> local memory and only occasionally update the shared stats.

Agreed. I remember discussing something similar at pgconf.dev this year.

> FWIW, I'd not be surprised if moving to atomics would often cause *slowdowns*
> compared to using the spinlocks. You'd replace one atomic operation with
> dozens, to update all those fields individually. With loads of cacheline
> pingpong in between.

Right. At some point I tried moving most things to atomics and leaving the
rest behind the spinlock, and IIRC there wasn't really any improvement. I
didn't dig into whether that was because of atomic-related cache line
ping-pong or the existing spinlock, but regardless, I quickly discarded
that idea.
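
To make the operation count concrete, here is a rough sketch comparing the two update styles. It is illustrative only, not the patch I tried: the entry layouts and NFIELDS are hypothetical (the real struct has many more counters), and each function returns the number of atomic read-modify-write operations it issued so the difference is visible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NFIELDS 4               /* kept small for the sketch */

/* Hypothetical entry protected by a single spinlock-like flag. */
typedef struct LockedEntry
{
    atomic_flag lock;
    uint64_t    fields[NFIELDS];
} LockedEntry;

/* Hypothetical entry where every counter is its own atomic. */
typedef struct AtomicEntry
{
    _Atomic uint64_t fields[NFIELDS];
} AtomicEntry;

/* One atomic RMW to take the lock (when uncontended), then plain adds. */
static int
update_locked(LockedEntry *e, const uint64_t delta[NFIELDS])
{
    int         rmw = 0;

    do
    {
        rmw++;                  /* each test-and-set attempt is one RMW */
    } while (atomic_flag_test_and_set_explicit(&e->lock,
                                               memory_order_acquire));

    for (int i = 0; i < NFIELDS; i++)
        e->fields[i] += delta[i];

    /* Releasing the lock is a store, not an RMW. */
    atomic_flag_clear_explicit(&e->lock, memory_order_release);
    return rmw;
}

/* One atomic RMW per field, and no lock at all. */
static int
update_atomic(AtomicEntry *e, const uint64_t delta[NFIELDS])
{
    int         rmw = 0;

    for (int i = 0; i < NFIELDS; i++)
    {
        atomic_fetch_add_explicit(&e->fields[i], delta[i],
                                  memory_order_relaxed);
        rmw++;
    }
    return rmw;
}
```

Uncontended, the locked version issues one RMW and the atomic version issues NFIELDS of them. Under contention the locked version retries, so which one wins depends on the workload; the sketch only shows why the atomic version is not automatically cheaper.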

--
nathan
