Re: Major pgbench synthetic SELECT workload regression, Ubuntu 23.04+PG15

From: Andres Freund <andres(at)anarazel(dot)de>
To: Gregory Smith <gregsmithpgsql(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Major pgbench synthetic SELECT workload regression, Ubuntu 23.04+PG15
Date: 2023-06-08 22:18:07
Message-ID: 20230608221807.p77h43zotlfvkg65@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2023-06-08 15:08:57 -0400, Gregory Smith wrote:
> Pushing SELECT statements at socket speeds with prepared statements is a
> synthetic benchmark that normally demos big pgbench numbers. My benchmark
> farm moved to Ubuntu 23.04/kernel 6.2.0-20 last month, and that test is
> badly broken on the system PG15 at larger core counts, with as much as an
> 85% drop from expectations. Since this is really just a benchmark workload
> the user impact is very narrow, probably zero really, but as the severity
> of the problem is high we should get to the bottom of what's going on.

> First round of profile data suggests the lost throughput is going here:
> Overhead Shared Object Symbol
> 74.34% [kernel] [k] osq_lock
> 2.26% [kernel] [k] mutex_spin_on_owner

Could you get a profile with call graphs? We need to know what leads to all
those osq_lock calls.

perf record --call-graph dwarf -a sleep 1

or such should do the trick, if run while the workload is running.
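
A minimal sketch of capturing and then summarizing such a profile. The function name, the default duration, and the output path are assumptions for illustration, not part of the original suggestion; system-wide recording (-a) usually needs root, and dwarf call graphs produce large data files, so keep the duration short.

```shell
# Hypothetical helper: record a system-wide profile with call graphs while
# the pgbench workload is running, then print a text report from it.
#   $1 = seconds to sample (default 1), $2 = output file (default perf.data)
profile_callgraphs() {
    duration="${1:-1}"
    out="${2:-perf.data}"

    # Same invocation as above, with an explicit output file.
    perf record --call-graph dwarf -a -o "$out" sleep "$duration" || return 1

    # --no-children attributes cost to the function itself rather than
    # accumulating callee time into callers; --stdio gives plain text
    # that can be pasted into a mail.
    perf report -i "$out" --no-children --stdio | head -n 100
}
```

Running `profile_callgraphs 1` in a second terminal during the benchmark should show the call chains that end in osq_lock.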

> Quick test to find if you're impacted: on the server and using sockets,
> run a 10 second SELECT test with/without preparation using 1 or 2
> clients/[core|thread] and see if preparation is the slower result. Here's
> a PGDG PG14 on port 5434 as a baseline, next to Ubuntu 23.04's regular
> PG15, all using the PG15 pgbench on AMD 5950X:

I think it's unwise to compare builds of such different vintage. The compiler
options and compiler version can have substantial effects.

> $ pgbench -i -s 100 pgbench -p 5434
> $ pgbench -S -T 10 -c 32 -j 32 -M prepared -p 5434 pgbench
> pgbench (14.8 (Ubuntu 14.8-1.pgdg23.04+1))
> tps = 1058195.197298 (without initial connection time)

I recommend also using -P1. Particularly when using unix sockets, the
specifics of how client threads and server threads are scheduled play a huge
role. How large a role can change significantly between runs and between
fairly minor changes to how things are executed (e.g. between major PG
versions).

E.g. on my workstation (two sockets, 10 cores/20 threads each), with 32
clients, performance swings back and forth between ~600k and ~850k tps,
whereas with 42 clients it holds steady at 1.1M tps, with little variance.
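
One way to make that variance visible is to summarize the per-second progress lines that -P1 prints. The sketch below assumes the usual "progress: 1.0 s, N tps, lat ..." line format, which pgbench writes to stderr; the helper name tps_spread is hypothetical.

```shell
# Hypothetical helper: read pgbench output on stdin and report the spread
# of the per-second tps figures from its "progress:" lines.
tps_spread() {
    awk '/^progress:/ {
        tps = $4 + 0                      # 4th field is the tps value
        if (n == 0 || tps < min) min = tps
        if (n == 0 || tps > max) max = tps
        sum += tps; n++
    }
    END {
        if (n == 0) { print "no progress lines seen"; exit 1 }
        printf "samples=%d min=%.0f max=%.0f mean=%.0f\n", n, min, max, sum / n
    }'
}
```

Usage, reusing the options from the quoted run: `pgbench -S -T 10 -c 32 -j 32 -M prepared -P 1 -p 5434 pgbench 2>&1 | tps_spread`. A wide gap between min and max suggests that a single end-of-run tps number is hiding scheduling effects.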

I also have seen very odd behaviour on larger machines when
/proc/sys/kernel/sched_autogroup_enabled is set to 1.
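
A small sketch for checking that setting before a run. The function name is hypothetical, and the optional path argument exists only so the check can be pointed at another file.

```shell
# Hypothetical helper: report whether scheduler autogrouping is on.
#   $1 = file to read (default: the live procfs entry)
autogroup_status() {
    f="${1:-/proc/sys/kernel/sched_autogroup_enabled}"
    if [ -r "$f" ]; then
        case "$(cat "$f")" in
            1) echo "autogroup enabled" ;;
            0) echo "autogroup disabled" ;;
            *) echo "autogroup: unexpected value" ;;
        esac
    else
        # Kernels built without CONFIG_SCHED_AUTOGROUP lack this file.
        echo "autogroup setting not available"
    fi
}

# To compare results with it switched off (needs root):
#   sysctl -w kernel.sched_autogroup_enabled=0
```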

> There's been plenty of recent chatter on LKML about *osq_lock*, in January
> Intel reported a 20% benchmark regression on UnixBench that might be
> related. Work is still ongoing this week:

I've seen such issues in the past, primarily due to contention internal to
cgroups, when the memory controller is enabled. IIRC that could be alleviated
to a substantial degree with cgroup.memory=nokmem.
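
To see whether a machine was already booted with that option, the kernel command line can be inspected. This is a sketch only; the function name is hypothetical, and actually adding the option means editing the bootloader configuration and rebooting, which is not shown here.

```shell
# Hypothetical helper: print any cgroup.memory= option found on the kernel
# command line, or say that none is set.
#   $1 = file to inspect (default: the running kernel's /proc/cmdline)
cgroup_memory_opts() {
    cmdline="${1:-/proc/cmdline}"
    if opts=$(grep -o 'cgroup\.memory=[^ ]*' "$cmdline"); then
        echo "$opts"
    else
        echo "cgroup.memory not set on kernel command line"
    fi
}
```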

Greetings,

Andres Freund
