Re: Potential "AIO / io workers" inter-worker locking issue in PG18?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Marco Boeringa <marco(at)boeringa(dot)demon(dot)nl>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org, Thom Brown <thom(at)linux(dot)com>
Subject: Re: Potential "AIO / io workers" inter-worker locking issue in PG18?
Date: 2025-10-07 23:03:40
Message-ID: 6ro55q35wiahf7mpsbnhyneatiqkz4qrjcfgtg6zo2bvtjwrxy@5fzxrbymhop3
Lists: pgsql-bugs

Hi,

On 2025-10-08 00:13:34 +0200, Marco Boeringa wrote:
> This looks much better, doesn't it?

It indeed does!

> I hope this helps. Let me know if you need anything else.

> *** sudo perf -p <PID of one stuck postgres backend> -g -d 10 ***
> *** sudo perf report -g ***

Could you show perf report --no-children? That would show us which individual
functions, rather than call-stacks, take the longest.
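
For reference, a minimal sketch of how such a profile could be captured and
reported. The helper name, the output path, and the 10 second default are
assumptions for illustration only; the perf options themselves (-g, -p, -o,
-i, --no-children, --stdio) are standard ones.

```shell
# Hypothetical helper: the function name, output path and 10s default are
# assumptions, not something taken from this thread.
perf_self_profile() {
    pid="$1"                          # PID of one stuck postgres backend
    secs="${2:-10}"                   # how long to sample
    out="${3:-/tmp/perf-$pid.data}"   # where to store the samples

    # Sample call stacks of that one backend for $secs seconds.
    sudo perf record -g -p "$pid" -o "$out" -- sleep "$secs" || return 1

    # --no-children sorts by each function's own ("Self") time rather than
    # by the cumulative time of everything it calls.
    sudo perf report -i "$out" --no-children --stdio
}

# Usage: perf_self_profile <PID of one stuck postgres backend> 10
```

With the default (children) view, every frame from _start down to
ExecModifyTable shows 100% because it includes all of its callees, which is
why the Self column is the more useful one here.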

> Samples: 40K of event 'task-clock:ppp', Event count (approx.): 10008250000
>   Children      Self  Command   Shared Object      Symbol
> +  100,00%     0,00%  postgres  postgres           [.] _start
> +  100,00%     0,00%  postgres  libc.so.6          [.] __libc_start_main@@GLIBC_2.34
> +  100,00%     0,00%  postgres  libc.so.6          [.] __libc_start_call_main
> +  100,00%     0,00%  postgres  postgres           [.] main
> +  100,00%     0,00%  postgres  postgres           [.] PostmasterMain
> +  100,00%     0,00%  postgres  postgres           [.] ServerLoop.isra.0
> +  100,00%     0,00%  postgres  postgres           [.] postmaster_child_launch
> +  100,00%     0,00%  postgres  postgres           [.] 0x00005f3570fb9dbf
> +  100,00%     0,00%  postgres  postgres           [.] PostgresMain
> +  100,00%     0,00%  postgres  postgres           [.] exec_simple_query
> +  100,00%     0,63%  postgres  postgres           [.] ExecNestLoop
> +  100,00%     0,00%  postgres  postgres           [.] PortalRun
> +  100,00%     0,00%  postgres  postgres           [.] PortalRunMulti
> +  100,00%     0,00%  postgres  postgres           [.] ProcessQuery
> +  100,00%     0,00%  postgres  postgres           [.] standard_ExecutorRun
> +  100,00%     0,00%  postgres  postgres           [.] ExecModifyTable
> +   94,63%     1,47%  postgres  postgres           [.] ExecScan
> +   78,76%     1,49%  postgres  postgres           [.] IndexNext
> +   66,89%     1,96%  postgres  postgres           [.] index_fetch_heap
> +   64,35%     3,61%  postgres  postgres           [.] heapam_index_fetch_tuple.lto_priv.0

So somehow >60% of the CPU time is spent fetching tuples corresponding to
index entries. That seems ... a lot. Is it possible that you have a lot of
dead rows in the involved tables?
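
One way to check that, as a rough sketch: the cumulative statistics view
pg_stat_user_tables has estimated live and dead row counts per table. The
helper name and the database-name argument below are hypothetical, and
n_dead_tup is only an estimate, so treat the output as a hint rather than an
exact count.

```shell
# Tables with the most (estimated) dead rows, plus when they were last
# vacuumed. All columns are from the standard pg_stat_user_tables view.
DEAD_ROWS_SQL='SELECT relname, n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;'

# Hypothetical helper: $1 is the database name to connect to.
check_dead_rows() {
    # -X skips ~/.psqlrc so the output format is predictable.
    psql -X -d "$1" -c "$DEAD_ROWS_SQL"
}

# Usage: check_dead_rows <database name>
```

A large n_dead_tup relative to n_live_tup for the tables in the query would
be consistent with the time spent in index_fetch_heap above.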

I don't immediately see how this could be related to AIO.

Greetings,

Andres Freund
