Re: [PATCH v4] parallel pg_restore: avoid disk seeks when jumping short distance forward

From: Dimitrios Apostolou <jimis(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH v4] parallel pg_restore: avoid disk seeks when jumping short distance forward
Date: 2025-10-21 13:57:31
Message-ID: 83r36or2-q8n9-6q18-np80-q9n6pr3q636q@tzk.arg
Lists: pgsql-hackers

On Tuesday 2025-10-21 00:23, Tom Lane wrote:

> HEAD repeats
>
> read(4k)
> lseek(~128k forward)
>
> which is to be expected if we have to read data block headers
> that are ~128K apart; while patched repeats
>
> read(4k)
> read(~128k)
>
> which is a bit odd in itself, why isn't it merging the reads better?

The read(4k) happens because of the getc() calls that read the next
block's length.

As noted in an earlier message [1], glibc seems to use a 4KB stdio buffer
by default, for some reason. setvbuf() can mitigate this.

[1] https://www.postgresql.org/message-id/1po8os49-r63o-2923-p37n-12698o1qn7p0%40tzk.arg

I'm attaching a patch that sets glibc buffering to 1MB, just as a proof
of concept. It's obviously WIP: it allocates the buffer and never frees
it. :-) Feel free to pick it up and change it as you see fit.
With this patch, the read() calls are merged in strace. The lseek() calls
remain, even though they do not trigger any actual reading.
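
For illustration only, here is a minimal sketch of the setvbuf() idea. It
is not the attached patch: the helper name open_with_big_buffer() and the
convention of handing the buffer back to the caller are assumptions made
up for this example. Only the 1MB size comes from the description above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define READ_BUF_SIZE (1024 * 1024)	/* 1MB, as in the proof of concept */

/*
 * Hypothetical helper (not from the patch): open a file for reading and
 * give stdio a large, fully buffered buffer.  On success *buf_out receives
 * the buffer; the caller must free() it, but only after fclose().
 */
static FILE *
open_with_big_buffer(const char *path, char **buf_out)
{
	FILE	   *fp;
	char	   *buf;

	assert(path != NULL && buf_out != NULL);

	fp = fopen(path, "rb");
	if (fp == NULL)
		return NULL;

	buf = malloc(READ_BUF_SIZE);
	if (buf == NULL)
	{
		fclose(fp);
		return NULL;
	}

	/* setvbuf() must come before any other operation on the stream */
	if (setvbuf(fp, buf, _IOFBF, READ_BUF_SIZE) != 0)
	{
		free(buf);
		fclose(fp);
		return NULL;
	}

	*buf_out = buf;
	return fp;
}
```

A caller would use the returned stream with getc()/fread() as usual, then
fclose() it and free() the buffer, in that order.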

It seems to me that glibc could implement an optimisation for fseeko():
store the current position in the file, and do not issue the lseek()
system call if the position does not change.
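
To make the idea concrete, here is a sketch of the same optimisation done
at the application level instead of inside glibc. Everything in it is
hypothetical (the TrackedFile struct, tracked_read() and tracked_seek()
are invented for this example and are neither glibc nor pg_restore code),
and it assumes all reads and seeks on the stream go through the wrapper so
that the remembered offset stays correct.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for fseeko() and off_t */
#endif
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical wrapper: a stdio stream plus the offset we believe it is at. */
typedef struct TrackedFile
{
	FILE	   *fp;
	off_t		pos;			/* offset of the next byte to be read */
	long		seeks_issued;	/* how many times fseeko() was really called */
} TrackedFile;

/* Read through the wrapper so that the tracked offset stays accurate. */
static size_t
tracked_read(TrackedFile *tf, void *buf, size_t len)
{
	size_t		n;

	assert(tf != NULL && tf->fp != NULL);
	n = fread(buf, 1, len, tf->fp);
	tf->pos += (off_t) n;
	return n;
}

/* Seek to an absolute offset, skipping the call when already there. */
static int
tracked_seek(TrackedFile *tf, off_t target)
{
	assert(tf != NULL && tf->fp != NULL);
	if (target == tf->pos)
		return 0;				/* position would not change: no fseeko() */
	if (fseeko(tf->fp, target, SEEK_SET) != 0)
		return -1;
	tf->pos = target;
	tf->seeks_issued++;
	return 0;
}
```

The seeks_issued counter is only there to make the effect observable; a
seek to the current offset leaves it unchanged.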

>> I was using an HDD,
>
> Ah. Your original message mentioned NVMe so I was assuming you
> were also looking at solid-state drives. I can imagine that
> seeking is more painful on HDDs ...

Sorry for the confusion; over all this time I've run tests on too many
different hardware combinations. Not the best way to draw conclusions,
but it's what I had available each time.

Dimitris

Attachment: v1-0001-WIP-increase-glibc-buffering-for-pg_restore-custo.patch (text/x-patch, 1.9 KB)