From: Dimitrios Apostolou <jimis(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH v4] parallel pg_restore: avoid disk seeks when jumping short distance forward
Date: 2025-10-21 14:16:26
Message-ID: 9o35ps41-040p-n471-qoos-6q8648594768@tzk.arg
Lists: pgsql-hackers
On Tuesday 2025-10-21 00:15, Tom Lane wrote:
>> So for me, the proposed patch actually makes it 2X slower.
>
> I went and tried this same test case on a 2024 Mac Mini M4 Pro.
> Cutting to the chase:
>
> HEAD:
>
> $ time pg_restore -f /dev/null -t zedtable bench10000.dump
>
> real 1m26.525s
> user 0m0.364s
> sys 0m6.806s
>
> Patched:
>
> $ time pg_restore -f /dev/null -t zedtable bench10000.dump
>
> real 0m15.419s
> user 0m0.279s
> sys 0m8.224s
>
> So on this hardware it *does* win (although maybe things would
> be different for a parallel restore). The patched pg_restore
> takes just about the same amount of time as "cat", and iostat
> shows both of them reaching a bit more than 6GB/s read speed.
>
> My feeling at this point is that we'd probably drop the block
> size test as irrelevant, and instead simply ignore ctx->hasSeek
> within this loop if we think we're on a platform where that's
> the right thing. But how do we figure that out?
>
> Not sure where we go from here, but clearly a bunch of research
> is going to be needed to decide whether this is committable.
pg_dump files produced before your latest fix still exist, and they may
contain a block header every 30 bytes (or however wide the table rows
are). A patch in pg_restore would vastly improve this use case.
May I suggest the attached patch, which replaces fseeko() with fread()
when the forward distance is 32KB or less? It seems rather improbable
that this would make things worse, but perhaps it is possible to
generate a dump file with 32KB-wide rows and try restoring it on
various hardware?
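For anyone reading along without the attachment, here is a minimal
sketch of the idea (not the attached patch; the function name, the
32KB cutoff constant and the 4KB scratch buffer are illustrative
assumptions):

    #include <stdio.h>
    #include <sys/types.h>

    #define SKIP_READ_THRESHOLD (32 * 1024)	/* assumed cutoff */

    /*
     * Skip "len" bytes forward in the archive.  For short distances,
     * consume the bytes with fread() into a throwaway buffer instead
     * of calling fseeko(), since a buffered read is often cheaper
     * than a seek for this access pattern.
     */
    static int
    skip_forward(FILE *fp, off_t len, int hasSeek)
    {
        char	discard[4096];	/* assumed scratch buffer size */

        /* Long jumps on seekable files still use a real seek. */
        if (hasSeek && len > SKIP_READ_THRESHOLD)
            return fseeko(fp, len, SEEK_CUR);

        while (len > 0)
        {
            size_t	chunk = (len > (off_t) sizeof(discard))
                            ? sizeof(discard) : (size_t) len;

            if (fread(discard, 1, chunk, fp) != chunk)
                return -1;		/* short read: EOF or I/O error */
            len -= (off_t) chunk;
        }
        return 0;
    }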
If this too is controversial, we can reduce the threshold to 4KB,
which is the buffer size glibc uses internally for stdio. By using the
same size in the patch, we avoid all the lseek(same-offset)
repetitions between the 4KB reads. This should be a strict gain, with
no downsides.
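As an aside, the 4KB assumption could be made explicit rather than
relying on the libc default (which is derived from the file's
st_blksize). A hedged illustration using setvbuf(3), purely for
demonstration:

    #include <stdio.h>

    /* Illustration only: pin the stream's stdio buffer to 4KB, so
     * that 4KB-granular skips are served from the buffer filled by
     * the previous read.  Must run after fopen() and before any
     * other I/O on fp. */
    static char iobuf[4096];

    static void
    pin_buffer(FILE *fp)
    {
        setvbuf(fp, iobuf, _IOFBF, sizeof(iobuf));
    }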
Dimitris
Attachment: v5-0001-parallel-pg_restore-avoid-disk-seeks-when-moving-.patch (text/x-patch, 2.6 KB)