Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
Cc: Dimitrios Apostolou <jimis(at)gmx(dot)net>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward
Date: 2025-10-14 02:44:45
Message-ID: 863353.1760409885@sss.pgh.pa.us
Lists: pgsql-hackers

Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> writes:
>> On Oct 14, 2025, at 08:36, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> The thing we are really interested in here is how fast pg_restore
>> can skip over unwanted table data in a large archive file, and that
>> I believe should be pretty sensitive to block size.

> Not sure if I did something wrong, but I still don’t see much difference between buffer size 4K and 128K with your suggested test.
>
> % time pg_dump -Fc -f db.dump evantest

This won't show the effect, because pg_dump will be able to go back
and insert data offsets into the dump's TOC, so pg_restore can just
seek to where the data is. See upthread discussion about what's
needed to provoke Dimitrios' problem.
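
To make the distinction concrete: when the offset is present, skipping
an unwanted entry is a single seek; when it isn't, the reader has to
churn through every intervening data block in buffer-sized reads.
Very roughly, it behaves like this (a much-simplified sketch, not the
actual pg_backup_custom.c logic; next_block_length() is a made-up
placeholder):

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* made-up placeholder: length of the next data block, 0 when done */
extern size_t next_block_length(FILE *fp);

void
skip_to_entry(FILE *fp, off_t dataPos, size_t bufsize)
{
	char	   *buf;
	size_t		blocklen;

	if (dataPos != 0)
	{
		/* pg_dump back-filled the offset into the TOC: one cheap seek */
		fseeko(fp, dataPos, SEEK_SET);
		return;
	}

	/*
	 * No offset recorded: scan forward, reading and discarding each
	 * intervening data block in bufsize-sized chunks.  (In the real code
	 * the caller loops over TOC entries and decides when to stop, and
	 * decompression is buffered the same way.)
	 */
	buf = malloc(bufsize);
	while ((blocklen = next_block_length(fp)) > 0)
	{
		while (blocklen > 0)
		{
			size_t		chunk = (blocklen < bufsize) ? blocklen : bufsize;

			if (fread(buf, 1, chunk, fp) != chunk)
				break;			/* error handling elided */
			blocklen -= chunk;
		}
	}
	free(buf);
}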

I tried this very tiny (relatively speaking) test case:

regression=# create database d1;
CREATE DATABASE
regression=# \c d1
You are now connected to database "d1" as user "postgres".
d1=# create table alpha as select repeat(random()::text, 1000) from generate_series(1,1000000);
SELECT 1000000
d1=# create table omega as select 42 as x;
SELECT 1
d1=# \q

Then

$ pg_dump -Fc d1 | cat >d1.dump
$ time pg_restore -f /dev/null -t omega d1.dump

The point of the pipe-to-cat is to reproduce Dimitrios' problem case
with no data offsets in the TOC. Then the restore is doing about the
simplest thing I can think of to make it skip over most of the archive
file. Also, I'm intentionally using the default choice of gzip
because that already responds to DEFAULT_IO_BUFFER_SIZE properly.
(This test is with current HEAD, no patches except adjusting
DEFAULT_IO_BUFFER_SIZE.)
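
(For reference, "adjusting DEFAULT_IO_BUFFER_SIZE" just means editing
the constant and rebuilding pg_dump/pg_restore between runs; in my
tree it lives in src/bin/pg_dump/compress_io.h, along the lines of

#define DEFAULT_IO_BUFFER_SIZE	(128 * 1024)	/* stock value is 4096 */

though the exact location may differ in your checkout.)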

I got these timings:

DEFAULT_IO_BUFFER_SIZE = 1K
real 0m0.020s
user 0m0.002s
sys 0m0.017s

DEFAULT_IO_BUFFER_SIZE = 4K
real 0m0.014s
user 0m0.003s
sys 0m0.011s

DEFAULT_IO_BUFFER_SIZE = 128K
real 0m0.002s
user 0m0.000s
sys 0m0.002s

This test case has only about 50MB worth of compressed data,
so of course the times are very small; scaling it up to
gigabytes would yield more impressive results. But the
effect is clearly visible.
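
For a rough sense of scale: discarding ~50MB of compressed data 1K at
a time is on the order of 50,000 read-and-decompress iterations,
versus only about 400 at 128K, which is where the difference comes
from.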

regards, tom lane
