Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dimitrios Apostolou <jimis(at)gmx(dot)net>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward
Date: 2025-10-14 01:37:45
Message-ID: 2E6A307F-BDB9-498F-AC93-B31E0023AD20@gmail.com
Lists: pgsql-hackers

> On Oct 14, 2025, at 08:36, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> writes:
>> I tested DEFAULT_IO_BUFFER_SIZE with 4K, 32K, 64K, 128K and 256K. Looks like increasing the buffer size doesn’t improve the performance significantly. Actually, with the buffer size 64K, 128K and 256K, the test results are very close. I tested both with lz4 and none compression. I am not suggesting tuning the buffer size. These data are only for your reference.
>
> Yeah, I would not expect straight pg_dump/pg_restore performance
> to vary very much once the buffer size gets above not-too-many KB.
> The thing we are really interested in here is how fast pg_restore
> can skip over unwanted table data in a large archive file, and that
> I believe should be pretty sensitive to block size.
>
> You could measure that without getting into the complexities of
> parallel restore if you make a custom-format dump of a few large
> tables that does not have offset data in it, and then seeing how
> fast is selective restore of just the last table.
>
>

I'm not sure if I did something wrong, but I still don't see much difference between buffer sizes of 4K and 128K with your suggested test.

I created three tables, each with 10 million rows:

```
evantest=# CREATE TABLE tbl1 AS SELECT generate_series(1,10000000) AS id;
SELECT 10000000
evantest=# CREATE TABLE tbl2 AS SELECT generate_series(1,10000000) AS id;
SELECT 10000000
evantest=# CREATE TABLE tbl3 AS SELECT generate_series(1,10000000) AS id;
SELECT 10000000
```
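
(As a sanity check that the tables are large enough to matter, their on-disk size can be checked like this; just an illustration, not part of the timing runs:)
```
% psql -d evantest -c "SELECT pg_size_pretty(pg_relation_size('tbl3'));"
```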

Then I made a custom-format dump:
```
% time pg_dump -Fc -f db.dump evantest
pg_dump -Fc -f db.dump evantest 51.72s user 1.13s system 98% cpu 53.602 total
```
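
(As an aside: if the dump must not contain data offsets, as you suggested, my understanding is that pg_dump only records offsets when the output file is seekable, so writing through a pipe should leave them unset. A rough, untested sketch, with the filename just an example:)
```
% pg_dump -Fc evantest | cat > db-nooffsets.dump
```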

Then I ran pg_restore of the last table, with pg_restore compiled with buffer sizes of 4K and 128K (I dropped tbl3 before each run):
```
# 4K
% time pg_restore -d evantest -t tbl3 db.dump
pg_restore -d evantest -t tbl3 db.dump 0.06s user 0.04s system 6% cpu 1.528 total

# 128K
% time pg_restore -d evantest -t tbl3 db.dump
pg_restore -d evantest -t tbl3 db.dump 0.05s user 0.04s system 3% cpu 2.146 total
```
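
(For clarity, by "compiled with buffer sizes of 4K and 128K" I mean changing DEFAULT_IO_BUFFER_SIZE and rebuilding pg_dump/pg_restore, roughly as below, assuming the constant is still defined in src/bin/pg_dump/compress_io.h:)
```
% grep -n DEFAULT_IO_BUFFER_SIZE src/bin/pg_dump/compress_io.h
# edit the definition, e.g. "#define DEFAULT_IO_BUFFER_SIZE (128 * 1024)", then rebuild:
% make -C src/bin/pg_dump install
```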

The other thing I noticed is that when I do a custom-format dump, if the target file exists, pg_dump just goes ahead and overwrites the existing file; however, when I do a directory-format dump, if the target directory exists, pg_dump fails with the error “directory xxx is not empty”. Why are the behaviors different?
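
To make the comparison concrete, this is what I mean (paths are only examples):
```
% pg_dump -Fc -f db.dump evantest   # db.dump already exists: silently overwritten
% pg_dump -Fd -f db.dir evantest    # db.dir already exists and is not empty:
pg_dump: error: directory "db.dir" is not empty
```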

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
