| From: | Hannu Krosing <hannuk(at)google(dot)com> |
|---|---|
| To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Cc: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
| Subject: | Patch: dumping tables data in multiple chunks in pg_dump |
| Date: | 2025-11-11 15:29:56 |
| Message-ID: | CAMT0RQT_0qVxcTT6ycM20QUN-pEQ6iMLbz6gLWgLpeF0NmNOUA@mail.gmail.com |
| Lists: | pgsql-hackers |
Attached is a patch that adds the ability to dump table data in multiple chunks.
Looking for feedback at this point:
1) What have I missed?
2) Should I implement something to avoid single-page chunks?
The patch adds the flag --huge-table-chunk-pages, which tells a
directory format dump to split any table whose main fork has more
pages than this value into multiple chunks of the given number of pages.
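To make the chunking rule concrete, here is a minimal sketch of how the page ranges could be derived. It is not taken from the patch: the function names are hypothetical, and selecting each chunk with a ctid range predicate is an assumption about how a chunk might be read, not something stated above.

```python
def chunk_page_ranges(relpages, chunk_pages):
    """Split a table of `relpages` pages into half-open (start, end) page ranges.

    Tables with no more pages than `chunk_pages` are not chunked and come
    back as a single range.
    """
    if chunk_pages <= 0:
        raise ValueError("chunk_pages must be positive")
    if relpages <= chunk_pages:
        return [(0, relpages)]
    return [
        (start, min(start + chunk_pages, relpages))
        for start in range(0, relpages, chunk_pages)
    ]


def ctid_predicate(start_page, end_page):
    """Build a WHERE clause for one chunk (assumption: chunks are read by ctid range)."""
    return f"ctid >= '({start_page},0)'::tid AND ctid < '({end_page},0)'::tid"


if __name__ == "__main__":
    # 2001 pages with 1000-page chunks leaves a single-page tail chunk.
    for start, end in chunk_page_ranges(2001, 1000):
        print(start, end, ctid_predicate(start, end))
```

With 2001 pages and 1000-page chunks the last range is (2000, 2001), which is the single-page chunk that question 2 asks about. One way to avoid it would be to merge a short tail into the previous chunk.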
The main use case is speeding up parallel dumps when there is one, or a
small number of, HUGE tables, so that parts of these can be dumped in
parallel.
It will also help in case the target file system has some limitations
on file sizes (4GB for FAT, 5TB for GCS).
Currently no tests are included in the patch, and there is no extra
documentation beyond what is printed by pg_dump --help. Also, any
pg_log_warning lines containing "CHUNKING" are there for debugging and
need to be removed before committing.
As implemented, no changes are needed for pg_restore, as all chunks are
already associated with the table in .toc and thus are restored into
this table.
The attached README shows how I verified that it works, and the attached
dump.sql is the textual file created from the directory format dump in
the last step there.
--
Hannu
| Attachment | Content-Type | Size |
|---|---|---|
| 0001-adds-ability-to-dump-data-for-tables-in-multiple-chu.patch | application/x-patch | 11.5 KB |
| README.pg_dump.md | text/markdown | 3.7 KB |
| dump.sql | application/sql | 56.2 KB |