Re: Patch: dumping tables data in multiple chunks in pg_dump

From:	Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To:	Hannu Krosing <hannuk(at)google(dot)com>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>
Subject:	Re: Patch: dumping tables data in multiple chunks in pg_dump
Date:	2025-11-17 04:15:17
Message-ID:	CAFiTN-tV4jWKN75E5YLB-jSqb8j0E1PctiDjztv=ccfbe3YPmg@mail.gmail.com
Lists:	pgsql-hackers

On Tue, Nov 11, 2025 at 9:00 PM Hannu Krosing <hannuk(at)google(dot)com> wrote:
>
> Attached is a patch that adds the ability to dump table data in multiple chunks.
>
> Looking for feedback at this point:
> 1) what have I missed
> 2) should I implement something to avoid single-page chunks
>
> The flag --huge-table-chunk-pages tells the directory format
> dump to dump tables where the main fork has more pages than this in
> multiple chunks of the given number of pages.
>
> The main use case is speeding up parallel dumps in case of one or a
> small number of HUGE tables so parts of these can be dumped in
> parallel.
>

+1 for the idea. I haven't done a detailed review yet, but while going
through the patch I noticed that we use pg_class->relpages to decide
whether to chunk a table. That should be fine, but don't you think
that using a direct size calculation function like pg_relation_size()
would give a better idea, without depending on whether the stats are
up to date? This would make the chunking behavior more deterministic.
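
To illustrate the point, here is a rough sketch (not taken from the patch)
of deriving the page count from the on-disk size and splitting it into
page ranges. The helper names, the 8192-byte block size, and the query
text are assumptions for illustration only; how the patch actually
builds its chunks is not shown here.

```python
# Hypothetical sketch: none of these names come from the patch.

BLOCK_SIZE = 8192  # assumed: PostgreSQL's default block size

# Query that could replace reading pg_class.relpages. It asks the server
# for the current main-fork size, so it does not depend on fresh stats.
# (Shown as text only; this sketch does not connect to a database.)
PAGES_QUERY = (
    "SELECT pg_relation_size(%s::regclass, 'main') "
    "/ current_setting('block_size')::bigint"
)


def pages_from_size(size_bytes, block_size=BLOCK_SIZE):
    """Convert a relation size in bytes to a page count, rounding up."""
    return -(-size_bytes // block_size)


def chunk_ranges(total_pages, chunk_pages):
    """Split [0, total_pages) into half-open page ranges of chunk_pages each.

    Tables that do not exceed chunk_pages are returned as a single range,
    matching the "more pages than this" wording of the flag description.
    """
    if chunk_pages <= 0:
        raise ValueError("chunk_pages must be positive")
    if total_pages <= chunk_pages:
        return [(0, total_pages)]
    return [
        (start, min(start + chunk_pages, total_pages))
        for start in range(0, total_pages, chunk_pages)
    ]


if __name__ == "__main__":
    total = pages_from_size(10 * BLOCK_SIZE)
    print(chunk_ranges(total, 4))
```

With a stale relpages value the same table could end up either chunked
or not chunked depending on when it was last analyzed, whereas the
size-based count gives the same answer for the same on-disk state. Note
that a plain split like this can still leave a very small last chunk,
which relates to your question 2.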

--
Regards,
Dilip Kumar
Google

In response to

Patch: dumping tables data in multiple chunks in pg_dump at 2025-11-11 15:29:56 from Hannu Krosing
