| From: | Hannu Krosing <hannuk(at)google(dot)com> |
|---|---|
| To: | Michael Banck <mbanck(at)gmx(dot)net> |
| Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com> |
| Subject: | Re: Patch: dumping tables data in multiple chunks in pg_dump |
| Date: | 2026-03-28 15:33:59 |
| Message-ID: | CAMT0RQTe4Zr=rdcKMJj-=c7CH0PJh=ZPk=xOU98+M7p9-D+Yew@mail.gmail.com |
| Lists: | pgsql-hackers |
The above
"Or it can be almost 200 GB if the page has just pointers to 1GB TOAST items."
should read
"Or it can be almost 200 GB *for a single page* if the page has just
pointers to 1GB TOAST items."
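To make the correction concrete, here is a back-of-envelope calculation of how much data a single 8 kB heap page can reference when every tuple is just a TOAST pointer. The struct sizes below are assumptions based on the usual on-disk layout (page header, line pointers, MAXALIGNed tuple headers, 18-byte external TOAST pointers); exact values vary by version and alignment, but the order of magnitude is the point.

```python
# Back-of-envelope sketch (assumed sizes, not authoritative):
# how much TOASTed data can one default-size heap page point at?
BLCKSZ = 8192            # default PostgreSQL block size
PAGE_HEADER = 24         # PageHeaderData
LINE_POINTER = 4         # ItemIdData, one per tuple
TUPLE_HEADER = 24        # HeapTupleHeaderData, MAXALIGNed
TOAST_POINTER = 18       # on-disk external TOAST pointer datum

per_tuple = LINE_POINTER + TUPLE_HEADER + TOAST_POINTER
tuples_per_page = (BLCKSZ - PAGE_HEADER) // per_tuple

# Each TOAST pointer can reference a value of up to 1 GB.
referenced_gb = tuples_per_page * 1  # in GB

print(tuples_per_page, "tuples per page")
print(referenced_gb, "GB referenced by one page")
```

With these assumed sizes one page holds on the order of 175 pointer-only tuples, each able to reference a 1 GB value, hence "almost 200 GB for a single page".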
On Sat, Mar 28, 2026 at 4:32 PM Hannu Krosing <hannuk(at)google(dot)com> wrote:
>
> The issue is that currently the value is given in "main table pages"
> and it would be somewhat deceptive, or at least confusing, to try to
> express this in any other unit.
>
> As I explained in the commit message:
>
> ---------8<-------------------8<-------------------8<----------------
> This --max-table-segment-pages number specifically applies to main table
> pages, which guarantees nothing about the output size.
> The output could be empty if there are no live tuples in the page range.
> Or it can be almost 200 GB if the page has just pointers to 1GB TOAST items.
> ---------8<-------------------8<-------------------8<----------------
>
> And I can think of no cheap and reliable way to change that equation.
>
> I'll be very happy if you have any good ideas for improving the
> flag name, or, even better, a way to estimate the resulting dump
> file size so we could give the chunk size in better units.
>
> ---
> Hannu
>
> On Sat, Mar 28, 2026 at 12:26 PM Michael Banck <mbanck(at)gmx(dot)net> wrote:
> >
> > Hi,
> >
> > On Tue, Jan 13, 2026 at 03:27:25PM +1300, David Rowley wrote:
> > > Perhaps --max-table-segment-pages is a better name than
> > > --huge-table-chunk-pages, as it's quite subjective what the minimum
> > > number of pages required to make a table "huge" is.
> >
> > I'm not sure that's better - without looking at the documentation,
> > people might confuse segment here with the 1GB split of tables into
> > segments. As pg_dump is a very common and basic user tool, I don't think
> > implementation details like pages/page sizes and blocks should be part
> > of its UX.
> >
> > Can't we just make it a storage size, like '10GB' and then rename it to
> > --table-parallel-threshold or something? I agree it's bikeshedding, but
> > I personally don't like either --max-table-segment-pages or
> > --huge-table-chunk-pages.
> >
> >
> > Michael
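Michael's suggestion above amounts to accepting a human-readable storage size on the command line and converting it to the page count the patch uses internally. A minimal sketch of that conversion, assuming the default 8 kB block size and a hypothetical `size_to_pages` helper (neither the helper name nor the accepted unit set is from the patch):

```python
# Hypothetical sketch: parse a size string like "10GB" and convert it
# to a heap-page count, assuming the default 8192-byte block size.
BLCKSZ = 8192

UNITS = {"kB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

def size_to_pages(spec: str) -> int:
    """Return the number of BLCKSZ-sized pages covered by spec."""
    spec = spec.strip()
    for unit, mult in UNITS.items():
        if spec.endswith(unit):
            return int(spec[: -len(unit)]) * mult // BLCKSZ
    return int(spec)  # bare number: treat as a page count

print(size_to_pages("10GB"))  # -> 1310720 pages
```

This would let the flag take '10GB' while keeping the page-granular chunking internally, though, as Hannu notes, the main-table page count still says nothing about the size of the dumped output.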