Quick Links

Re: pg_upgrade failing for 200+ million Large Objects

From:	vignesh C <vignesh21(at)gmail(dot)com>
To:	"Kumar, Sachin" <ssetiya(at)amazon(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robins Tharakan <tharakan(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Bruce Momjian <bruce(at)momjian(dot)us>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: pg_upgrade failing for 200+ million Large Objects
Date:	2024-01-26 14:42:44
Message-ID:	CALDaNm08DLqkr6LQfB=AJ-wEOGtASW+rYGsD6Q56-bYBYoHFdA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, 2 Jan 2024 at 23:03, Kumar, Sachin <ssetiya(at)amazon(dot)com> wrote:
>
> > On 11/12/2023, 01:43, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>> wrote:
>
> > I had initially supposed that in a parallel restore we could
> > have child workers also commit after every N TOC items, but was
> > soon disabused of that idea. After a worker processes a TOC
> > item, any dependent items (such as index builds) might get
> > dispatched to some other worker, which had better be able to
> > see the results of the first worker's step. So at least in
> > this implementation, we disable the multi-command-per-COMMIT
> > behavior during the parallel part of the restore. Maybe that
> > could be improved in future, but it seems like it'd add a
> > lot more complexity, and it wouldn't make life any better for
> > pg_upgrade (which doesn't use parallel pg_restore, and seems
> > unlikely to want to in future).
>
> I was not able to find email thread which details why we are not using
> parallel pg_restore for pg_upgrade. IMHO most of the customer will have single large
> database, and not using parallel restore will cause slow pg_upgrade.
>
> I am attaching a patch which enables parallel pg_restore for DATA and POST-DATA part
> of dump. It will push down --jobs value to pg_restore and will restore database sequentially.

CFBot shows that the patch does not apply anymore as in [1]:
=== Applying patches on top of PostgreSQL commit ID
46a0cd4cefb4d9b462d8cc4df5e7ecdd190bea92 ===
=== applying patch ./v9-005-parallel_pg_restore.patch
patching file src/bin/pg_upgrade/pg_upgrade.c
Hunk #3 FAILED at 650.
1 out of 3 hunks FAILED -- saving rejects to file
src/bin/pg_upgrade/pg_upgrade.c.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4713.log

Regards,
Vignesh

In response to

Re: pg_upgrade failing for 200+ million Large Objects at 2024-01-02 17:33:00 from Kumar, Sachin

Responses

Re: pg_upgrade failing for 200+ million Large Objects at 2024-01-26 16:44:26 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	vignesh C	2024-01-26 14:45:52	Re: PoC: prefetching index leaf pages (for inserts)
Previous Message	Maiquel Grassi	2024-01-26 14:41:46	RE: Current Connection Information