From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Kumar, Sachin" <ssetiya(at)amazon(dot)com> |
Cc: | Robins Tharakan <tharakan(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Bruce Momjian <bruce(at)momjian(dot)us>, Zhihong Yu <zyu(at)yugabyte(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: pg_upgrade failing for 200+ million Large Objects |
Date: | 2024-01-05 20:02:34 |
Message-ID: | 1144201.1704484954@sss.pgh.pa.us |
Lists: | pgsql-hackers |
I wrote:
> "Kumar, Sachin" <ssetiya(at)amazon(dot)com> writes:
>> I was not able to find email thread which details why we are not using
>> parallel pg_restore for pg_upgrade.
> Well, it's pretty obvious isn't it? The parallelism is being applied
> at the per-database level instead.
On further reflection, there is a very good reason why it's done like
that. Because pg_upgrade is doing schema-only dump and restore,
there's next to no opportunity for parallelism within either pg_dump
or pg_restore. There are no data-loading steps, and there's no
index-building either, so the time-consuming stuff that could be
parallelized just isn't happening in pg_upgrade's usage.
Now it's true that my 0003 patch moves the needle a little bit:
since it makes BLOB creation (as opposed to loading) parallelizable,
there'd be some hope for parallel pg_restore doing something useful in
a database with very many blobs. But it makes no sense to remove the
existing cross-database parallelism in pursuit of that; you'd make
many more people unhappy than happy.
Conceivably something could be salvaged of your idea by having
pg_upgrade handle databases with many blobs differently from
those without, applying parallelism within pg_restore for the
first kind and then using cross-database parallelism for the
rest. But that seems like a lot of complexity compared to the
possible win.
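
For illustration only, the two-phase idea above could be sketched as follows. This is a hypothetical planning helper, not pg_upgrade code: the names `DbInfo`, `plan_restore`, and `blob_threshold` are assumptions made up for this sketch, and nothing here reflects how pg_upgrade actually schedules its work.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DbInfo:
    """Hypothetical summary of one database in the old cluster."""
    name: str
    blob_count: int


def plan_restore(
    dbs: List[DbInfo], jobs: int, blob_threshold: int
) -> Tuple[List[Tuple[str, int]], List[Tuple[str, int]]]:
    """Split databases into two restore phases.

    Phase 1: blob-heavy databases, restored one at a time, with all
             `jobs` workers applied inside a single restore.
    Phase 2: all other databases, one worker each, so that up to `jobs`
             of them can be restored concurrently (cross-database
             parallelism).

    Each phase is a list of (database name, workers per restore).
    """
    if jobs < 1:
        raise ValueError("jobs must be at least 1")

    heavy = [d for d in dbs if d.blob_count >= blob_threshold]
    light = [d for d in dbs if d.blob_count < blob_threshold]

    phase1 = [(d.name, jobs) for d in heavy]
    phase2 = [(d.name, 1) for d in light]
    return phase1, phase2


if __name__ == "__main__":
    cluster = [
        DbInfo("app", 250_000_000),
        DbInfo("reports", 0),
        DbInfo("postgres", 12),
    ]
    heavy_phase, light_phase = plan_restore(cluster, jobs=8, blob_threshold=1_000_000)
    print("phase 1:", heavy_phase)
    print("phase 2:", light_phase)
```

Even in this toy form the extra moving parts are visible: a threshold to choose, two scheduling modes, and an ordering between them.
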
In any case I'd stay far away from using --section in pg_upgrade.
Too many moving parts there.
regards, tom lane