From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Pierre Fortin <pf(at)pfortin(dot)com>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: pg_upgrade: can I use same binary for old & new?
Date: 2025-07-05 20:06:49
Message-ID: 0de4c4cb-7c54-4c4d-a50d-6ef6e005f724@aklaver.com
Lists: pgsql-general

On 7/5/25 12:19, Pierre Fortin wrote:
> On Sat, 05 Jul 2025 14:30:20 -0400 Tom Lane wrote:
>
> Forgive my ignorance; always trying to learn more... :)
>
>> pf(at)pfortin(dot)com writes:
>>> On Sat, 5 Jul 2025 11:11:32 -0700 Adrian Klaver wrote:
>>>> How did you measure above?
>>
>>> # du -sb /var/lib/pgsql/data
>>> 8227910662297 /var/lib/pgsql/data
>>
>> It's likely that there's a deal of bloat in that. Even if there's not
>> much bloat, this number will include indexes and WAL data that don't
>> appear in pg_dump output.
>
> Does this imply that on restore, I'll have to re-index everything?

The dump file includes CREATE INDEX commands and per:

https://www.postgresql.org/docs/current/sql-createindex.html

"Creating an index can interfere with regular operation of a database.
Normally PostgreSQL locks the table to be indexed against writes and
performs the entire index build with a single scan of the table. Other
transactions can still read the table, but if they try to insert,
update, or delete rows in the table they will block until the index
build is finished. This could have a severe effect if the system is a
live production database. Very large tables can take many hours to be
indexed, and even for smaller tables, an index build can lock out
writers for periods that are unacceptably long for a production system."
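
So yes, the indexes are rebuilt from the CREATE INDEX statements in the dump rather than copied over. If you want a rough idea of how many index builds the restore will involve, you can count them in a plain-text dump, e.g. (file name is just an example, and this won't catch indexes created implicitly by primary key constraints):

$ grep -Ec '^CREATE (UNIQUE )?INDEX' PG.backup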

Which is why pg_restore:

https://www.postgresql.org/docs/current/app-pgrestore.html

has:

"-j number-of-jobs
--jobs=number-of-jobs

Run the most time-consuming steps of pg_restore — those that load
data, create indexes, or create constraints — concurrently, using up to
number-of-jobs concurrent sessions. This option can dramatically reduce
the time to restore a large database to a server running on a
multiprocessor machine. This option is ignored when emitting a script
rather than connecting directly to a database server."
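
Note that -j only works when restoring from a custom- or directory-format archive produced by pg_dump, not from the plain SQL script that pg_dumpall writes. A rough sketch of a parallel, per-database dump and restore (database name and paths below are placeholders):

$ pg_dump -Fd -j 4 -f /mnt/db/mydb.dir mydb
$ pg_restore -j 4 -d mydb /mnt/db/mydb.dir

Directory format is also compressed by default, which helps with the disk-space concern.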

>
>>>> What was the pg_dump command?
>>
>>> Didn't try given:
>>> $ df /mnt/db
>>> Filesystem Size Used Avail Use% Mounted on
>>> /dev/sdh1 17T 13T 3.0T 82% /mnt/db
>>
>> I'd say give it a try; be sure to use one of the pg_dump modes
>> that compress the data.
>
> OK... I failed to mention I have several databases in this cluster; so
> digging into pg_dumpall, I see:
> --binary-upgrade
> This option is for use by in-place upgrade utilities. Its use for
> other purposes is not recommended or supported. The behavior of the
> option may change in future releases without notice.
>
> pg_upgrade has --link option; but I'm puzzled by this option in a
> dumpall/restore process. My imagination wonders if this alludes to a way
> to do something like:
> pg_dumpall --globals-only --roles-only --schema-only ...
> Would restoring this be a way to update only the control structures? Big
> assumption that the actual data remains untouched...
>
> Inquiring mind... :)
>
> Back to my upgrade issue...
> All my DBs are static (only queries once loaded). Assuming the dumpall
> file fits on one of my drives:
> pg_dumpall -f <path>/PG.backup -v
> appears to be all I need? pg_dump has compression by default; but I don't
> see compression with dumpall other than for TOAST.
>
> Thanks, You guys are awesome!
>
>> regards, tom lane
>
>
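On the compression question above: pg_dumpall can only write a plain SQL script, so if the uncompressed output won't fit, the usual workaround is to pipe it through a compressor, something like (path is just an example):

$ pg_dumpall | gzip > /mnt/db/PG.backup.gz

and feed it back through psql at restore time (gunzip -c PG.backup.gz | psql). If you instead go the per-database pg_dump/pg_restore route sketched earlier, you still need to run pg_dumpall --globals-only once to carry over roles and tablespaces.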

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
