Reducing connection overhead in pg_upgrade compat check phase

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Reducing connection overhead in pg_upgrade compat check phase
Date: 2023-02-17 21:44:49
Message-ID: BBB4C76F-D416-4F9F-949E-DBE950D37787@yesql.se
Lists: pgsql-hackers

When adding a check to pg_upgrade a while back I noticed in a profile that the
cluster compatibility check phase spends a lot of time in connectToServer. Some
of this can be attributed to the data type checks, which run serially and each
connect to every database in turn to run their query. This seemed like a place
where we can do better.

The attached patch moves the checks from individual functions, which each loop
over all databases, into a struct which is consumed by a single umbrella check
where all data type queries are executed against a database using the same
connection. This way we can amortize the connectToServer overhead across more
accesses to the database.
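
To illustrate the shape of the change, here is a minimal standalone sketch, not
the code from the attached patch. All names in it (FakeConn, fake_connect,
fake_query, DataTypeCheck, the three placeholder checks) are hypothetical, and
the "connection" is simulated with a counter rather than libpq. It only
compares how many connections the two loop orders open.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a server connection; real code would use libpq. */
typedef struct
{
    const char *dbname;
} FakeConn;

/* Counts simulated connections so the two loop orders can be compared. */
static int connections_opened = 0;

static FakeConn
fake_connect(const char *dbname)
{
    FakeConn conn = {dbname};

    connections_opened++;
    return conn;
}

/* Pretend to run a query; a real check would inspect the result set. */
static void
fake_query(const FakeConn *conn, const char *query)
{
    assert(conn->dbname != NULL && query != NULL);
}

/* One entry per data type check; status texts and queries are placeholders. */
typedef struct
{
    const char *status;
    const char *query;
} DataTypeCheck;

static const DataTypeCheck checks[] = {
    {"Checking for hypothetical type A", "SELECT 1"},
    {"Checking for hypothetical type B", "SELECT 2"},
    {"Checking for hypothetical type C", "SELECT 3"},
};

#define N_CHECKS (sizeof(checks) / sizeof(checks[0]))

/* Old shape: each check loops over all databases and connects on its own. */
static int
connections_per_check(const char *const *dbs, size_t n_dbs)
{
    connections_opened = 0;
    for (size_t c = 0; c < N_CHECKS; c++)
    {
        for (size_t d = 0; d < n_dbs; d++)
        {
            FakeConn conn = fake_connect(dbs[d]);

            fake_query(&conn, checks[c].query);
        }
    }
    return connections_opened;
}

/* New shape: connect once per database, then run every check on it. */
static int
connections_per_database(const char *const *dbs, size_t n_dbs)
{
    connections_opened = 0;
    for (size_t d = 0; d < n_dbs; d++)
    {
        FakeConn conn = fake_connect(dbs[d]);

        for (size_t c = 0; c < N_CHECKS; c++)
            fake_query(&conn, checks[c].query);
    }
    return connections_opened;
}
```

With N checks and M databases the first loop order opens N * M connections and
the second opens M, which is where the amortization comes from.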

In the trivial case, a single database, I don't see a performance regression
compared to the current approach. In a cluster with 100 (empty) databases there
is a ~15% reduction in the time to run a --check pass. While it won't move the
earth in terms of wallclock time, consuming fewer resources on the old cluster,
and thereby making --check cheaper, might be the bigger win.

--
Daniel Gustafsson

Attachment Content-Type Size
0001-pg_upgrade-run-all-data-type-checks-per-connection.patch application/octet-stream 34.3 KB
