improve performance of pg_dump with many sequences

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: improve performance of pg_dump with many sequences
Date: 2024-05-03 02:51:40
Message-ID: 20240503025140.GA1227404@nathanxps13
Lists: pgsql-hackers

Similar to 'pg_dump --binary-upgrade' [0], we can speed up pg_dump with
many sequences by gathering the required information in a single query
instead of two queries per sequence. The attached patches are
works-in-progress, but here are the results I see on my machine for
'pg_dump --schema-only --binary-upgrade' with a million sequences:

HEAD           : 6m22.809s
[0]            : 1m54.701s
[0] + attached : 0m38.233s
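
To illustrate why collapsing two queries per sequence into one batched query helps, here is a minimal Python sketch. It is not the patch itself: `FakeServer`, `SeqInfo`, and the split into a "definition" query and a "state" query are hypothetical stand-ins, and a plain dict plays the role of the cache. It only counts round trips under those assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeqInfo:
    """Definition and current state of one sequence (illustrative fields)."""
    start: int
    increment: int
    last_value: int
    is_called: bool


class FakeServer:
    """Hypothetical stand-in for a database connection that counts queries."""

    def __init__(self, sequences):
        self._sequences = dict(sequences)  # name -> SeqInfo
        self.queries = 0

    def fetch_definition(self, name):
        # One round trip for the sequence's definition.
        self.queries += 1
        seq = self._sequences[name]
        return seq.start, seq.increment

    def fetch_state(self, name):
        # A second round trip for the sequence's current state.
        self.queries += 1
        seq = self._sequences[name]
        return seq.last_value, seq.is_called

    def fetch_all(self):
        # A single round trip that returns every sequence at once.
        self.queries += 1
        return dict(self._sequences)


def dump_per_sequence(server, names):
    """Baseline: two queries for each sequence."""
    out = {}
    for name in names:
        start, increment = server.fetch_definition(name)
        last_value, is_called = server.fetch_state(name)
        out[name] = SeqInfo(start, increment, last_value, is_called)
    return out


def dump_with_cache(server, names):
    """Batched: one query fills a cache, then each sequence is a local lookup."""
    cache = server.fetch_all()
    return {name: cache[name] for name in names}
```

With N sequences, the baseline issues 2N queries while the cached variant issues one, and both return the same data. The real cost per query and the actual cache layout are in the attached patches, not in this sketch.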

I'm not sure I have all the details correct in 0003, and we might want to
separate the table into two tables which are only populated when the
relevant section is dumped. Furthermore, the query in 0003 is a bit goofy
because it needs to dance around a bug reported elsewhere [1].

[0] https://postgr.es/m/20240418041712.GA3441570%40nathanxps13
[1] https://postgr.es/m/20240501005730.GA594666%40nathanxps13

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment                                 Content-Type   Size
v1-0001-parse-sequence-information.patch   text/x-diff    4.0 KB
v1-0002-cache-sequence-information.patch   text/x-diff    7.4 KB
v1-0003-cache-more-sequence-data.patch     text/x-diff    4.3 KB
