improve performance of pg_dump with many sequences

From:	Nathan Bossart <nathandbossart(at)gmail(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	improve performance of pg_dump with many sequences
Date:	2024-05-03 02:51:40
Message-ID:	20240503025140.GA1227404@nathanxps13
Lists:	pgsql-hackers

Similar to the change proposed for 'pg_dump --binary-upgrade' in [0], we can
speed up pg_dump on databases with many sequences by gathering the required
information in a single query instead of running two queries per sequence.
The attached patches are works in progress, but here are the results I see
on my machine for 'pg_dump --schema-only --binary-upgrade' with a million
sequences:

HEAD           : 6m22.809s
[0]            : 1m54.701s
[0] + attached : 0m38.233s
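
To illustrate the general idea of replacing two queries per sequence with one up-front query and an in-memory cache, here is a minimal sketch. It is not taken from the attached patches: it uses Python with an in-memory SQLite database as a stand-in for the server, and the table names (seq_meta, seq_state), column names, and function names are all hypothetical.

```python
import sqlite3


def make_catalog(n):
    """Build a toy catalog with n sequences (hypothetical schema, SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE seq_meta (seqid INTEGER PRIMARY KEY, "
        "start_value INTEGER, increment INTEGER)"
    )
    conn.execute(
        "CREATE TABLE seq_state (seqid INTEGER PRIMARY KEY, "
        "last_value INTEGER, is_called INTEGER)"
    )
    conn.executemany(
        "INSERT INTO seq_meta VALUES (?, ?, ?)", [(i, 1, 1) for i in range(n)]
    )
    conn.executemany(
        "INSERT INTO seq_state VALUES (?, ?, ?)", [(i, i * 10, 1) for i in range(n)]
    )
    return conn


def dump_per_sequence(conn, seqids):
    """Baseline: two queries for every sequence, so 2 * N round trips."""
    queries = 0
    result = {}
    for seqid in seqids:
        meta = conn.execute(
            "SELECT start_value, increment FROM seq_meta WHERE seqid = ?", (seqid,)
        ).fetchone()
        state = conn.execute(
            "SELECT last_value, is_called FROM seq_state WHERE seqid = ?", (seqid,)
        ).fetchone()
        queries += 2
        result[seqid] = meta + state
    return result, queries


def dump_batched(conn, seqids):
    """Batched: one query fetches everything; per-sequence lookups hit a dict."""
    rows = conn.execute(
        "SELECT m.seqid, m.start_value, m.increment, s.last_value, s.is_called "
        "FROM seq_meta m JOIN seq_state s ON m.seqid = s.seqid"
    ).fetchall()
    cache = {row[0]: row[1:] for row in rows}
    return {seqid: cache[seqid] for seqid in seqids}, 1
```

Both functions return the same data, but the batched version issues one query regardless of the number of sequences, which is where the savings come from when per-query overhead dominates. For reference, the timings above work out to roughly a 10x reduction relative to HEAD and roughly 3x relative to [0].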

I'm not sure I have all the details correct in 0003, and we might want to
separate the table into two tables which are only populated when the
relevant section is dumped. Furthermore, the query in 0003 is a bit goofy
because it needs to dance around a bug reported elsewhere [1].

[0] https://postgr.es/m/20240418041712.GA3441570%40nathanxps13
[1] https://postgr.es/m/20240501005730.GA594666%40nathanxps13

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment	Content-Type	Size
v1-0001-parse-sequence-information.patch	text/x-diff	4.0 KB
v1-0002-cache-sequence-information.patch	text/x-diff	7.4 KB
v1-0003-cache-more-sequence-data.patch	text/x-diff	4.3 KB

Responses

Re: improve performance of pg_dump with many sequences at 2024-07-09 19:11:51 from Nathan Bossart
