| From: | "Hsu, John" <hsuchen(at)amazon(dot)com> | 
|---|---|
| To: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Improve pg_dump dumping publication tables | 
| Date: | 2020-09-04 18:03:39 | 
| Message-ID: | EA6F1B1A-8081-4200-A83F-FB9DAC6B49D0@amazon.com | 
| Lists: | pgsql-hackers | 
Hi hackers,
I was wondering: is there a good reason, other than memory concerns, why pg_dump's getPublicationTables() 
iterates through all tables one by one, querying each to check whether it has a 
corresponding publication?
Attached is a patch that constructs the publication relations by querying for all of them 
at once and then looking up the corresponding TableInfo in tblinfoindex. 
Users are likely to have far more tables than publications, so 
this should be a lot faster, especially when there are millions of tables.
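To illustrate the idea, here is a simplified sketch of the two query shapes (not the exact queries in the patch; the real column lists and filters differ):

```sql
-- Before: one catalog query per table (N round trips for N tables)
SELECT pr.tableoid, pr.oid, p.pubname
FROM pg_catalog.pg_publication_rel pr
     JOIN pg_catalog.pg_publication p ON p.oid = pr.prpubid
WHERE pr.prrelid = '16384';   -- OID of the table being dumped

-- After: one query for all publication memberships; each prrelid
-- is then resolved to its TableInfo via the tblinfoindex lookup
SELECT pr.tableoid, pr.oid, pr.prpubid, pr.prrelid
FROM pg_catalog.pg_publication_rel pr;
```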
Setup: 
time pg_dump --schema-only -d postgres -f dump.sql -v 2> dump.log
10k tables, each table having its own publication.
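For reference, the first scenario can be reproduced with something like the following (a hypothetical setup script, not part of the patch; the second scenario simply omits the CREATE PUBLICATION):

```sql
-- Hypothetical setup: 10k tables, each with its own publication
DO $$
BEGIN
    FOR i IN 1..10000 LOOP
        EXECUTE format('CREATE TABLE t%s (id int)', i);
        EXECUTE format('CREATE PUBLICATION pub%s FOR TABLE t%s', i, i);
    END LOOP;
END $$;
```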
Before patch:
real	0m6.409s
user	0m0.234s
sys	0m0.584s
With patch:
real	0m5.483s
user	0m0.399s
sys	0m0.443s
10k tables, no publications.
Before patch:
real	0m6.193s
user	0m0.276s
sys	0m0.531s
With patch:
real	0m5.261s
user	0m0.254s
sys	0m0.377s
Thanks,
John H
| Attachment | Content-Type | Size | 
|---|---|---|
| 0001-pg_dump-Iterate-through-all-publication-tables-in-pg.patch | application/octet-stream | 5.8 KB | 