| From: | "Hsu, John" <hsuchen(at)amazon(dot)com> | 
|---|---|
| To: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Improve pg_dump dumping publication tables | 
| Date: | 2020-09-04 18:03:39 | 
| Message-ID: | EA6F1B1A-8081-4200-A83F-FB9DAC6B49D0@amazon.com | 
| Lists: | pgsql-hackers | 
Hi hackers,
I was wondering: is there a good reason, other than memory concerns, why pg_dump's getPublicationTables() 
iterates through all tables one by one, querying each to check whether it has a 
corresponding publication?
Attached is a patch that constructs the publication relations by querying for all of them 
at once and then looking up the corresponding TableInfo in tblinfoindex. 
Users are likely to have far more tables than publications, so 
this should be a lot faster, especially when there are millions of tables.
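To illustrate the idea, here is a simplified sketch of the two query shapes (not the exact queries in the patch; the real column lists and filters differ):

```sql
-- Before: one catalog query per table (N round trips for N tables)
SELECT pr.tableoid, pr.oid, p.pubname
FROM pg_catalog.pg_publication_rel pr
     JOIN pg_catalog.pg_publication p ON p.oid = pr.prpubid
WHERE pr.prrelid = '16384';   -- OID of the table being dumped

-- After: one query for all publication memberships; each prrelid
-- is then resolved to its TableInfo via the tblinfoindex lookup
SELECT pr.tableoid, pr.oid, pr.prpubid, pr.prrelid
FROM pg_catalog.pg_publication_rel pr;
```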
Setup: 
time pg_dump --schema-only -d postgres -f dump.sql -v 2> dump.log
10k tables, each table having its own publication.
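For reference, the first scenario can be reproduced with something like the following (a hypothetical setup script, not part of the patch; the second scenario simply omits the CREATE PUBLICATION):

```sql
-- Hypothetical setup: 10k tables, each with its own publication
DO $$
BEGIN
    FOR i IN 1..10000 LOOP
        EXECUTE format('CREATE TABLE t%s (id int)', i);
        EXECUTE format('CREATE PUBLICATION pub%s FOR TABLE t%s', i, i);
    END LOOP;
END $$;
```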
Before patch:
real	0m6.409s
user	0m0.234s
sys	0m0.584s
With patch:
real	0m5.483s
user	0m0.399s
sys	0m0.443s
10k tables, no publications.
Before patch:
real	0m6.193s
user	0m0.276s
sys	0m0.531s
With patch:
real	0m5.261s
user	0m0.254s
sys	0m0.377s
Thanks,
John H
| Attachment | Content-Type | Size | 
|---|---|---|
| 0001-pg_dump-Iterate-through-all-publication-tables-in-pg.patch | application/octet-stream | 5.8 KB | 