Re: Improve pg_dump dumping publication tables

From: Cary Huang <cary(dot)huang(at)highgo(dot)ca>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: John Hsu <chenhao(dot)john(dot)hsu(at)gmail(dot)com>
Subject: Re: Improve pg_dump dumping publication tables
Date: 2020-10-13 19:23:30
Message-ID: 160261701028.1154.3238486867767993908.pgcf@coridan.postgresql.org
Lists: pgsql-hackers

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested

Hi

I applied the patch to the PG master branch successfully, and it seems to do as described. If there are a lot of tables created in the server and only a few of them have publications, the existing getPublicationTables() function loops through all the tables and checks each one to see whether it is the desired relation type and whether it belongs to one or more publications, so potentially a lot of looping and checking happens. In John's patch, all of this looping and checking is replaced with a single query that returns all the tables that have publications (see the sketch below).

One problem, though: if there are a million tables created in the server, as you say, and all of them have publications, the query can return a million rows, which will require a lot of memory on the client side, and pg_dump may need to handle this extreme case with proper pagination, which adds complexity to the function's logic. The existing logic does not have this problem, because it simply queries one table's publications at a time and does so a million times.
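
For reference, a rough sketch of the two approaches, assuming the queries are built on the pg_catalog.pg_publication_rel catalog; the exact SQL in the patch may differ:

    -- Existing approach: pg_dump issues one query per candidate table,
    -- substituting each relation's OID in turn:
    SELECT tableoid, oid, prpubid
    FROM pg_catalog.pg_publication_rel
    WHERE prrelid = '<table oid>';

    -- Patched approach (as I understand it): a single query fetching every
    -- publication membership at once; matching rows to the already-fetched
    -- table list then happens on the client side:
    SELECT tableoid, oid, prpubid, prrelid
    FROM pg_catalog.pg_publication_rel;

The first form costs one round trip per table but keeps the result sets tiny; the second costs a single round trip but makes the client hold the entire result set in memory, which is where the million-row concern above comes from.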

Thank you
Cary Huang
HighGo Software
