Re: Re: Refresh Publication takes hours and doesn´t finish

From: Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, PegoraroF10 <marcos(at)f10(dot)com(dot)br>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: Re: Refresh Publication takes hours and doesn´t finish
Date: 2019-05-21 19:47:28
Message-ID: CAFcNs+rEhcQQcTHy0Deq84mgfuoiLAW=C0QTntVaPr5pfNCHdQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Tue, May 21, 2019 at 4:42 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> [ redirecting to pgsql-hackers as the more relevant list ]
>
> I wrote:
> > PegoraroF10 <marcos(at)f10(dot)com(dot)br> writes:
> >> I tried sometime ago ... but with no responses, I ask you again.
> >> pg_publication_tables is a view that is used to refresh publication,
but as
> >> we have 15.000 tables, it takes hours and doesn't complete. If I
change that
> >> view I can have an immediate result. The question is: Can I change
that view
> >> ? There is some trouble changing those system views ?
>
> > Hmm ... given that pg_get_publication_tables() shouldn't return any
> > duplicate OIDs, it does seem unnecessarily inefficient to put it in
> > an IN-subselect condition. Peter, is there a reason why this isn't
> > a straight lateral join? I get a much saner-looking plan from
>
> > FROM pg_publication P, pg_class C
> > - JOIN pg_namespace N ON (N.oid = C.relnamespace)
> > - WHERE C.oid IN (SELECT relid FROM
pg_get_publication_tables(P.pubname));
> > + JOIN pg_namespace N ON (N.oid = C.relnamespace),
> > + LATERAL pg_get_publication_tables(P.pubname)
> > + WHERE C.oid = pg_get_publication_tables.relid;
>
> For the record, the attached seems like what to do here. It's easy
> to show that there's a big performance gain even for normal numbers
> of tables, eg if you do
>
> CREATE PUBLICATION mypub FOR ALL TABLES;
> SELECT * FROM pg_publication_tables;
>
> in the regression database, the time for the select drops from ~360ms
> to ~6ms on my machine. The existing view's performance will drop as
> O(N^2) the more publishable tables you have ...
>
> Given that this change impacts the regression test results, project
> rules say that it should come with a catversion bump. Since we are
> certainly going to have a catversion bump before beta2 because of
> the pg_statistic_ext permissions business, that doesn't seem like
> a reason not to push it into v12 --- any objections?
>

I completely agree to push it into v12.

Regards,

--
Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Jeremy Finzel 2019-05-21 21:57:47 Re: Bulk inserts into two (related) tables
Previous Message Tom Lane 2019-05-21 19:42:40 Re: Re: Refresh Publication takes hours and doesn´t finish

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2019-05-21 19:47:34 Re: PG 12 draft release notes
Previous Message Tom Lane 2019-05-21 19:42:40 Re: Re: Refresh Publication takes hours and doesn´t finish