Re: Assorted improvements in pg_dump

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hans Buschmann <buschmann(at)nidsa(dot)net>, pgsql-hackers(at)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Assorted improvements in pg_dump
Date: 2021-10-24 22:03:37
Message-ID: 20211024220337.GN9856@telsasoft.com
Lists: pgsql-hackers

On Sun, Oct 24, 2021 at 05:10:55PM -0400, Tom Lane wrote:
> 0003 is the same except I added a missing free().
>
> 0004 is a new patch based on an idea from Andres Freund [1]:
> in the functions that repetitively issue the same query against
> different tables, issue just one query and use a WHERE clause
> to restrict the output to the tables we care about. I was
> skeptical about this to start with, but it turns out to be
> quite a spectacular win. On my machine, the time to pg_dump
> the regression database (with "-s") drops from 0.91 seconds
> to 0.39 seconds. For a database with 10000 toy tables, the
> time drops from 18.1 seconds to 2.3 seconds.

+        if (tbloids->len > 1)
+            appendPQExpBufferChar(tbloids, ',');
+        appendPQExpBuffer(tbloids, "%u", tbinfo->dobj.catId.oid);

I think this should say

+        if (tbloids->len > 0)

That doesn't matter much since catalogs aren't dumped as such, and we tend to
count in base 10 and not base 10000.
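
To make the difference concrete, here is a minimal standalone sketch. It is not
pg_dump code: OidList, oid_list_init and oid_list_append are hypothetical
stand-ins for PQExpBuffer and its append functions, and the sketch assumes the
buffer starts out empty. The min_len parameter selects which of the two tests
is applied.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for PQExpBuffer: a fixed-size, NUL-terminated
 * string plus its current length.
 */
typedef struct
{
    char        data[256];
    size_t      len;
} OidList;

static void
oid_list_init(OidList *list)
{
    list->data[0] = '\0';
    list->len = 0;
}

/*
 * Append one OID, preceded by a comma whenever the buffer already holds
 * more than min_len bytes.  min_len = 0 corresponds to "len > 0",
 * min_len = 1 to "len > 1".
 */
static void
oid_list_append(OidList *list, unsigned int oid, size_t min_len)
{
    int         n;

    if (list->len > min_len)
    {
        /* leave room for the comma and the terminating NUL */
        assert(list->len + 1 < sizeof(list->data));
        list->data[list->len++] = ',';
        list->data[list->len] = '\0';
    }

    n = snprintf(list->data + list->len,
                 sizeof(list->data) - list->len, "%u", oid);
    /* the sketch simply refuses to overflow its fixed buffer */
    assert(n > 0 && (size_t) n < sizeof(list->data) - list->len);
    list->len += (size_t) n;
}
```

Appending 7 and then 16384 with min_len = 0 gives "7,16384"; with min_len = 1
the separator is skipped and the result is "716384". The two tests only differ
when the first OID is a single character long, so starting with 16384 gives a
correctly separated list either way.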

BTW, the ACL patch makes the overhead about 6x lower (6.9 sec vs 1.2 sec) for
pg_dump -t of a single, small table. Thanks for that.

--
Justin
