Re: pg_dump and thousands of schemas

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Hugo <Nabble>" <hugo(dot)tech(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: pg_dump and thousands of schemas
Date: 2012-05-25 15:53:45
Message-ID: CAMkU=1zedM4VyLVyLuVmoekUnUXkXfnGPer+3bvPm-A_9CNYSA@mail.gmail.com
Lists: pgsql-hackers pgsql-performance

On Fri, May 25, 2012 at 8:18 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> On Fri, May 25, 2012 at 10:41:23AM -0400, Tom Lane wrote:
>> "Hugo <Nabble>" <hugo(dot)tech(at)gmail(dot)com> writes:
>> > If anyone has more suggestions, I would like to hear them. Thank you!
>>
>> Provide a test case?
>>
>> We recently fixed a couple of O(N^2) loops in pg_dump, but those covered
>> extremely specific cases that might or might not have anything to do
>> with what you're seeing.  The complainant was extremely helpful about
>> tracking down the problems:
>> http://archives.postgresql.org/pgsql-general/2012-03/msg00957.php
>> http://archives.postgresql.org/pgsql-committers/2012-03/msg00225.php
>> http://archives.postgresql.org/pgsql-committers/2012-03/msg00230.php
>
> Yes, please help us improve this!  At this point pg_upgrade is limited
> by the time to dump/restore the database schema, but I can't get users
> to give me any way to debug the speed problems.

For the case of dumping one small schema from a large database, watch how
the timings progress as this runs:

dropdb foo; createdb foo;

for f in `seq 0 10000 1000000`; do
  perl -le 'print "create schema foo$_; create table foo$_.foo (k integer, v integer);" foreach $ARGV[0]..$ARGV[0]+9999' $f | psql -d foo > /dev/null ;
  time pg_dump foo -Fc -n foo1 | wc -c;
done >& dump_one_schema_timing
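
One possible way to read the progression out of dump_one_schema_timing is
to keep only the "real" lines. This is a minimal sketch, not part of the
original test: the helper name real_times is made up for illustration, and
it assumes the loop ran under bash with the default TIMEFORMAT, so each
pg_dump run leaves a line of the form "real<TAB>0m1.234s" in the file.

```shell
# Hypothetical helper: print "<iteration> <elapsed>" for each timed run.
# Assumes bash's default `time` output ("real", "user", "sys" lines).
real_times() {
    awk '$1 == "real" { print ++n, $2 }' "$1"
}
```

Running `real_times dump_one_schema_timing` would then give one line per
iteration, i.e. one line per additional 10000 schemas in the database.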

To show the overall dump speed problem, drop the "-n foo1", and change
the step size from 10000/9999 down to 1000/999 (that is, the seq increment
and the upper offset of the perl range).
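
Spelled out, that full-dump variant might look roughly like the sketch
below. The function names gen_schemas and run_full_dump_timing are
hypothetical, and the generator is a plain shell loop swapped in for the
perl one-liner so that the batch size is a parameter; the DDL it prints is
the same. Nothing here touches the database until run_full_dump_timing is
called.

```shell
# Hypothetical helper: print DDL for COUNT schemas, foo<START> onwards.
# A plain shell loop standing in for the perl one-liner above.
gen_schemas() {
    i=$1
    end=$(( $1 + $2 ))
    while [ "$i" -lt "$end" ]; do
        printf 'create schema foo%s; create table foo%s.foo (k integer, v integer);\n' "$i" "$i"
        i=$(( i + 1 ))
    done
}

# Hypothetical driver: add 1000 schemas per step, then time a dump of
# the whole database (no -n, so every schema is dumped).
run_full_dump_timing() {
    for f in $(seq 0 1000 1000000); do
        gen_schemas "$f" 1000 | psql -d foo > /dev/null
        time pg_dump foo -Fc | wc -c
    done
}
```

After a fresh `dropdb foo; createdb foo`, calling
`run_full_dump_timing >& some_timing_file` under bash would collect the
dump sizes and timings in one file, as in the single-schema test
(some_timing_file is a placeholder name).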

Cheers,

Jeff
