Skip site navigation (1) Skip section navigation (2)

Re: pg_dump and thousands of schemas

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: pgsql-performance(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump and thousands of schemas
Date: 2012-05-31 08:45:26
Message-ID: 20120531.174526.2183902405630094684.t-ishii@sraoss.co.jp (view raw or flat)
Thread:
Lists: pgsql-hackerspgsql-performance
>>> We recently fixed a couple of O(N^2) loops in pg_dump, but those covered
>>> extremely specific cases that might or might not have anything to do
>>> with what you're seeing.  The complainant was extremely helpful about
>>> tracking down the problems:
>>> http://archives.postgresql.org/pgsql-general/2012-03/msg00957.php
>>> http://archives.postgresql.org/pgsql-committers/2012-03/msg00225.php
>>> http://archives.postgresql.org/pgsql-committers/2012-03/msg00230.php
>> 
>> I'm wondering if these fixes (or today's commit) include the case for
>> a database has ~100 thounsands of tables, indexes. One of my customers
>> has had troubles with pg_dump for the database, it takes over 10
>> hours.
> 
> So I did qucik test with old PostgreSQL 9.0.2 and current (as of
> commit 2755abf386e6572bad15cb6a032e504ad32308cc). In a fresh initdb-ed
> database I created 100,000 tables, and each has two integer
> attributes, one of them is a primary key. Creating tables were
> resonably fast as expected (18-20 minutes). This created a 1.4GB
> database cluster.
> 
> pg_dump dbname >/dev/null took 188 minutes on 9.0.2, which was pretty
> long time as the customer complained. Now what was current?  Well it
> took 125 minutes. Ps showed that most of time was spent in backend.
> 
> Below is the script to create tables.
> 
> cnt=100000
> while [ $cnt -gt 0 ]
> do
> psql -e -p 5432 -c "create table t$cnt(i int primary key, j int);" test
> cnt=`expr $cnt - 1`
> done
> 
> p.s. You need to increate max_locks_per_transaction before running
> pg_dump (I raised to 640 in my case).

Just for record, I rerun the test again with my single-LOCK patch, and
now total runtime of pg_dump is 113 minutes.
188 minutes(9.0)->125 minutes(git master)->113 minutes(with my patch).

So far, I'm glad to see 40% time savings at this point.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

In response to

Responses

pgsql-performance by date

Next:From: Robert KlemmeDate: 2012-05-31 13:38:35
Subject: Re: pg_dump and thousands of schemas
Previous:From: Tatsuo IshiiDate: 2012-05-31 05:29:01
Subject: Re: pg_dump and thousands of schemas

pgsql-hackers by date

Next:From: Heikki LinnakangasDate: 2012-05-31 09:36:04
Subject: Re: GiST buffering build, bug in levelStep calculation
Previous:From: Kohei KaiGaiDate: 2012-05-31 08:14:57
Subject: Re: [RFC] Interface of Row Level Security

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group