Skip site navigation (1) Skip section navigation (2)

Re: pg_dump and thousands of schemas

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Craig James <cjames(at)emolecules(dot)com>, Hugo <hugo(dot)tech(at)gmail(dot)com>,pgsql-performance(at)postgresql(dot)org
Subject: Re: pg_dump and thousands of schemas
Date: 2012-05-25 03:54:55
Message-ID: 20120525035455.GC25444@momjian.us (view raw or flat)
Thread:
Lists: pgsql-hackerspgsql-performance
On Thu, May 24, 2012 at 08:20:34PM -0700, Jeff Janes wrote:
> On Thu, May 24, 2012 at 8:21 AM, Craig James <cjames(at)emolecules(dot)com> wrote:
> >
> >
> > On Thu, May 24, 2012 at 12:06 AM, Hugo <Nabble> <hugo(dot)tech(at)gmail(dot)com> wrote:
> >>
> >> Hi everyone,
> >>
> >> We have a production database (postgresql 9.0) with more than 20,000
> >> schemas
> >> and 40Gb size. In the past we had all that information in just one schema
> >> and pg_dump used to work just fine (2-3 hours to dump everything). Then we
> >> decided to split the database into schemas, which makes a lot of sense for
> >> the kind of information we store and the plans we have for the future. The
> >> problem now is that pg_dump takes forever to finish (more than 24 hours)
> >> and
> >> we just can't have consistent daily backups like we had in the past. When
> >> I
> >> try to dump just one schema with almost nothing in it, it takes 12
> >> minutes.
> 
> Sorry, your original did not show up here, so I'm piggy-backing on
> Craig's reply.
> 
> Is dumping just one schema out of thousands an actual use case, or is
> it just an attempt to find a faster way to dump all the schemata
> through a back door?
> 
> pg_dump itself seems to have a lot of quadratic portions (plus another
> one on the server which it hits pretty heavily), and it hard to know
> where to start addressing them.  It seems like addressing the overall
> quadratic nature might be a globally better option, but addressing
> just the problem with dumping one schema might be easier to kluge
> together.

Postgres 9.2 will have some speedups for pg_dump scanning large
databases --- that might help.

-- 
  Bruce Momjian  <bruce(at)momjian(dot)us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

In response to

Responses

pgsql-performance by date

Next:From: Hugo <Nabble>Date: 2012-05-25 04:54:05
Subject: Re: pg_dump and thousands of schemas
Previous:From: Jeff JanesDate: 2012-05-25 03:20:34
Subject: Re: pg_dump and thousands of schemas

pgsql-hackers by date

Next:From: Alex HunsakerDate: 2012-05-25 04:43:18
Subject: Re: plperl_helpers.h fix for clang
Previous:From: Jeff JanesDate: 2012-05-25 03:20:34
Subject: Re: pg_dump and thousands of schemas

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group