Re: pg_dump and thousands of schemas

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Craig James <cjames(at)emolecules(dot)com>, Hugo <hugo(dot)tech(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: pg_dump and thousands of schemas
Date: 2012-05-25 03:54:55
Message-ID: 20120525035455.GC25444@momjian.us
Lists: pgsql-hackers pgsql-performance

On Thu, May 24, 2012 at 08:20:34PM -0700, Jeff Janes wrote:
> On Thu, May 24, 2012 at 8:21 AM, Craig James <cjames(at)emolecules(dot)com> wrote:
> >
> >
> > On Thu, May 24, 2012 at 12:06 AM, Hugo <Nabble> <hugo(dot)tech(at)gmail(dot)com> wrote:
> >>
> >> Hi everyone,
> >>
> >> We have a production database (PostgreSQL 9.0) with more than 20,000
> >> schemas and 40 GB in size. In the past we had all that information in
> >> just one schema and pg_dump used to work just fine (2-3 hours to dump
> >> everything). Then we decided to split the database into schemas, which
> >> makes a lot of sense for the kind of information we store and the plans
> >> we have for the future. The problem now is that pg_dump takes forever
> >> to finish (more than 24 hours) and we just can't have consistent daily
> >> backups like we had in the past. When I try to dump just one schema
> >> with almost nothing in it, it takes 12 minutes.
>
> Sorry, your original did not show up here, so I'm piggy-backing on
> Craig's reply.
>
> Is dumping just one schema out of thousands an actual use case, or is
> it just an attempt to find a faster way to dump all the schemata
> through a back door?
>
> pg_dump itself seems to have a lot of quadratic portions (plus another
> one on the server which it hits pretty heavily), and it is hard to know
> where to start addressing them. It seems like addressing the overall
> quadratic nature might be a globally better option, but addressing
> just the problem with dumping one schema might be easier to kluge
> together.

Postgres 9.2 will have some speedups for pg_dump scanning large
databases --- that might help.
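
To make the "quadratic portions" point above concrete, here is a minimal
Python sketch. It is not pg_dump's actual code: the function names and the
(id, name) tuples are hypothetical, chosen only to show why resolving every
object by scanning a flat list costs on the order of n^2 comparisons, while
building a lookup table once keeps the cost near n.

```python
def resolve_with_linear_scan(objects, wanted_ids):
    """Resolve each id by scanning the list from the start.

    Returns (names, comparisons). With n objects and n ids the number of
    comparisons grows roughly as n^2 / 2.
    """
    comparisons = 0
    names = []
    for wanted in wanted_ids:
        for obj_id, name in objects:
            comparisons += 1
            if obj_id == wanted:
                names.append(name)
                break
    return names, comparisons


def resolve_with_index(objects, wanted_ids):
    """Resolve each id through a dict built once up front.

    Returns (names, lookups). Building the dict is a single pass over the
    objects, and each id then costs one hash lookup.
    """
    index = dict(objects)
    names = [index[wanted] for wanted in wanted_ids]
    return names, len(wanted_ids)
```

Both functions return the same names; only the amount of work differs. For
1,000 objects the scan version performs 500,500 comparisons against 1,000
lookups for the indexed one, and the gap widens quickly at 20,000 or more.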

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
