Re: pg_dump incredibly slow dumping a single schema from a large db

From: Mike Roest <mike(dot)roest(at)replicon(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg_dump incredibly slow dumping a single schema from a large db
Date: 2012-03-30 17:17:06
Message-ID: CAE7Byhja_94DJdsngQd=E6nGEG82Q2O1cvYL_q3+id3ss31a2g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Ok I just realized that's probably not going to be much help :)

0.00 0.00 0.00 5 0.00 0.00 canonicalize_path
0.00 0.00 0.00 5 0.00 0.00
trim_trailing_separator
0.00 0.00 0.00 3 0.00 0.00 strlcpy
0.00 0.00 0.00 2 0.00 0.00 join_path_components
0.00 0.00 0.00 2 0.00 0.00 last_dir_separator
0.00 0.00 0.00 1 0.00 0.00 find_my_exec
0.00 0.00 0.00 1 0.00 0.00 first_dir_separator
0.00 0.00 0.00 1 0.00 0.00 get_etc_path
0.00 0.00 0.00 1 0.00 0.00 get_progname
0.00 0.00 0.00 1 0.00 0.00 help
0.00 0.00 0.00 1 0.00 0.00 make_relative_path
0.00 0.00 0.00 1 0.00 0.00 resolve_symlinks
0.00 0.00 0.00 1 0.00 0.00 set_pglocale_pgservice
0.00 0.00 0.00 1 0.00 0.00 trim_directory
0.00 0.00 0.00 1 0.00 0.00 validate_exec

That's the output of gprof pg_dump gmon.out (I built the -pg build on my
dev box then ran it on the server. I'm just running the actual dump on my
dev box against the server instead to see if I get something more useful
since that doesn't really seem to have much data in it)

On Fri, Mar 30, 2012 at 11:09 AM, Mike Roest <mike(dot)roest(at)replicon(dot)com>wrote:

> Here's the gmon.out from a -pg compiled 9.1.1 pg_dump.
>
> --Mike
>
>
> On Fri, Mar 30, 2012 at 10:40 AM, Mike Roest <mike(dot)roest(at)replicon(dot)com>wrote:
>
>> For sure I'll work on that now. One thing I noticed looking through the
>> pg_dump code based on the messages and the code one thing I noticed it
>> seems to be grabbing the full dependency graph for the whole db rather then
>> limiting it by the schema (not sure if limiting this would be possible)
>>
>> This query returns 9843923 rows from the DB. So processing this seems
>> like it'll take quite a while.
>>
>> I'll get a -pg build of pg_dump going here on a dev box so I can get you
>> a profile.
>>
>>
>> On Fri, Mar 30, 2012 at 10:18 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>>> Mike Roest <mike(dot)roest(at)replicon(dot)com> writes:
>>> > This dump is currently taking around 8 minutes. While dumping the
>>> pg_dump
>>> > process is using 100% of one core in the server (24 core machine).
>>> Doing a
>>> > -v pg_dump I found that the following stages are taking the majority
>>> of the
>>> > time
>>>
>>> > reading user_defined tables (2 minutes and 20 seconds)
>>> > reading dependency data (5 minutes and 30 seconds)
>>>
>>> Can you get an execution profile with oprofile or gprof or similar tool?
>>> It doesn't surprise me a lot that pg_dump might have some issues with
>>> large numbers of objects, but guessing which inefficiencies are hurting
>>> you is difficult without more info.
>>>
>>> regards, tom lane
>>>
>>
>>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2012-03-30 17:30:26 Re: pg_dump incredibly slow dumping a single schema from a large db
Previous Message Prashant Bharucha 2012-03-30 17:00:31 ERROR: invalid byte sequence for encoding "UTF8": 0xc325

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Farina 2012-03-30 17:21:11 Re: HTTP Frontend? (and a brief thought on materialized views)
Previous Message Peter Eisentraut 2012-03-30 17:16:59 Re: Uppercase tab completion keywords in psql?