Re: Should pg_dump dump larger tables first?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Should pg_dump dump larger tables first?
Date: 2013-01-31 16:32:48
Message-ID: CAMkU=1xEMEeSqvLGUMcBjSX5Ag5KNqi0OBoC37+rf6+v33UuJA@mail.gmail.com
Lists: pgsql-hackers
On Tue, Jan 29, 2013 at 3:34 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "David Rowley" <dgrowleyml(at)gmail(dot)com> writes:
>> If pg_dump was to still follow the dependencies of objects, would there be
>> any reason why it shouldn't backup larger tables first?
>
> Pretty much every single discussion/complaint about pg_dump's ordering
> choices has been about making its behavior more deterministic not less
> so.  So I can't imagine such a change would go over well with most folks.
>
> Also, it's far from obvious to me that "largest first" is the best rule
> anyhow; it's likely to be more complicated than that.

From my experience in the non-database world of processing many files
of greatly different sizes in parallel, sorting them so that the largest
are scheduled first and the smaller ones get packed around them is very
successful and very easy.
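
As a rough illustration of that largest-first heuristic (a sketch under
stated assumptions, not anything pg_dump or pg_restore actually does):
each job is handed, in list order, to whichever worker becomes free
first, and the total elapsed time is compared for two orderings. The
function name `simulate`, the job sizes, and the worker count are all
made up for the example, and job cost is assumed to be proportional to
size.

```python
import heapq

def simulate(sizes, workers):
    """Hand each job, in the given order, to the worker that frees up
    first. Returns the time at which the last worker finishes."""
    # Min-heap of the times at which each worker next becomes free.
    free_at = [0] * workers
    heapq.heapify(free_at)
    for size in sizes:
        start = heapq.heappop(free_at)          # earliest-available worker
        heapq.heappush(free_at, start + size)   # busy until start + size
    return max(free_at)

# Hypothetical job sizes (think table or file sizes in arbitrary units).
table_sizes = [2, 3, 8, 2, 3, 2]

largest_first = simulate(sorted(table_sizes, reverse=True), workers=2)
smallest_first = simulate(sorted(table_sizes), workers=2)
print("largest first: ", largest_first)
print("smallest first:", smallest_first)
```

With these made-up sizes, the largest-first order finishes sooner because
the one big job starts immediately and the small ones fill in around it,
whereas the smallest-first order leaves the big job to run alone at the
end.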

I agree that the best rule is surely more complicated, but probably so
much so that it will never get implemented.

>
> But anyway, the right place to add this sort of consideration is in
> pg_restore --parallel, not pg_dump.

Yeah.

Cheers,

Jeff

