Re: pg_dump additional options for performance

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 12:05:40
Message-ID: 1204027540.4252.251.camel@ebony.site
Lists: pgsql-hackers

On Tue, 2008-02-26 at 12:46 +0100, Dimitri Fontaine wrote:
> On Tuesday 26 February 2008, Simon Riggs wrote:
> > So that would mean we would run an unload like this
> >
> > pg_dump --pre-schema-file=f1 --save-snapshot --snapshot-id=X
> > pg_dump -t bigtable --data-file=f2.1 --snapshot-id=X
> > pg_dump -t bigtable2 --data-file=f2.2 --snapshot-id=X
> > pg_dump -T bigtable -T bigtable2 --data-file=f2.3 --snapshot-id=X
>
> As a user I'd really prefer all of this to be much more transparent, and could
> well imagine the -Fc format to be some kind of TOC + zip of table data + post
> load instructions (organized per table), or something like this.
> In fact just what you described, all embedded in a single file.

If it's in a single file then it won't perform as well as if it's in
separate files. We can put separate files on separate drives. We can begin
reloading one table while another is still unloading. The OS will
perform readahead for us on separate files, whereas on one file it will
look like random I/O, etc.

I'm not proposing we change things to use separate files in all cases.
Just when you want to use separate files, you can.
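
As a rough illustration of the separate-files unload quoted above, the sketch below only builds the command lines and spreads the data files across drives in round-robin order. The --pre-schema-file, --save-snapshot, --snapshot-id and --data-file flags are the ones proposed in this thread and should be read as assumptions, not as options of a released pg_dump. The function name plan_unload and the drive paths are hypothetical.

```python
from itertools import cycle


def plan_unload(big_tables, drives, snapshot_id="X"):
    """Build the per-table pg_dump command lines sketched in this thread.

    All long flags used here are proposals from the discussion,
    not existing pg_dump options.
    """
    targets = cycle(drives)  # round-robin over the available drives
    snap = f"--snapshot-id={snapshot_id}"

    # Step 1: schema-only part, which also saves the shared snapshot.
    cmds = [["pg_dump", "--pre-schema-file=f1", "--save-snapshot", snap]]

    # Step 2: one data file per big table, each on the next drive.
    for n, table in enumerate(big_tables, start=1):
        data_file = f"--data-file={next(targets)}/f2.{n}"
        cmds.append(["pg_dump", "-t", table, data_file, snap])

    # Step 3: everything else, excluding the big tables already dumped.
    rest = ["pg_dump"]
    for table in big_tables:
        rest += ["-T", table]
    last = len(big_tables) + 1
    rest += [f"--data-file={next(targets)}/f2.{last}", snap]
    cmds.append(rest)
    return cmds


if __name__ == "__main__":
    for cmd in plan_unload(["bigtable", "bigtable2"], ["/mount/sda", "/mount/sdb"]):
        print(" ".join(cmd))
```

Because every command carries the same snapshot id, the per-table dumps could in principle run at the same time and still see a consistent view of the data, which is the point of the proposal.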

> And I'd much prefer it if this (new?) format was trustworthy enough to be the
> new default format of -Fc dumps. Then we could add some *simple* command line
> parameter to control the threading behavior of the dump and reload process,
> ala make -j. We could even support some option for the user to tell us which
> disk arrays to use for parallel dumping.
>
> pg_dump -j2 --dumpto=/mount/sda:/mount/sdb ... > mydb.dump
> pg_restore -j4 ... mydb.dump

I like the -j syntax.
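
For readers unfamiliar with make -j, the idea is a fixed cap on how many jobs run at once. A minimal sketch of that scheduling, assuming each dump or restore step is an independent command line; run_jobs and its runner parameter are hypothetical names, and nothing here reflects how pg_dump or pg_restore would implement it.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_jobs(commands, jobs, runner=None):
    """Run commands with at most `jobs` in flight, in the spirit of make -j.

    `runner` is called once per command; by default it executes the
    command and returns its exit status. Results come back in the same
    order as `commands`.
    """
    if runner is None:
        def runner(cmd):
            return subprocess.run(cmd, check=True).returncode

    # The pool size is the -j value: no more than `jobs` commands run at once.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(runner, commands))
```

Passing a stand-in runner makes it possible to try the scheduling without a database, for example run_jobs(cmds, 2, runner=print) to see which commands would be issued.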

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com