| From: | Mathieu Fenniak <mathieu(dot)fenniak(at)replicon(dot)com> |
|---|---|
| To: | pgsql-general(at)postgresql(dot)org |
| Subject: | fast-archiver tool, useful for pgsql DB backups |
| Date: | 2012-08-24 21:48:02 |
| Message-ID: | CAHoiPjw9GDk3ab==zPScZnSyWfvZWRQZab4-kaF8tsLW3WvDyg@mail.gmail.com |
| Lists: | pgsql-general |
Hi pgsql-general,
Has anyone else ever noticed how slow it can be to rsync or tar a pgdata
directory with hundreds of thousands or millions of files? I thought this
could be done faster with a bit of concurrency, so I wrote a little tool
called fast-archiver to do so. My employer (Replicon) has allowed me to
release this tool under an open source license, so I wanted to share it
with everyone.
fast-archiver is written in Go, and makes use of Go's awesome concurrency
capabilities to read and write files in parallel. When you've got lots of
small files, this gives a big throughput improvement.
For a 90GB PostgreSQL database with over 2,000,000 data files,
fast-archiver can create an archive in 27 minutes, compared with 1hr 23min
for tar.
Piped over an ssh connection, fast-archiver can transfer and write the same
dataset on a gigabit network in 1hr 20min, compared with 3hrs for rsync
doing the same transfer.
fast-archiver is available at GitHub:
https://github.com/replicon/fast-archiver
I hope this is useful to others. :-)
Mathieu
$ time fast-archiver -c -o /dev/null /db/data
skipping symbolic link /db/data/pg_xlog
1008.92user 663.00system 27:38.27elapsed 100%CPU (0avgtext+0avgdata 24352maxresident)k
0inputs+0outputs (0major+1732minor)pagefaults 0swaps
$ time tar -cf - /db/data | cat > /dev/null
tar: Removing leading `/' from member names
tar: /db/data/base/16408/12445.2: file changed as we read it
tar: /db/data/base/16408/12464: file changed as we read it
32.68user 375.19system 1:23:23elapsed 8%CPU (0avgtext+0avgdata 81744maxresident)k
0inputs+0outputs (0major+5163minor)pagefaults 0swaps