Re: dump of 700 GB database

From: "karsten vennemann" <karsten(at)terragis(dot)net>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: dump of 700 GB database
Date: 2010-02-17 22:44:32
Message-ID: FD8D6786625F4631886FD43780ABA435@snuggie
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

>>> vacuum should clean out the dead tuples, then cluster on any large tables that are bloated will sort them out without needing too much temporary space.

Yes ok am running a vacuum full on a large table (150GB) and will cluster the spatial data by zip code then. Understand that should get rid of any dead records and reclaim hard disk space then. The system I'm running it on is a 1.7 GB RAM Ubuntu jaunty machine, PostgreSQL 8.3.8.

I was hesitant to do any of this (vacuum, cluster, or dump and restore) because it might run days or weeks (hopefully not). Here are some of my PostgreSQL.conf settings in case this is not optimal and someone has a hint...
shared_buffers=160MB, effective_cache_size=1GB, maintenance_work_mem=500MB, wal_buffers=16MB, checkpoint_segments=100

Also I just set-up a new server (mirror of the other one I need to clean out) specifically for the purpose of running a database dump with enough storage space 2TB...So that is no issue right now
I really need to find out what is wrong with my procedure dumping the whole database as I never succeed yet to dump and restore such a bid db...
That will not be the least time I will have to do something similar.
Here is what I tried ("test" database is 350GB in size)

1. pg_dump -U postgres -Fc test > /ebsclean/testdb.dump
This gave me a dump of about 4GB in size (too smal in size even if its compressed ?) after running 5 hours (not bad I thought). But when I tried to restore it using pg_retore to another database (in a different table space)I got an error like "not an valid archive file" or something like that
So I was wondering if 4GB is a problem in Ubuntu OS ?

Thus I tried to split it during the dump operation
2. pg_dump -U postgres -Fc test | split -b 1000m - /ebsclean/testdb.dump
This gave me 5 files with a total combined size of about 4GB . But when I tried to restore it got the same error as above...
This dump and restore procedure should be the fastest (in respect to vacuum and/or cluster) from what I collected in an IRC session with some gurus some weeks ago.
Main question now is why is my dump /restore not working what am I doing wrong ?

Thanks
Karsten

_____

From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Pavel Stehule
Sent: Tuesday, February 09, 2010 23:30
To: karsten vennemann
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] dump of 700 GB database

Hello

2010/2/10 karsten vennemann <karsten(at)terragis(dot)net>

I have to write a 700 GB large database to a dump to clean out a lot of dead records on an Ubuntu server with postgres 8.3.8. What is the proper procedure to succeed with this - last time the dump stopped at 3.8 GB size I guess. Should I combine the -Fc option of pg_dump and and the split command ?

I thought something like
"pg_dump -Fc test | split -b 1000m - testdb.dump"
might work ?
Karsten

vacuum full doesn't work?

Regards
Pavel Stehule

Terra GIS LTD
Seattle, WA, USA

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Scott Marlowe 2010-02-17 22:59:31 Re: dump of 700 GB database
Previous Message David Kerr 2010-02-17 22:16:28 Re: ERROR: permission denied to finish prepared transaction