Re: Updating large postgresql database with blobs

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: CAJ CAJ <pguser(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Updating large postgresql database with blobs
Date: 2007-03-12 15:22:24
Message-ID: 45F57030.1010704@dunslane.net
Lists: pgsql-hackers

CAJ CAJ wrote:
> Hello,
>
> I didn't get any response on the GENERAL list so I'm escalating this ....
>
> We have several independent database servers with ~50GB+ databases
> running postgres 8.0.x. We are planning to upgrade these databases to
> postgres 8.2.x over the weekend
>
> We plan to use the following steps to upgrade each server:
>
> 1. Dump the 8.0.x database cluster using 8.2.x pg_dumpall
> % ./pg_dumpall > pgdumpall_backup.sql
>
> 2. Dump the 8.0.x database, including large objects, in compressed
> custom format using 8.2.x pg_dump
> % ./pg_dump -Fc -b -Z9 dbname > pgdump_lobs_backup
>
>
> Restoring database
> 1. Initialize the 8.2.x database
> % initdb -D /data/pgdata
>
> 2. Restore template1 database from cluster dump
> % ./psql -d template1 < pgdumpall_backup.sql
>
> 3. Delete database dbname, otherwise restoring will give an error about
> the existing dbname
> % dropdb dbname
>
> 4. Create fresh dbname
> % createdb -O dbowner dbname
>
> 5. Restore database with lobs
> % ./pg_restore -v -Fc -d dbname -e -U dbowner < pgdump_lobs_backup
>
> Some of the problems we have are,
> 1. We are not sure whether all of the data will be preserved after a
> dump/restore with the above process
> 2. The dump and restore process is far too slow to complete over the
> weekend (it takes approx 1GB/hr to dump on a dual G5 PPC 2GHz with
> 1GB RAM and RAID 1 disks)
>
> What is the fastest way to upgrade postgres for large databases that
> have binary objects?
>
> Thanks for all your help.

Your procedure dumps and restores the databases twice. This seems less
than sound. My prediction is that you could get a 50% speed improvement
by fixing that ...

The only thing you really need pg_dumpall for is the global tables. I
would just use pg_dumpall -g to get those, and then use pg_dump -F c +
pg_restore for each actual database.
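
Something like this, run once per server (host names and file names below
are just placeholders, not taken from your setup):

# globals only (roles etc.), using the 8.2.x pg_dumpall
% pg_dumpall -g -h old-server > globals.sql
# one compressed custom-format dump per database, large objects included
% pg_dump -Fc -b -Z9 -h old-server dbname > dbname.dump
# restore into the freshly initdb'd 8.2.x cluster
% psql -d template1 -h new-server -f globals.sql
% createdb -O dbowner -h new-server dbname
% pg_restore -Fc -d dbname -h new-server -U dbowner dbname.dump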

Another thing is to make sure that pg_dump/pg_restore are not competing
with postgres for access to the same disk(s). One way to do that is to
run them from a different machine - they don't have to be run on the
server machine - of course then the network can become a bottleneck, so
YMMV.
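
For example (again, host names are placeholders), both ends can be driven
from a third box so the intermediate dump file sits on that box's disks:

% pg_dump -Fc -b -Z9 -h old-server -U dbowner dbname > /staging/dbname.dump
% pg_restore -Fc -d dbname -h new-server -U dbowner /staging/dbname.dump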

cheers

andrew
