Re: doubt with pg_dump and high concurrent used databases

From: "Peter Childs" <peterachilds(at)gmail(dot)com>
To:
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: doubt with pg_dump and high concurrent used databases
Date: 2007-11-25 19:15:17
Message-ID: a2de01dd0711251115l935fd31n8e7d1526fdb08097@mail.gmail.com
Lists: pgsql-performance

On 25/11/2007, Erik Jones <erik(at)myemma(dot)com> wrote:
>
> On Nov 25, 2007, at 10:46 AM, Pablo Alcaraz wrote:
>
> > Hi all,
> >
> > I read that pg_dump can run while the database is being used and makes
> > "consistent backups".
> >
> > I have a huge database that is *heavily* selected from, inserted into
> > and updated. Currently I have a cron task that disconnects the database
> > users, makes a backup using pg_dump and puts the database online again.
> > The problem is that there is now too much information, and every day the
> > database stores more and more data, so the backup process needs more and
> > more time to run. I am thinking about doing the backup with a process
> > that lets me do it with minimal interruption for the users.
> >
> > I do not need an up-to-the-last-second backup. I could take a backup
> > with "almost all" the data, but I need the information in it to be
> > coherent. For example, if the backup stores information about an
> > invoice, it *must* store both the invoice header and the invoice items.
> > I could live with the backup missing some invoices when it is run,
> > because they'll be backed up the next time the backup process runs. But
> > I cannot store only part of an invoice. That is what I call a coherent
> > backup.
> >
> > The best for me would be for the cron job to do a concurrent backup with
> > all the information up to the time it starts to run, while the clients
> > are using the database. Example: if cron launches the backup process at
> > 12:30 AM, the backup must be built with all the information *until*
> > 12:30 AM. So if I need to restore it, I get a coherent database with the
> > same information it had at 12:30 AM. It does not matter if the process
> > needs 4 hours to run.
> >
> > Does pg_dump create this kind of "consistent backup"? Or do I need to
> > do the backups using another program?
>
> Yes, that is exactly what pg_dump does.
>
>
Yes, so long as you are using transactions correctly, i.e. doing a BEGIN
before each invoice and a COMMIT afterwards. If you're not bothering and are
using autocommit, you *may* have problems. pg_dump will show a consistent
state as of the time when the backup was started. If your database was not
"consistent" at that time you may have issues, but it will be consistent from
the database's point of view, i.e. foreign keys, primary keys, check
constraints, triggers etc.
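
To make the BEGIN/COMMIT point concrete, here is a rough sketch of writing an
invoice header and its items as one transaction. It is only an illustration:
it uses Python's built-in sqlite3 module as a stand-in so that it runs without
a server, and the table names, the save_invoice helper and the sample data are
all made up for the example. Against PostgreSQL the same pattern would apply
through whatever driver you use.

```python
import sqlite3


def make_db():
    # Hypothetical schema, for illustration only.
    # isolation_level=None puts the sqlite3 module in autocommit mode, so a
    # transaction exists only where we issue BEGIN/COMMIT ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE invoice_header ("
        " id INTEGER PRIMARY KEY,"
        " customer TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE invoice_item ("
        " invoice_id INTEGER NOT NULL REFERENCES invoice_header(id),"
        " description TEXT NOT NULL,"
        " amount REAL NOT NULL)"
    )
    return conn


def save_invoice(conn, invoice_id, customer, items):
    """Write header and items as one unit: either all rows or none."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "INSERT INTO invoice_header (id, customer) VALUES (?, ?)",
            (invoice_id, customer),
        )
        for description, amount in items:
            conn.execute(
                "INSERT INTO invoice_item (invoice_id, description, amount)"
                " VALUES (?, ?, ?)",
                (invoice_id, description, amount),
            )
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        # Any failure undoes the header as well as the items written so far.
        conn.execute("ROLLBACK")
        return False


def count_rows(conn, table):
    # table is only ever one of the two fixed table names above.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


if __name__ == "__main__":
    conn = make_db()
    save_invoice(conn, 1, "acme", [("widget", 10.0), ("gadget", 2.5)])
    # The second item violates NOT NULL, so the whole invoice is rolled back.
    save_invoice(conn, 2, "broken", [("ok", 1.0), ("bad", None)])
    print(count_rows(conn, "invoice_header"), count_rows(conn, "invoice_item"))
```

The second call leaves no trace of invoice 2 in either table. Had each INSERT
been committed on its own (autocommit), the header and the first item would
have stayed behind, and a dump started at that moment could contain exactly
the kind of partial invoice the original question is worried about.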

It all depends what you mean by consistent.

Peter.
