Re: Backup/Restore of single table in multi TB database

From: "John Smith" <sodgodofall(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Backup/Restore of single table in multi TB database
Date: 2008-05-07 22:24:22
Message-ID: b88f0d670805071524g1e4965cbif9b48a822dba961f@mail.gmail.com
Lists: pgsql-general pgsql-performance

Hi Tom,

Actually, I forgot to mention one more detail in my original post.
For the table that we're looking to back up, we also want to be able to
do incremental backups. pg_dump dumps the entire table every time it is
invoked.

With the pg_{start,stop}_backup approach, incremental backups could be
implemented by, for example, rsync'ing the data files and applying
the incremental WALs. So if table foo didn't change very much since
the first backup, we would only need to rsync a small amount of data
plus the WALs to get an incremental backup for table foo.
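
To make the rsync idea concrete, here is a minimal sketch in Python of
the copy step I have in mind. It only builds the command line; the
function name, the file paths and the destination are made up for
illustration, and it says nothing about whether the result can be
restored.

```python
def rsync_command(data_files, dest):
    """Build an rsync command line that copies the given data files to dest.

    data_files and dest are hypothetical paths supplied by the caller.
    """
    # -a (archive) preserves permissions and modification times. By default
    # rsync skips any file whose size and modification time already match
    # the copy at the destination, so untouched files are not transferred.
    return ["rsync", "-a", *data_files, dest]


# Hypothetical data files for table foo; the numbers are placeholders.
cmd = rsync_command(
    ["/pgdata/base/16384/16385", "/pgdata/base/16384/16385.1"],
    "backuphost:/backups/foo/",
)
print(" ".join(cmd))
```

The list can be passed to subprocess.run() as is. The WAL files would
have to be copied separately.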

Besides picking up data on unwanted tables from the WAL (e.g., bar
would appear in our recovered database even though we only wanted
foo), do you see any other problems with this pg_{start,stop}_backup
approach? Admittedly, it does seem a bit hacky.

Thanks,
- John

On Wed, May 7, 2008 at 2:41 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "John Smith" <sodgodofall(at)gmail(dot)com> writes:
> > After reading the documentation, it seems like the following might
> > work. Suppose the database has two tables foo and bar, and we're only
> > interested in backing up table foo:
>
> > 1. Call pg_start_backup()
>
> > 2. Use the pg_class table in the catalog to get the data file names
> > for tables foo and bar.
>
> > 3. Copy the system files and the data file for foo. Skip the data file for bar.
>
> > 4. Call pg_stop_backup()
>
> > 5. Copy WAL files generated between 1. and 4. to another location.
>
> > Later, if we want to restore the database somewhere with just table
> > foo, we just use postgres's normal recovery mechanism and point it at
> > the files we backed up in 3. and the WAL files from 5.
>
> > Does anyone see a problem with this approach?
>
> Yes: it will not work, not even a little bit, because the WAL files will
> contain updates for all the tables. You can't just not have the tables
> there during restore.
>
> Why are you not using pg_dump?
>
> regards, tom lane
>
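
For reference, the file lookup in step 2 of the quoted procedure could
be sketched roughly as below. This is an illustration only: the helper
name, the data directory and the OIDs are made up. It assumes the table
lives in the default tablespace and that the relation name is unique
across schemas. Indexes and TOAST tables have their own relfilenodes
and are not covered.

```python
import posixpath

# Catalog queries that would supply the two numbers used below.
# %s is a placeholder for whatever parameter style the client uses.
RELFILENODE_SQL = "SELECT relfilenode FROM pg_class WHERE relname = %s"
DATABASE_OID_SQL = "SELECT oid FROM pg_database WHERE datname = current_database()"


def table_file_path(data_dir, db_oid, relfilenode, segment=0):
    """Return the path of one segment file of a table.

    In the default tablespace a table is stored under
    <data_dir>/base/<database oid>/<relfilenode>; additional segments of a
    large table carry a numeric suffix (.1, .2, ...).
    """
    name = str(relfilenode) if segment == 0 else f"{relfilenode}.{segment}"
    return posixpath.join(data_dir, "base", str(db_oid), name)


# Hypothetical values standing in for the results of the queries above.
print(table_file_path("/pgdata", 16384, 16385))
print(table_file_path("/pgdata", 16384, 16385, segment=2))
```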
