Re: Duplicating data folder without tablespace, for read access

From: Jack Cushman <jcushman(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Duplicating data folder without tablespace, for read access
Date: 2018-08-14 19:52:37
Message-ID: CAEv_OHWL+s=nyZ7K7jovw24-yCPo4XCFHrCEVoGvi4Q11BWDFQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thanks for such quick and helpful answers! My plan sounds probably better
to avoid, but if it turns out to be necessary, you all gave me some helpful
avenues and things to look out for.

Best,
Jack

On Tue, Aug 14, 2018 at 1:06 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> Greetings,
>
> * Jack Cushman (jcushman(at)gmail(dot)com) wrote:
> > I have a large database of text, with a 600GB table and a 100GB table
> > connected by a join table. They both see occasional updates throughout
> the
> > week. Once a week I want to "cut a release," meaning I will clone just
> the
> > 100GB table and copy it to a "release" server for read-only web access.
>
> My general recommendation to people who are thinking about something
> like this is to use their restore-tests as a way to stage things (you
> are testing your backups by doing a restore, right?) and then copy over
> the results.
>
> > My procedure would be:
> >
> > - keep the 600GB table on a separate tablespace
> > - cleanly stop postgres on both servers
> > - copy the data folder to the release server
> > - delete pg_tblspc/* on the release server
> > - start postgres on both servers
>
> So, instead of that procedure, it'd be:
>
> - Back up the database as per usual
> - Restore the database somewhere
> - Run some sanity checks on the restored database
> - go in and drop the table and sanitize anything else necessary
> - Shut down the database and copy it into place
> - OR take a new backup of the sanitized database and then restore it
> into place
>
> Much cleaner, and tests your backup/restore process.
>
> Alternatively, you could just track changes to the "main" database using
> triggers into an audit log and then replay the changes made to the 100GB
> table into the other database.
>
> > In local testing this seems to work -- the release server works fine,
> and I
> > only get an error message if I try to access the missing tables, which is
> > expected. But are there reasons this is going to bite me if I try it in
> > production? I'm hoping it helps that (a) I'm only doing read access, (b)
> I
> > can cleanly stop both servers before cutting a release, and (c) I'm not
> > worried about losing data, since it's just an access copy.
>
> Still, it's a hacked up and not entirely proper PG database which will
> likely lead to confusion- maybe you won't be confused, but I strongly
> suspect others looking at it will be, and you might run into other
> issues too along the lines of what Tom mentioned (background jobs
> failing and such).
>
> Thanks!
>
> Stephen
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Bruce Momjian 2018-08-14 19:59:19 Re: How to revoke privileged from PostgreSQL's superuser
Previous Message Bruce Momjian 2018-08-14 19:51:09 Re: How to revoke privileged from PostgreSQL's superuser