Re: CLONE DATABASE (with copy on write?)

From: "Clark C(dot) Evans" <cce(at)clarkevans(dot)com>
To: "Simon Riggs" <simon(at)2ndQuadrant(dot)com>
Cc: "pgsql-general" <pgsql-general(at)postgresql(dot)org>
Subject: Re: CLONE DATABASE (with copy on write?)
Date: 2011-11-13 15:07:40
Message-ID: 1321196860.2745.140660998306237@webmail.messagingengine.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sunday, November 13, 2011 7:33 AM, "Simon Riggs"
<simon(at)2ndQuadrant(dot)com> wrote:
> On Sat, Nov 12, 2011 at 9:40 PM, Clark C. Evans <cce(at)clarkevans(dot)com>
> > [We] should be using "CREATE DATABASE ... WITH TEMPLATE".
> > However, this has two big disadvantages.  First, it only works
> > if you can kick the users off the clone.  Secondly, it still
> > takes time, uses disk space, etc.  We have some big databases.
>
> An interesting proposal. Thanks for taking the time to raise this.

Thank you for responding Simon.

> The existing situation is that you need to either:
> 1) quiesce the database so it can be copied locally
> 2) take a hot backup to create a clone on another server
>
> (1) currently involves disconnection. Would a command to quiesce
> sessions without disconnection be useful? We could get sessions to
> sleep until woken after the copy. With large databases we would still
> need to copy while sessions sleep to ensure a consistent database
> after the copy.

Could their be a way to put the database in "read only" mode,
where it rejects all attempts to change database state with an
appropriate application level error message? We could then
update our application to behave appropriately while the copy
is being performed. Something like this could be broadly
useful in other contexts as well, for example, having a replica
that you brought up for reporting purposes.

Even so, the CREATE DATABASE... WITH TEMPLATE still has a set of
additional issues with it. It ties up the hard drive with activity
and then extra space while it duplicates data. Further, it causes
the shared memory cache to be split between the original and the
replica, this causes both databases to be much slower. Finally,
it creates a ton of WAL traffic (perhaps we could suspend this?)

> Is (2) a problem for you? In what way?

Due to our configuration, yes. Being able to CLONE the
database in the same cluster is much preferred. Our user
configuration, deliberately, does not involve hot backups.

Hot backups to another server won't work for us since our
servers are encrypted and isolated behind client firewalls.
Data that leaves the box has to be encrypted where the
decrypt key is only available upon hardware failure, etc.
Our upstream pipe isn't huge... which is why the WAL traffic
for backups is also problematic.

Perhaps we could create two PostgreSQL clusters on each server.
One of them would be production, the other would be for staging.
This involves some logistics... the advantage of this approach
is that we could limit resource usage on the slave and turn off
backups on it, reducing our disk usage and WAL traffic. We'd
keep shared memory on the slave to a minimum. This solution
still chews up 2x disk space and doubles the disk activity.

Could "WITH TEMPLATE" reach into another cluster's storage?

Best,

Clark

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Phoenix Kiula 2011-11-13 15:10:16 Re: Incremental backup with RSYNC or something?
Previous Message Andy Colson 2011-11-13 14:45:51 Re: Incremental backup with RSYNC or something?