Re: Saving snapshots for later use

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Nikolas Everett <nik9000(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Saving snapshots for later use
Date: 2012-06-15 04:26:47
Message-ID: 4FDAB987.4000509@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 15.06.2012 06:19, Nikolas Everett wrote:
> I've been a PostgreSQL user for years and have a feature that I'd like to
> see implemented. I've started playing with the source and have a vague
> implementation plan for it, but before I go blundering around I'd like to
> run the idea by this list. So here it is:
>
> I'd like to be able to save the current snapshot and then at a later date
> roll the entire database back to that snapshot, essentially erasing
> everything that happened since the snapshot. Well, more like making all
> that stuff invisible. This would be really useful when testing
> applications who's databases take a while to setup and who's tests change
> the database in some way. It could also provide a way to recover from some
> horrible user error. Frankly, I'm more interested in this for testing, but
> having the ability to recover from errors would be nice.

Have you considered using filesystem snapshots?

> So I'm wondering a few things:
> 1. I _think_ all I have to do to store one of these snapshots is to store
> the current xid and a list of all in progress transactions. I believe I
> should be able to reconstitute this with the commit log to determine
> visibility at that snapshot.
> 2. I _think_ I can save that information in a system table. It feels
> weird to use a table to store visibility information but part of me thinks
> that is ok. I can't put my finger on why I think that is ok though.
> 3. I'm not really sure at all what to do with other connections that are
> doing things during a snapshot restore. They might just have to get cut.
> For my purposes this would be ok but I can immagine that would be a
> problem.
> 4. I think not allowing the user to save a snapshot during a transaction
> would simplify the book keeping a ton. I know what I typed in (1) wouldn't
> cover snapshots inside transactions. I just don't think that is worth the
> effort.
> 5. On a more personal note I have no idea if I am capable of implementing
> this. I'm really not very good with C but I haven't had too much trouble
> figuring out how to add new parser rules or map them into the tcop layer.
> I figured out how to create a new system table without too much trouble
> but haven't a clue how upgrades work for the new table. I'm not sure how
> to read from the system table but I see there is a faq entry for that.
> 6. I think it'd be ok if restoring an older snapshot removes a newer
> snapshot. Using the snapshot mechanism to jump forward in time just sounds
> hard to keep strait.

To revert the database to the earlier state, you'll also need to somehow
roll back all the already-committed transactions. At first sight, that
seems easy - just modify clog to mark them as aborted. However, it's not
that easy, because you'd also need to somehow clear hint bits that claim
those transactions to be committed. Or prevent those hint bits from
being set in the first place, but that affects performance, so I don't
think that would be very useful for testing. All in all, I think a
filesystem snapshot is simpler. Or even just use a copy of the database,
and use rsync to revert it back to the original state. You will have to
restart postgres, though.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2012-06-15 05:50:35 Re: libpq compression
Previous Message Nikolas Everett 2012-06-15 03:19:05 Saving snapshots for later use