Re: Checkpointer split has broken things dramatically (was Re: DELETE vs TRUNCATE explanation)

From: Greg Stark <stark(at)mit(dot)edu>
To: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Daniel Farina <daniel(at)heroku(dot)com>, "Harold A(dot) Giménez" <harold(dot)gimenez(at)gmail(dot)com>
Subject: Re: Checkpointer split has broken things dramatically (was Re: DELETE vs TRUNCATE explanation)
Date: 2012-07-24 13:33:26
Message-ID: CAM-w4HMeTjbVKooUq_sEqjnEp+7fSZQj823XaE5geQ7-in4pdA@mail.gmail.com
Lists: pgsql-hackers pgsql-performance

On Wed, Jul 18, 2012 at 1:13 AM, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au> wrote:

> That makes me wonder if on top of the buildfarm, extending some buildfarm
> machines into a "crashfarm" is needed:
>
> - Keep kvm instances with copy-on-write snapshot disks and the build env
> on them
> - Fire up the VM, do a build, and start the server
> - From outside the vm have the test controller connect to the server and
> start a test run
> - Hard-kill the OS instance at a random point in time.
>

For what it's worth, you don't need to hard-kill the VM and start over
repeatedly to get crashes at different times. You could take a snapshot of
the disk storage and keep running, and you could take many snapshots from a
single run. Each snapshot would represent the storage that would exist if
the machine had crashed at the point in time that the snapshot was taken.

You do want the snapshots to be taken by something outside the virtual
machine: either the kvm storage layer or lvm on the host, but not lvm
inside the guest virtual machine.
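
As a rough illustration of the host-side approach, here is a minimal Python
sketch that takes several lvm snapshots at random points during a single
test run. It assumes the guest's disk is a logical volume on the host; the
volume group and volume names, the "crash-NNN" naming scheme, and the helper
functions are all hypothetical and not something from this thread. The
command runner and sleep function are parameters so the scheduling can be
tried without touching real storage.

```python
import random
import subprocess
import time


def snapshot_command(volume_group, logical_volume, snapshot_name, size="1G"):
    """Build an lvcreate command that snapshots a host-side logical volume."""
    return [
        "lvcreate", "--snapshot",
        "--size", size,            # space reserved for copy-on-write changes
        "--name", snapshot_name,
        f"{volume_group}/{logical_volume}",
    ]


def random_offsets(count, duration_s, rng):
    """Pick `count` random offsets (in seconds) within the run, in order."""
    return sorted(rng.uniform(0, duration_s) for _ in range(count))


def take_snapshots(volume_group, logical_volume, count, duration_s,
                   run=subprocess.run, sleep=time.sleep, rng=None):
    """Snapshot the volume at random times while the guest keeps running.

    Each snapshot stands in for the disk state of a crash at that moment.
    Returns the snapshot names in the order they were taken.
    """
    rng = rng or random.Random()
    names = []
    elapsed = 0.0
    for i, offset in enumerate(random_offsets(count, duration_s, rng)):
        sleep(offset - elapsed)    # wait until the next chosen moment
        elapsed = offset
        name = f"crash-{i:03d}"    # hypothetical naming scheme
        run(snapshot_command(volume_group, logical_volume, name), check=True)
        names.append(name)
    return names
```

On a real host this would be run as root alongside the test controller, for
example take_snapshots("vg0", "pgdata", 20, 600.0). Each resulting snapshot
could then be attached to a fresh guest to see whether the server recovers
from it.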

And yes, the hard part that always stopped me from looking at this was
having any way to test the correctness of the data.

--
greg
