Re: PITR potentially broken in 9.2

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: PITR potentially broken in 9.2
Date: 2012-11-28 20:31:19
Message-ID: CAMkU=1wy4=t8_mt5k0c=SQY8kJOMcjVw-yN+EHSvFpakwWJj6A@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Wed, Nov 28, 2012 at 5:37 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> On 28.11.2012 15:26, Andres Freund wrote:
>> Can you reproduce the issue? If so, can you give an exact guide? If not,
>> do you still have the datadir et al. from above?

Yes, it is reliable enough to be used for "git bisect":

rm /tmp/archivedir/0000000*
initdb
## edit postgresql.conf to set up archiving etc. and set checkpoint_segments to 60
pg_ctl -D /tmp/data -l /tmp/data/logfile_master start -w
createdb
pgbench -i -s 10
pgbench -T 36000000 &
sleep 120
psql -c "SELECT pg_start_backup('label');"
cp -rp /tmp/data/ /tmp/data_slave
sleep 120
psql -c "SELECT pg_stop_backup();"
rm /tmp/data_slave/pg_xlog/0*
rm /tmp/data_slave/postmaster.*
rm /tmp/data_slave/logfile_master
cp src/backend/access/transam/recovery.conf.sample \
   /tmp/data_slave/recovery.conf
## edit /tmp/data_slave/recovery.conf to set up restore command and stop point.
cp -rpi /tmp/data_slave /tmp/data_slave2
pg_ctl -D /tmp/data_slave2/ start -o "--port=9876"

At some point, kill the pgbench by stopping the master:
pg_ctl -D /tmp/data stop -m fast
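
For the "edit postgresql.conf" step, the settings might look roughly like the sketch below. This is one plausible setup, not necessarily the exact configuration used here: the function name write_master_conf, the cp-based archive_command and the wal_level value are assumptions, while /tmp/archivedir, checkpoint_segments = 60 and fsync = off come from this message.

```shell
# Sketch only: append test-cluster settings to postgresql.conf after initdb.
# write_master_conf is a hypothetical helper name; the archive_command
# is an assumed cp-based command, not taken from the original message.
write_master_conf() {
    datadir=$1
    archivedir=$2
    mkdir -p "$archivedir"
    cat >> "$datadir/postgresql.conf" <<EOF
wal_level = hot_standby        # 'archive' suffices unless hot_standby=on is tested
archive_mode = on
archive_command = 'test ! -f $archivedir/%f && cp %p $archivedir/%f'
checkpoint_segments = 60
fsync = off                    # test setup only; unsafe for real data
EOF
}
```

It would be called once after initdb and before starting the master, for example as: write_master_conf /tmp/data /tmp/archivedir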

I run the master with fsync off, otherwise it takes too long to
accumulate archived log files.
The checkpoint associated with pg_start_backup takes ~2.5 minutes, so
for the PITR end time, pick a time that is 1.25 minutes before the
time reported in the backup history or backup_label file.
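
Picking the stop point can be scripted along the lines below. This is a rough sketch under several assumptions: it reads the START TIME line of backup_label (the message does not say which timestamp was used), it needs GNU date, and the helper names and the cp-based restore_command are hypothetical.

```shell
# Sketch only: compute a recovery target 75 s (1.25 min) before the
# START TIME recorded in a backup_label file. Requires GNU date.
# Choosing START TIME is an assumption; helper names are hypothetical.
target_before_backup_start() {
    label_file=$1
    start_time=$(sed -n 's/^START TIME: //p' "$label_file")
    start_epoch=$(date -d "$start_time" +%s) || return 1
    date -u -d "@$((start_epoch - 75))" '+%Y-%m-%d %H:%M:%S UTC'
}

# Append the restore command and stop point to recovery.conf.
# The cp-based restore_command is an assumption.
append_recovery_conf() {
    datadir=$1
    archivedir=$2
    target=$3
    cat >> "$datadir/recovery.conf" <<EOF
restore_command = 'cp $archivedir/%f %p'
recovery_target_time = '$target'
EOF
}
```

For the layout above this would be used as: append_recovery_conf /tmp/data_slave /tmp/archivedir "$(target_before_backup_start /tmp/data_slave/backup_label)"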

I copy data_slave to data_slave2 so that I can try different things
without having to restart the whole process from the beginning.

> I just committed a fix for this, but if you can, it would still be nice if
> you could double-check that it now really works.

Thanks. In REL9_2_STABLE, it now correctly gives the "requested
recovery stop point is before consistent recovery point" error.

Also if the recovery is started with hot_standby=on and with no
recovery_target_time, in patched REL9_2_STABLE the database becomes
"ready to accept read only connections" at the appropriate time, once
the end-of-backup WAL has been replayed. In 9.2.0 and 9.2.1, it
instead opened for read-only connections as soon as the
end-of-checkpoint record (for the checkpoint associated with
pg_start_backup) had been replayed, which I think is too early.

Cheers,

Jeff
