Re: PITR potentially broken in 9.2

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: PITR potentially broken in 9.2
Date: 2012-12-05 02:05:15
Message-ID: CAMkU=1zo2j7bVfPDSgaGRffdkoZW2hNCRztDd2hLQSy+eKcpVg@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Tue, Dec 4, 2012 at 4:20 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
>> I've reproduced it again using the just-tagged 9.2.2, and uploaded a
>> 135MB tarball of the /tmp/data_slave2 and /tmp/archivedir to google
>> drive. The data directory contains the recovery.conf which is set to
>> end recovery between the two critical time points.
>
> Hmmm ... I can reproduce this with current 9.2 branch tip. However,
> more or less by accident I first tried it with a 9.2-branch postmaster
> from a couple weeks ago, and it works as expected with that: the log
> output looks like
>
> LOG: restored log file "00000001000000000000001B" from archive
> LOG: restored log file "00000001000000000000001C" from archive
> LOG: restored log file "00000001000000000000001D" from archive
> LOG: database system is ready to accept read only connections
> LOG: recovery stopping before commit of transaction 305610, time 2012-12-02 15:08:54.000131-08
> LOG: recovery has paused
> HINT: Execute pg_xlog_replay_resume() to continue.
>
> and I can connect and do the pg_xlog_replay_resume() thing.

But the key is, the database was not actually consistent at that
point, and so opening hot standby was a dangerous thing to do.

The bug that allowed the database to open early (the original topic of
this email chain) was masking this secondary issue.

> So apparently this is something we broke since Nov 18. Don't know what
> yet --- any thoughts? Also, I am still not seeing what the connection
> is to the original report against 9.1.6.

The behavior that we both see in 9.2.2, where it waits for a
pg_xlog_replay_resume() that cannot be delivered because the database
is not yet open, is the same thing I'm seeing in 9.1.6. I'll see if I
can repeat it in 9.1.7 and post the tarball of the data directory.

Cheers,

Jeff
